Closed SchahinRohani closed 2 years ago
I'm not sure what you mean by "custom Huggingface Model". Would need to expand on that.
If you just want to use that model, you can run hgf"bert-base-german-cased"
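For context, a minimal sketch of loading that model with the `hgf"..."` string macro (the exact return value has varied across Transformers.jl versions, so this assumes a version where it returns a tokenizer/model pair):

```julia
using Transformers
using Transformers.HuggingFace

# hgf"name" expands to load_hgf_pretrained("name"), which downloads the
# config, tokenizer files, and weights from the HuggingFace hub and
# returns the loaded tokenizer and model.
tkr, model = hgf"bert-base-german-cased"
```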
The "custom" was confusing, so I changed it. We were able to run the German BERT models with hgf"some-german-bert-model". Now we are trying to run dbmdz/german-gpt2 via hgf"dbmdz/german-gpt2", but are getting:
```
ERROR: KeyError: key :vocab not found
Stacktrace:
 [1] getindex(h::Dict{String, Any}, key::Symbol)
   @ Base ./dict.jl:481
 [2] load_tokenizer(::Val{:gpt2}, model_name::String; force_fast_tkr::Bool, possible_files::Vector{String}, config::Transformers.HuggingFace.HGFGPT2Config, tkr_cfg::Nothing, kw::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ Transformers.HuggingFace ~/.julia/packages/Transformers/xjEIh/src/huggingface/implementation/gpt2/tokenizer.jl:50
 [3] #load_tokenizer#103
   @ ~/.julia/packages/Transformers/xjEIh/src/huggingface/tokenizer/tokenizer.jl:34 [inlined]
 [4] load_tokenizer(model_name::String; possible_files::Nothing, config::Transformers.HuggingFace.HGFGPT2Config, kw::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ Transformers.HuggingFace ~/.julia/packages/Transformers/xjEIh/src/huggingface/tokenizer/tokenizer.jl:31
 [5] load_hgf_pretrained(name::String)
   @ Transformers.HuggingFace ~/.julia/packages/Transformers/xjEIh/src/huggingface/HuggingFace.jl:56
 [6] top-level scope
   @ REPL[6]:1
 [7] top-level scope
   @ ~/.julia/packages/CUDA/DfvRa/src/initialization.jl:52
```
Shouldn't the GPT2 Models be interchangeable in the GPT2 Text Generation Example?
It should be. That's ~probably~ a bug. I'll fix it this weekend.
Is this the only question of this issue?
I found out that the main branch of dbmdz/german-gpt2 doesn't have a vocab.json. An older variant, dbmdz/german-gpt2-faust, does have the vocab.json file and works, so it is not a problem with this lib.
Thank you for the fast response!
It took longer than I thought, but with Transformers v0.1.22 it should be able to load the tokenizer from dbmdz/german-gpt2.
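A sketch of picking up the fixed release and retrying (assuming the project's environment allows pinning this version, and that `hgf"..."` returns a tokenizer/model pair on this version):

```julia
using Pkg

# Upgrade to the release that can load dbmdz/german-gpt2's tokenizer.
Pkg.add(name = "Transformers", version = "0.1.22")

using Transformers
using Transformers.HuggingFace

# This previously failed with `KeyError: key :vocab not found`.
tkr, model = hgf"dbmdz/german-gpt2"
```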
Hello, we are new to the Transformers lib and want to get familiar with it. We would like a bit of guidance on how to run a Huggingface model. We are especially interested in integrating German language models, e.g. bert-base-german-cased or deepset/gbert-large.
Any tips or input on this would be highly appreciated. Thank you