Closed ahmad-alismail closed 1 year ago
Hi @ahmad-alismail , thanks for reporting this.
~If you don't mind I'll transfer this issue to the `transformers` repo and rename it. A breaking change was introduced in `huggingface_hub==0.12.0`. Since then, `Repository` does not handle the repo creation if the repo does not exist on the Hub.~
~It seems that the `Trainer` `push_to_hub` method does not handle the repo creation before calling `Repository`, which now fails. This has to be fixed here. In the meantime, you need to manually create the repo before using `Trainer.push_to_hub` or downgrade to `huggingface_hub==0.11.1`.~
~@sgugger @ydshieh I'll open a PR today to fix this.~
EDIT: I cannot transfer the issue to `transformers` (most likely because I'm not a maintainer there) so if someone can do it :pray:
EDIT 2: it seems that the repo creation is already handled in the `Trainer` class. @sgugger @ydshieh any idea why `create_repo` was not called?
@ahmad-alismail which version of `transformers` do you have?
Yeah, it looks like the line number of the error in the PR description differs by more than 1000 lines. Better to know which `transformers` version is used here.
Hi @Wauplin @ydshieh, thanks for your reply!
The version of `transformers` is 4.11.3.
@ahmad-alismail Could you try to update the `transformers` package to the latest release (4.26.1) and re-run your script?
Version 4.11.3 was released in September 2021 and is therefore outdated.
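A quick way to confirm which version is installed before upgrading is sketched below. It uses only the standard library; the helper names (`version_tuple`, `is_older`, `installed_version`) are made up for illustration, and the 4.26.1 threshold is simply the release mentioned above. The comparison is deliberately naive (it ignores suffixes such as `.dev0`), so for strict comparisons `packaging.version` is likely the more robust choice.

```python
import re
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '4.11.3' into (4, 11, 3); stop at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # e.g. 'dev0' in '4.27.0.dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def is_older(installed: str, minimum: str) -> bool:
    """Return True if `installed` is strictly older than `minimum`."""
    return version_tuple(installed) < version_tuple(minimum)


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: 4.26.1 is used as the threshold only because it is the
    # release suggested in this thread.
    current = installed_version("transformers")
    if current is None:
        print("transformers is not installed")
    elif is_older(current, "4.26.1"):
        print(f"transformers {current} is older than 4.26.1; consider upgrading")
    else:
        print(f"transformers {current} is recent enough")
```

If the check reports an old version, `pip install --upgrade transformers` is the usual way to update.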
@Wauplin It's working perfectly! I truly appreciate your help – thank you so much!
Describe the bug
I'm trying to fine-tune the XLM-RoBERTa model on a German corpus for an NER task. To handle the training loop I'm using the 🤗 Transformers `Trainer`, so first I need to define the training attributes using the `TrainingArguments` class:

Write role and define the `Trainer` as follows:

```python
trainer = Trainer(
    model_init=model_init,  # A function that instantiates the model to be used
    args=training_args,  # Arguments to tweak for training
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    train_dataset=panx_de_encoded["train"],
    eval_dataset=panx_de_encoded["validation"],
    tokenizer=xlmr_tokenizer,
)
```
It appears that the model repository with the name `xlm-roberta-base-finetuned-panx-de` does not currently exist. However, as described in the Hugging Face course, the `push_to_hub()` function (which should be used later in the notebook) handles both the creation of the repository and the push of the model and tokenizer files to that repository.

Is there anything else that I might be missing?
System info