nlp-with-transformers / notebooks

Jupyter notebooks for the Natural Language Processing with Transformers book
https://transformersbook.com/
Apache License 2.0

fatal: repository 'https://huggingface.co/fcastell/distilbert-base-uncased-finetuned-emotion/' not found #109

Open fabianocastello opened 1 year ago

fabianocastello commented 1 year ago

Information

The question or comment is about chapter:

Question or comment

I am trying to run the notebook in Colab. In cell:

trainer = Trainer(model=model, args=training_args,
                  compute_metrics=compute_metrics,
                  train_dataset=emotions_encoded["train"],
                  eval_dataset=emotions_encoded["validation"],
                  tokenizer=tokenizer)

I got an error stating that 'https://huggingface.co/fcastell/distilbert-base-uncased-finetuned-emotion/' was not found. It could be something related to git lfs, but I tried to "git clone" manually and the root cause seems to be that the repository URL is not valid.

What am I missing? Thanks for the help.

This is the entire error log:

Cloning https://huggingface.co/fcastell/distilbert-base-uncased-finetuned-emotion into local empty directory.
WARNING:huggingface_hub.repository:Cloning https://huggingface.co/fcastell/distilbert-base-uncased-finetuned-emotion into local empty directory.
---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/huggingface_hub/repository.py](https://localhost:8080/#) in clone_from(self, repo_url, token)
    668                         env.update({"GIT_LFS_SKIP_SMUDGE": "1"})
--> 669 
    670                     run_subprocess(

8 frames
[/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_subprocess.py](https://localhost:8080/#) in run_subprocess(command, folder, check, **kwargs)
     82 
---> 83     return subprocess.run(
     84         command,

[/usr/lib/python3.10/subprocess.py](https://localhost:8080/#) in run(input, capture_output, timeout, check, *popenargs, **kwargs)
    525         if check and retcode:
--> 526             raise CalledProcessError(retcode, process.args,
    527                                      output=stdout, stderr=stderr)

CalledProcessError: Command '['git', 'lfs', 'clone', 'https://user:<redacted>@huggingface.co/fcastell/distilbert-base-uncased-finetuned-emotion', '.']' returned non-zero exit status 2.

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
[<ipython-input-69-3c0a11c6afca>](https://localhost:8080/#) in <cell line: 3>()
      1 from transformers import Trainer
      2 
----> 3 trainer = Trainer(model=model, args=training_args,
      4                   compute_metrics=compute_metrics,
      5                   train_dataset=emotions_encoded["train"],

[/usr/local/lib/python3.10/dist-packages/transformers/trainer.py](https://localhost:8080/#) in __init__(self, model, args, data_collator, train_dataset, eval_dataset, tokenizer, model_init, compute_metrics, callbacks, optimizers)
    404         # Create clone of distant repo and output directory if needed
    405         if self.args.push_to_hub:
--> 406             self.init_git_repo()
    407             # In case of pull, we need to make sure every process has the latest.
    408             if is_torch_tpu_available():

[/usr/local/lib/python3.10/dist-packages/transformers/trainer.py](https://localhost:8080/#) in init_git_repo(self)
   2649 
   2650         try:
-> 2651             self.repo = Repository(
   2652                 self.args.output_dir,
   2653                 clone_from=repo_name,

[/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py](https://localhost:8080/#) in _inner_fn(*args, **kwargs)
    116             kwargs = smoothly_deprecate_use_auth_token(fn_name=fn.__name__, has_token=has_token, kwargs=kwargs)
    117 
--> 118         return fn(*args, **kwargs)
    119 
    120     return _inner_fn  # type: ignore

[/usr/local/lib/python3.10/dist-packages/huggingface_hub/repository.py](https://localhost:8080/#) in __init__(self, local_dir, clone_from, repo_type, token, git_user, git_email, revision, skip_lfs_files, client)
    514 
    515         if clone_from is not None:
--> 516             self.clone_from(repo_url=clone_from)
    517         else:
    518             if is_git_repo(self.local_dir):

[/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py](https://localhost:8080/#) in _inner_fn(*args, **kwargs)
    116             kwargs = smoothly_deprecate_use_auth_token(fn_name=fn.__name__, has_token=has_token, kwargs=kwargs)
    117 
--> 118         return fn(*args, **kwargs)
    119 
    120     return _inner_fn  # type: ignore

[/usr/local/lib/python3.10/dist-packages/huggingface_hub/repository.py](https://localhost:8080/#) in clone_from(self, repo_url, token)
    707                     raise EnvironmentError(error_msg)
    708 
--> 709         except subprocess.CalledProcessError as exc:
    710             raise EnvironmentError(exc.stderr)
    711 

OSError: WARNING: 'git lfs clone' is deprecated and will not be updated
          with new flags from 'git clone'

'git clone' has been updated in upstream Git to have comparable
speeds to 'git lfs clone'.
Cloning into '.'...
remote: Repository not found
fatal: repository 'https://huggingface.co/fcastell/distilbert-base-uncased-finetuned-emotion/' not found
Error(s) during clone:
git clone failed: exit status 128

kevinidea commented 11 months ago

I am getting the exact same error.

What I tried, without success:
- Created a write access token on the HuggingFace website
- Logged in successfully both in the Jupyter notebook and in the terminal with the huggingface-cli login command

Has anyone resolved this issue?

kevinidea commented 11 months ago

I finally found a fix: set push_to_hub=False instead of True. With that change the model seems to train and save locally without the git error.

training_args = TrainingArguments(output_dir=model_name,
                                  num_train_epochs=2,
                                  learning_rate=2e-5,
                                  per_device_train_batch_size=batch_size,
                                  per_device_eval_batch_size=batch_size,
                                  weight_decay=0.01,
                                  evaluation_strategy="epoch",
                                  disable_tqdm=False,
                                  logging_steps=logging_steps,
                                  push_to_hub=False,
                                  log_level="error")

kevinidea commented 11 months ago

Another, better fix. This one still lets you upload the model to the Hugging Face Hub:

- Make sure the parent directory is NOT a Git repository. If you cloned the book repo, it is already an initialized Git repo, so you have to delete the .git folder in the transformers book repo manually.
- Ensure that the directory passed as output_dir to TrainingArguments is empty.
- Install a recent version of transformers: pip install transformers==4.31.0
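The first two checks above can be run before constructing the Trainer. The following is a minimal sketch using only the standard library; the helper names `find_enclosing_git_repo` and `is_missing_or_empty` are hypothetical and not part of transformers or huggingface_hub.

```python
from pathlib import Path


def find_enclosing_git_repo(path):
    """Return the nearest directory at or above `path` that contains a
    .git entry, or None if there is none. (Hypothetical helper.)"""
    current = Path(path).resolve()
    for candidate in [current, *current.parents]:
        if (candidate / ".git").exists():
            return candidate
    return None


def is_missing_or_empty(path):
    """True if `path` does not exist yet or is a directory with no entries."""
    p = Path(path)
    return not p.exists() or (p.is_dir() and not any(p.iterdir()))


if __name__ == "__main__":
    # Assumption: output_dir matches the model_name used in the notebook.
    output_dir = "distilbert-base-uncased-finetuned-emotion"
    repo_root = find_enclosing_git_repo(output_dir)
    if repo_root is not None:
        print(f"Warning: {output_dir} sits inside a Git repository at {repo_root}")
    if not is_missing_or_empty(output_dir):
        print(f"Warning: {output_dir} already exists and is not empty")
```

If either warning is printed, moving output_dir outside the cloned book repo, or pointing it at a fresh directory, may be a less destructive alternative to deleting the .git folder.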

mannjaro commented 10 months ago

This error simply means that the repository that should be cloned from your account was not found. To resolve this, log in to HuggingFace and create a new model repository by clicking on "+ New Model". The issue should then be resolved.
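The same repository can probably be created from code instead of the web UI. The sketch below assumes you are already logged in (for example via huggingface-cli login) and uses `create_repo` from huggingface_hub. The helper names `expected_repo_id` and `ensure_model_repo` are hypothetical. The repo id is assumed to follow the `<username>/<output_dir name>` pattern seen in the failing URL above.

```python
from pathlib import Path


def expected_repo_id(username, output_dir):
    """Build the repo id the Trainer is assumed to clone: <username>/<basename of output_dir>."""
    return f"{username}/{Path(output_dir).name}"


def ensure_model_repo(username, output_dir):
    """Create the model repo on the Hub if it does not exist yet.

    Requires a stored login token with write access.
    """
    # Imported lazily so expected_repo_id works without huggingface_hub installed.
    from huggingface_hub import create_repo

    repo_id = expected_repo_id(username, output_dir)
    # exist_ok=True makes the call a no-op when the repo is already there.
    create_repo(repo_id, repo_type="model", exist_ok=True)
    return repo_id
```

Calling `ensure_model_repo("fcastell", "distilbert-base-uncased-finetuned-emotion")` before building the Trainer with push_to_hub=True should leave the clone step with an existing repository to fetch, under the assumptions stated above.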