huggingface / autotrain-advanced

🤗 AutoTrain Advanced
https://huggingface.co/autotrain
Apache License 2.0

RuntimeError: operator torchvision::nms does not exist #692

Closed fre2mansur closed 2 months ago

fre2mansur commented 4 months ago

Prerequisites

Backend

Colab

Interface Used

UI

CLI Command

No response

UI Screenshots & Parameters

No response

Error Logs

INFO | 2024-06-28 13:00:09 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: conf.yaml
INFO | 2024-06-28 13:00:09 | autotrain.parser:__post_init__:133 - Running task: lm_training
INFO | 2024-06-28 13:00:09 | autotrain.parser:__post_init__:134 - Using backend: local
INFO | 2024-06-28 13:00:09 | autotrain.parser:run:194 - {'model': 'meta-llama/llama-2-7b-chat-hf', 'project_name': 'devmansur', 'data_path': 'data/', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 1024, 'model_max_length': 2048, 'padding': 'right', 'trainer': 'default', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'fp16', 'lr': 0.0002, 'epochs': 1, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 4, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.01, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': None, 'quantization': 'none', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 8, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'text', 'rejected_text_column': None, 'push_to_hub': False, 'username': 'abc', 'token': '', 'unsloth': False}
Saving the dataset (1/1 shards): 100% 4/4 [00:00<00:00, 422.24 examples/s]
Saving the dataset (1/1 shards): 100% 4/4 [00:00<00:00, 1860.83 examples/s]
INFO | 2024-06-28 13:00:09 | autotrain.backends.local:create:8 - Starting local training...
INFO | 2024-06-28 13:00:09 | autotrain.commands:launch_command:400 - ['accelerate', 'launch', '--num_machines', '1', '--num_processes', '1', '--mixed_precision', 'fp16', '-m', 'autotrain.trainers.clm', '--training_config', 'devmansur/training_params.json']
INFO | 2024-06-28 13:00:09 | autotrain.commands:launch_command:401 - {'model': 'meta-llama/llama-2-7b-chat-hf', 'project_name': 'devmansur', 'data_path': 'devmansur/autotrain-data', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 1024, 'model_max_length': 2048, 'padding': 'right', 'trainer': 'default', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'fp16', 'lr': 0.0002, 'epochs': 1, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 4, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.01, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': None, 'quantization': 'none', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 8, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': 'autotrain_prompt', 'text_column': 'autotrain_text', 'rejected_text_column': 'autotrain_rejected_text', 'push_to_hub': False, 'username': 'abc', 'token': '', 'unsloth': False}
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 5, in <module>
    from accelerate.commands.accelerate_cli import main
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 19, in <module>
    from accelerate.commands.estimate import estimate_command_parser
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/estimate.py", line 34, in <module>
    import timm
  File "/usr/local/lib/python3.10/dist-packages/timm/__init__.py", line 2, in <module>
    from .layers import is_scriptable, is_exportable, set_scriptable, set_exportable
  File "/usr/local/lib/python3.10/dist-packages/timm/layers/__init__.py", line 8, in <module>
    from .classifier import ClassifierHead, create_classifier, NormMlpClassifierHead
  File "/usr/local/lib/python3.10/dist-packages/timm/layers/classifier.py", line 15, in <module>
    from .create_norm import get_norm_layer
  File "/usr/local/lib/python3.10/dist-packages/timm/layers/create_norm.py", line 14, in <module>
    from torchvision.ops.misc import FrozenBatchNorm2d
  File "/usr/local/lib/python3.10/dist-packages/torchvision/__init__.py", line 6, in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils
  File "/usr/local/lib/python3.10/dist-packages/torchvision/_meta_registrations.py", line 164, in <module>
    def meta_nms(dets, scores, iou_threshold):
  File "/usr/local/lib/python3.10/dist-packages/torch/library.py", line 440, in inner
    handle = entry.abstract_impl.register(func_to_register, source)
  File "/usr/local/lib/python3.10/dist-packages/torch/_library/abstract_impl.py", line 30, in register
    if torch._C._dispatch_has_kernel_for_dispatch_key(self.qualname, "Meta"):
RuntimeError: operator torchvision::nms does not exist
INFO | 2024-06-28 13:00:12 | autotrain.parser:run:199 - Job ID: 22375

Additional Information

No response

abhishekkrthakur commented 4 months ago

in colab, you need to update torch & torchvision. run this before everything:

pip install -U torch torchvision
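
Before and after upgrading, it can help to see which versions are actually installed. The sketch below is an illustration, not part of AutoTrain: it reads package metadata with the standard library instead of importing torchvision, since the import itself is what raises the error in the log above. The helper name `installed_version` is made up for this example.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the packages involved in this issue without importing any of them.
for name in ("torch", "torchvision", "autotrain-advanced"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

In Colab, the runtime may need a restart before a freshly installed version is the one that gets imported.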

fre2mansur commented 4 months ago

Thank you. I tried that, but I get the same error.

abhishekkrthakur commented 4 months ago

sorry, it needs to be done after.

fre2mansur commented 4 months ago

After all the blocks? Or after the first block?

abhishekkrthakur commented 4 months ago

which colab notebook are you using? :)

fre2mansur commented 4 months ago

https://colab.research.google.com/github/huggingface/autotrain-advanced/blob/main/colabs/AutoTrain_LLM.ipynb

abhishekkrthakur commented 4 months ago

change first cell to:

#@title 🤗 AutoTrain LLM
#@markdown In order to use this colab
#@markdown - upload train.csv to a folder named `data/`
#@markdown - train.csv must contain a `text` column
#@markdown - choose a project name if you wish
#@markdown - change model if you wish, you can use most of the text-generation models from Hugging Face Hub
#@markdown - add huggingface information (token) if you wish to push trained model to huggingface hub
#@markdown - update hyperparameters if you wish
#@markdown - click `Runtime > Run all` or run each cell individually
#@markdown - report issues / feature requests here: https://github.com/huggingface/autotrain-advanced/issues

import os
!pip install -U autotrain-advanced > install_logs.txt 2>&1
!pip install -U torch torchvision
!autotrain setup --colab > setup_logs.txt
from autotrain import __version__
print(f'AutoTrain version: {__version__}')

fre2mansur commented 4 months ago

It works. Thank you.

NicklasMatzulla commented 4 months ago

Hey, unfortunately, I have the same problem. I have already tried to install the package manually, but I still get the same error. Is the Colab still working for you or is the problem recurring?

Thanks for your help :)

fre2mansur commented 4 months ago

Still have the same issue.

github-actions[bot] commented 3 months ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 2 months ago

This issue was closed because it has been inactive for 20 days since being marked as stale.

frameartist commented 1 week ago

#@title 🤗 AutoTrain LLM
#@markdown In order to use this colab
#@markdown - upload train.csv to a folder named `data/`
#@markdown - train.csv must contain a `text` column
#@markdown - choose a project name if you wish
#@markdown - change model if you wish, you can use most of the text-generation models from Hugging Face Hub
#@markdown - add huggingface information (token) if you wish to push trained model to huggingface hub
#@markdown - update hyperparameters if you wish
#@markdown - click `Runtime > Run all` or run each cell individually
#@markdown - report issues / feature requests here: https://github.com/huggingface/autotrain-advanced/issues

import os
!pip install -U autotrain-advanced > install_logs.txt 2>&1
!autotrain setup --colab > setup_logs.txt
!pip install -U torch==2.4.0 torchvision
from autotrain import __version__
print(f'AutoTrain version: {__version__}')

This works for me. The pip install line needs to come after the autotrain setup line. Also, for some reason, newer versions of torch do not seem to work for me.
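
The `torchvision::nms does not exist` error typically shows up when the installed torchvision build does not match the installed torch release, which would explain why both the install order and the pinned version matter here. The sketch below checks a version pair against a small table. The table is an assumption based on the usual pairing of torch 2.x with torchvision 0.15 and later, and the helper names `minor` and `is_matched` are hypothetical, so verify against the official PyTorch compatibility list before relying on it.

```python
# Assumed pairing of torch and torchvision minor releases (torch 2.0 to 2.5 only).
# This table is an illustration; check the official compatibility list.
PAIRS = {
    "2.0": "0.15",
    "2.1": "0.16",
    "2.2": "0.17",
    "2.3": "0.18",
    "2.4": "0.19",
    "2.5": "0.20",
}


def minor(version):
    """Reduce a version such as '2.4.0+cu121' to its 'major.minor' part ('2.4')."""
    return ".".join(version.split("+")[0].split(".")[:2])


def is_matched(torch_version, torchvision_version):
    """Return True or False for a known torch release, None if it is not in the table."""
    expected = PAIRS.get(minor(torch_version))
    if expected is None:
        return None
    return minor(torchvision_version) == expected


print(is_matched("2.4.0+cu121", "0.19.0+cu121"))
print(is_matched("2.4.0", "0.18.0"))
```

Under this table, pinning `torch==2.4.0` would call for a torchvision 0.19 build; pinning both packages explicitly avoids depending on what pip resolves for the unpinned one.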