databrickslabs / dolly

Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html
Apache License 2.0

ValueError: Could not load model databricks/dolly-v2-12b with any of the following classes: (, ). #77

Closed · omnific9 closed 1 year ago

omnific9 commented 1 year ago

Running the model on a Windows computer with an RTX 4090; Python version == 3.10.10

Code run:

import torch
from transformers import pipeline

generate_text = pipeline(
    model="databricks/dolly-v2-12b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

Error:

--> 266         raise ValueError(f"Could not load model {model} with any of the following classes: {class_tuple}.")
    268 framework = "tf" if model.__class__.__name__.startswith("TF") else "pt"
    269 return framework, model

ValueError: Could not load model databricks/dolly-v2-12b with any of the following classes: (, ).

pip freeze:

accelerate==0.18.0
aiohttp==3.8.4
aiosignal==1.3.1
asttokens==2.2.1
async-timeout==4.0.2
attrs==22.2.0
backcall==0.2.0
certifi==2022.12.7
charset-normalizer==3.1.0
colorama==0.4.6
comm==0.1.3
datasets==2.11.0
debugpy==1.6.7
decorator==5.1.1
dill==0.3.6
executing==1.2.0
filelock==3.11.0
frozenlist==1.3.3
fsspec==2023.4.0
huggingface-hub==0.13.4
idna==3.4
ipykernel==6.22.0
ipython==8.12.0
jedi==0.18.2
Jinja2==3.1.2
joblib==1.2.0
jupyter_client==8.2.0
jupyter_core==5.3.0
MarkupSafe==2.1.2
matplotlib-inline==0.1.6
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.14
nest-asyncio==1.5.6
networkx==3.1
numpy==1.24.2
packaging==23.1
pandas==2.0.0
parso==0.8.3
pickleshare==0.7.5
Pillow==9.3.0
platformdirs==3.2.0
prompt-toolkit==3.0.38
psutil==5.9.4
pure-eval==0.2.2
pyarrow==11.0.0
Pygments==2.15.0
python-dateutil==2.8.2
pytz==2023.3
pywin32==306
PyYAML==6.0
pyzmq==25.0.2
regex==2023.3.23
requests==2.28.2
responses==0.18.0
scikit-learn==1.2.2
scipy==1.10.1
six==1.16.0
stack-data==0.6.2
sympy==1.11.1
threadpoolctl==3.1.0
tokenizers==0.13.3
torch==2.0.0+cu117
torchaudio==2.0.1+cu117
torchinfo==1.7.2
torchvision==0.15.1+cu117
tornado==6.2
tqdm==4.65.0
traitlets==5.9.0
transformers==4.25.1
typing_extensions==4.5.0
tzdata==2023.3
urllib3==1.26.15
wcwidth==0.2.6
xxhash==3.2.0
yarl==1.8.2
srowen commented 1 year ago

Can you try adding task="text-generation" to the pipeline() call? That has worked for a few other people. @matthayes I think the custom instruction-following task name in the config might be causing this issue.
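
For concreteness, a minimal sketch of the suggested call (same arguments as the snippet above, with the task name passed explicitly so pipeline() does not have to infer it from the model's config):

import torch
from transformers import pipeline

# Suggested variant: pass the task explicitly instead of relying on
# the task declared in the model repo's config.json.
generate_text = pipeline(
    model="databricks/dolly-v2-12b",
    task="text-generation",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)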

matthayes commented 1 year ago

I updated the config a couple of days ago so that the task is named “text-generation”. Did this issue occur after that change?

https://huggingface.co/databricks/dolly-v2-12b/blob/main/config.json#L7

omnific9 commented 1 year ago

@srowen @matthayes still getting the same error

ValueError: Could not load model databricks/dolly-v2-12b with any of the following classes: (, ).

While running the code there was also a warning; not sure if it's relevant here:

Tests\venv\lib\site-packages\tqdm\auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.

I downloaded the model just yesterday, so it should be using the new config, but adding task="text-generation" didn't make a difference. I also see that other people's errors included class names like "AutoTokenizer", but mine is just (, ), which is strange.

P.S. Just in case: I only have 32GB of memory, and I wonder if that might be an issue since the model itself is about 23GB.

omnific9 commented 1 year ago

Tried databricks/dolly-v2-7b and it worked, so my guess is it's the RAM size. Do you have a recommended memory size for each model? What do you think?

LuckyyySTA commented 1 year ago

@omnific9 I think your guess is correct. At first I tried a V100 (32GB) and encountered the same error you did. Then I switched to an A100 (40GB), and the pipeline initialized successfully; however, it still ran out of memory at runtime. After switching to bfloat16, it ran successfully.

srowen commented 1 year ago

As a rule of thumb, because this model is stored in 16-bit, you will need 2 × (number of parameters) bytes of memory for the weights, plus memory for the data and so on. 12B won't quite work on 24GB but should work on 32GB. 8-bit requires half the memory, and if you're accidentally loading in fp32, the memory requirements double.
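
To make the rule of thumb concrete, a quick back-of-the-envelope sketch (parameter counts are approximate, based on the underlying pythia checkpoints; actual usage adds activations, KV cache, and CUDA overhead on top of the weights):

# Weights-only memory estimate: bytes_per_param x num_params.
# Parameter counts are approximate (dolly-v2-12b is based on pythia-12b,
# dolly-v2-7b on pythia-6.9b).
PARAMS = {"dolly-v2-7b": 6.9e9, "dolly-v2-12b": 12.0e9}
BYTES_PER_PARAM = {"fp32": 4, "bf16/fp16": 2, "int8": 1}

for model_name, n_params in PARAMS.items():
    for dtype, nbytes in BYTES_PER_PARAM.items():
        print(f"{model_name:<14} {dtype:<10} ~{n_params * nbytes / 1e9:.0f} GB")

By this estimate, dolly-v2-12b needs about 24GB for the weights alone in 16-bit, which is why a 24GB card falls just short once activations are counted.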

omnific9 commented 1 year ago

@LuckyyySTA @srowen Okay, so it's still mainly the GPU memory, not the regular DDR4 RAM? (I'm using my personal rig, not a cloud platform.) So if I upgrade my GPU further, can I still get away with something like 32GB of regular DDR4 RAM?

And this might be a dumb question, as I'm new to a lot of this: if I were to fine-tune dolly-v2-7b, would it require several times more GPU memory?

srowen commented 1 year ago

No, you need GPU RAM. Fine-tuning is a pretty different question, but broadly it will take more resources than inference.
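
Since 8-bit loading was mentioned above as a way to roughly halve the weight memory, a minimal sketch of loading the 12B model in 8-bit, assuming bitsandbytes and accelerate are installed (load_in_8bit is forwarded to from_pretrained via model_kwargs):

from transformers import pipeline

# Sketch: load dolly-v2-12b with 8-bit weights to roughly halve the GPU
# memory needed for the weights. Requires `bitsandbytes` and `accelerate`;
# do not also pass torch_dtype when loading in 8-bit.
generate_text = pipeline(
    model="databricks/dolly-v2-12b",
    task="text-generation",
    trust_remote_code=True,
    device_map="auto",
    model_kwargs={"load_in_8bit": True},
)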