Closed omnific9 closed 1 year ago
Can you try adding task="text-generation" to the pipeline()? That's worked for a few other people. @matthayes I think this task instruction-following thing might be causing this issue.
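For reference, a minimal sketch of what that suggestion looks like in code. This assumes the standard transformers pipeline API; the bfloat16 and device_map settings are additions here to keep memory use down, not part of the original suggestion, and loading will still download the full ~23GB of weights:

```python
def build_dolly_pipeline(model_name="databricks/dolly-v2-12b"):
    """Sketch: construct the pipeline with an explicit task name.

    Requires transformers, torch, and accelerate installed, plus enough
    GPU memory for the weights.
    """
    import torch
    from transformers import pipeline

    return pipeline(
        model=model_name,
        task="text-generation",      # explicit task, per the suggestion above
        torch_dtype=torch.bfloat16,  # 16-bit weights; halves memory vs fp32
        trust_remote_code=True,      # Dolly ships custom pipeline code
        device_map="auto",
    )
```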
I updated the config a couple days ago so that the task is named “text-generation”. Was this issue happening after that change?
https://huggingface.co/databricks/dolly-v2-12b/blob/main/config.json#L7
@srowen @matthayes still getting the same error
ValueError: Could not load model databricks/dolly-v2-12b with any of the following classes: (, ).
While running the code there was also a warning. Not sure if it means something here:
Tests\venv\lib\site-packages\tqdm\auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
from .autonotebook import tqdm as notebook_tqdm
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
I downloaded the model just yesterday, so I'll bet it's using the new config, but adding this text-generation didn't make a difference. I also see that other people's errors included "AutoTokenizer" and such, but mine is just (, ), which is strange.
PS. Just in case: I only have 32GB of memory and wonder if that might be an issue, since the model itself is 23GB in size?
Tried databricks/dolly-v2-7b and this worked. Do you have a recommended memory size for each model? My guess is it's the RAM size. What do you think?
@omnific9 I think your guess is correct. At first, I tried on V100-32gb and encountered the same error as you did. Then I switched to A100-40gb, and the pipeline was successfully initialized. However, it still ran out of memory during runtime. After using bfloat16, it ran successfully.
As a rule of thumb, because this model is stored in 16-bit, you will need 2 × (number of parameters) bytes of memory, plus memory for the data and so on. 12B won't quite work on 24GB; it should work on 32GB. 8-bit requires half that memory. If you're accidentally using fp32, the memory requirement doubles.
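That rule of thumb is simple enough to express as a small helper. This is just the arithmetic from the comment above (weights only; it deliberately ignores activations, KV cache, and framework overhead, which is why 24GB is still too tight for the 12B model):

```python
def est_weight_gib(n_params, bytes_per_param=2):
    """Estimate weight memory in GiB.

    bytes_per_param: 2 for fp16/bf16, 4 for fp32, 1 for int8.
    Excludes activations, KV cache, and runtime overhead.
    """
    return n_params * bytes_per_param / 2**30

# dolly-v2-12b in 16-bit: ~22.4 GiB of weights alone
print(round(est_weight_gib(12e9), 1))
# dolly-v2-7b in 16-bit: ~13.0 GiB, which fits a 24GB card comfortably
print(round(est_weight_gib(7e9), 1))
```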
@LuckyyySTA @srowen okay so it's still mainly the GPU not the regular DDR4 RAM? (I'm using my personal rig, not a cloud platform here). So if I upgrade my GPU further, I can still get away with like 32GB regular DDR4 RAM?
And this might be a dumb question as I'm new to a lot of this, if I were to finetune dolly-v2-7b, would it require a few times more memory in the GPU?
No, you need GPU RAM. Fine tuning is a pretty different question but broadly it will take more resources than inference.
Running the model on a Windows computer with an RTX 4090, Python version == 3.10.10
code run:
Error
pip freeze: