Closed: abhishektcs1 closed this issue 1 year ago.
Hi @abhishektcs1, thanks for reporting this issue!
Could you provide information about the running environment: run `transformers-cli env` in the terminal and copy-paste the output?
2023-05-13 05:36:18.558293: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
WARNING:tensorflow:From /usr/local/lib/python3.10/dist-packages/transformers/commands/env.py:63: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.config.list_physical_devices('GPU')
instead.
2023-05-13 05:36:22.918424: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:47] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.
- `transformers` version: 4.29.1
Please find the attached nvidia-smi output of Google Colab Pro, which I am using.
@abhishektcs1 @sanyoggupta Could either of you also share a full traceback of the error encountered (the entire error message, from the first lines), preferably as a copy-paste of the text rather than a screenshot please?
Hey, I am getting a similar error when I try out my code. This is the traceback:
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Traceback (most recent call last):
File "/home/ksuresh6/DataChat_Project/model.py", line 20, in <module>
model = load_checkpoint_and_dispatch(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/hulab/ksuresh6/anaconda3/envs/datachat_env/lib/python3.11/site-packages/accelerate/big_modeling.py", line 479, in load_checkpoint_and_dispatch
load_checkpoint_in_model(
File "/data/hulab/ksuresh6/anaconda3/envs/datachat_env/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 982, in load_checkpoint_in_model
raise ValueError(f"{param_name} doesn't have any device set.")
ValueError: decoder.transformer.h.7.attn.causal_mask doesn't have any device set.
This is the code I am trying out:
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, AutoConfig
import torch
from accelerate import init_empty_weights, load_checkpoint_and_dispatch

checkpoint = "Salesforce/instructcodet5p-16b"
device = "cuda"  # for GPU usage or "cpu" for CPU usage
model_path = '/home/ksuresh6/.cache/huggingface/hub/models--Salesforce--instructcodet5p-16b/snapshots/b5aaae8f54e8f13897e395fbc4c22567df0399ef'

tokenizer = AutoTokenizer.from_pretrained(model_path)
config = AutoConfig.from_pretrained(checkpoint, torch_dtype=torch.float16, low_cpu_mem_usage=True, trust_remote_code=True)
with init_empty_weights():
    model = AutoModelForSeq2SeqLM.from_config(config, trust_remote_code=True, torch_dtype=torch.float16)
model.tie_weights()
model = load_checkpoint_and_dispatch(
    model, model_path, device_map="auto"
)

inputs = tokenizer.encode("def print_hello():", return_tensors="pt").to(device)
outputs = model.generate(inputs, max_length=12)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
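(A hedged aside, not a confirmed fix from this thread: errors like "X doesn't have any device set" often come from the device map splitting inside a block whose buffers, such as `attn.causal_mask`, then end up with no assigned device. One possible workaround is to keep whole decoder blocks together via `no_split_module_classes`; the block class name below is hypothetical and would need to be read off `print(model)` for this remote-code checkpoint.)

# Hedged sketch: keep each decoder block on a single device so buffers such as
# attn.causal_mask stay with their parent module. "CodeT5pBlock" is a
# placeholder name; check print(model) for the real block class of this model.
model = load_checkpoint_and_dispatch(
    model,
    model_path,
    device_map="auto",
    no_split_module_classes=["CodeT5pBlock"],  # hypothetical class name
)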
This is the output of `transformers-cli env`:
Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.
- `transformers` version: 4.26.1
- Platform: Linux-5.19.0-41-generic-x86_64-with-glibc2.35
- Python version: 3.10.6
- Huggingface_hub version: 0.12.1
- PyTorch version (GPU?): 1.13.1+cu117 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: Yes
Any help is appreciated! Thanks in advance!
Hi, I am facing the same issue; this is what I get after executing `!transformers-cli env`.
Please help me out with this problem. Thank you!
@younesbelkada could this be the same bug you fixed on NLLB here? I see the no_split_module_class is also the attention layer.
Hmm, this sounds more like you are using the infer auto device map in an inappropriate way indeed. You should put `M2M100EncoderLayer` and `M2M100DecoderLayer` inside `_no_split_modules`. Could you try again with these new values? Also, can you share a handy reproducible snippet with us? 🙏
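For concreteness, a minimal sketch of that suggestion (same imports as the snippet in the issue body; the layer class names are the ones mentioned above):

from accelerate import infer_auto_device_map, init_empty_weights
from transformers import AutoModel, M2M100Config

config = M2M100Config.from_pretrained("facebook/m2m100-12B-last-ckpt")
with init_empty_weights():
    model = AutoModel.from_config(config)

# Split between whole encoder/decoder layers rather than inside the attention,
# as suggested above.
device_map = infer_auto_device_map(
    model,
    no_split_module_classes=["M2M100EncoderLayer", "M2M100DecoderLayer"],
)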
Thank you, I got it. @sgugger, you have posted great documentation on Hugging Face on how to run these large models on our devices.
Please help me out: what values should I pass in `no_split_modules`? Thank you!
These are the model layers.
Hi @anujsahani01, can you try to put `GPTBigCodeBlock` in no split modules?
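A minimal sketch of that suggestion, assuming the same `infer_auto_device_map` flow as in the earlier snippets (the `model` variable is whichever GPTBigCode-based checkpoint is being loaded):

# Hedged sketch: treat each GPTBigCodeBlock as indivisible when inferring the map.
device_map = infer_auto_device_map(
    model,
    no_split_module_classes=["GPTBigCodeBlock"],
)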
Yes, it worked. Thank you!
Hey, I was having one more doubt; please help me with this. I am finetuning the Hugging Face "HuggingFaceH4/starchat-alpha" model to make a data-science text-to-code generating bot. This is the format of my dataset:
DatasetDict({
    train: Dataset({ features: ['input_ids', 'labels'], num_rows: 5012 })
    test: Dataset({ features: ['input_ids', 'labels'], num_rows: 1325 })
})
The structure of the data looks somewhat like this, as explained in the StarCoder documentation:
<|system|> Below is a dialogue between a human and an ANUJ_AI <|end|>
<|user|> Minimum count of ind… so on <|end|>
<|assistant|> def possible ( x , S , N ) : …so on <|end|>
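(For illustration only, a small hedged sketch of how one such training example could be assembled as a plain string before tokenization; the `<|...|>` markers are the ones quoted above and the helper itself is not from the thread.)

# Hypothetical helper: build one StarChat-style dialogue example from the
# pieces described above. The special tokens must match the checkpoint's
# tokenizer configuration.
def build_example(system: str, user: str, assistant: str) -> str:
    return (
        f"<|system|>\n{system}\n<|end|>\n"
        f"<|user|>\n{user}\n<|end|>\n"
        f"<|assistant|>\n{assistant}\n<|end|>"
    )

text = build_example(
    "Below is a dialogue between a human and an ANUJ_AI",
    "Minimum count of ind...",       # truncated in the original description
    "def possible ( x , S , N ) :",  # truncated in the original description
)
print(text)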
I am loading the model on my Colab in 8-bit format using 🤗 Transformers' BitsAndBytesConfig to save memory, then I loaded the model using a device map made with 🤗 Transformers AutoConfig and Accelerate, which divided my model among the GPU, CPU RAM, and my disk.
Once the model and its checkpoints were downloaded successfully, I used `transformers.Trainer` to train the model on my custom dataset with the code below, but I am always getting this error: `Cannot copy out of meta tensor; no data!`
Your inputs will be highly appreciated. Thank You!
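(One quick, hedged diagnostic for "Cannot copy out of meta tensor; no data!" is to check whether some weights never left the meta device they were created on under `init_empty_weights`; this is not a fix from the thread, just a way to narrow down the cause.)

# Hedged diagnostic: list parameters that are still on the meta device,
# i.e. allocated without real data. Training a model in this state raises
# the "Cannot copy out of meta tensor" error.
meta_params = [name for name, p in model.named_parameters() if p.is_meta]
print(f"{len(meta_params)} parameters still on the meta device")
print(meta_params[:10])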
Hi @anujsahani01, thanks! Could you explain in a bit more detail how you train the 8-bit model? Are you sure you are using adapters leveraging the PEFT library? Maybe if you can share the full snippet I can help you more on that! 💪
I have updated the Colab notebook: https://drive.google.com/file/d/1-ccrx1Q5tkLUYtZBGi5lNZGjPMyr_X9U/view?usp=sharing
I am not using the 8-bit model now. I am using the 🤗 accelerate tool to initialize the model, and then I am loading the model weights with load_checkpoint_and_dispatch. But it is giving me this error: ValueError: offload is not a folder containing a .index.json file.
I am not able to understand what exactly the error is. Please have a look at the snip, which shows the offload folder and the error.
Please help me out with this error; it would be a great help. Your inputs will be highly appreciated. Thank you!
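(A hedged guess at the "offload is not a folder containing a .index.json file" error, not a confirmed diagnosis: `load_checkpoint_and_dispatch` expects its `checkpoint` argument to be the folder holding the weight shards and their `.index.json`, while the offload directory is passed separately. The `snapshot_path` name below is illustrative and stands for wherever the downloaded checkpoint actually lives.)

# Hedged sketch: point `checkpoint` at the model snapshot folder (which contains
# the *.bin / *.safetensors files and their .index.json), and use "offload"
# only as the scratch directory for offloaded weights.
model = load_checkpoint_and_dispatch(
    model,
    checkpoint=snapshot_path,     # illustrative: path to the downloaded snapshot
    device_map="auto",
    offload_folder="offload",     # scratch space, not the checkpoint itself
)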
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info
I am getting the following error while using accelerate for M2M100 on Google Colab Pro. Following is the code snippet:
import torch
from transformers import AutoConfig, AutoModel, M2M100Config, M2M100ForConditionalGeneration, M2M100Tokenizer
from accelerate import infer_auto_device_map, init_empty_weights

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

config = M2M100Config.from_pretrained("facebook/m2m100-12B-last-ckpt")
with init_empty_weights():
    model = AutoModel.from_config(config)

device_map = infer_auto_device_map(model, no_split_module_classes=["M2M100Attention"])
checkpoint = "facebook/m2m100-12B-last-ckpt"
device_map["shared"] = "cpu"
device_map["encoder"] = "cpu"
device_map["decoder.embed_tokens"] = "cpu"
device_map["decoder.embed_positions"] = "cpu"
device_map["decoder.layers.0"] = "cpu"
device_map["decoder.layers.1"] = "cpu"
device_map["decoder.layers.2"] = "cpu"
device_map["decoder.layers.3"] = "cpu"

model = M2M100ForConditionalGeneration.from_pretrained(
    checkpoint, device_map=device_map, offload_folder="offload", offload_state_dict=True
)
Following are the env specs:
- Model link: https://huggingface.co/facebook/m2m100-12B-last-ckpt
- Python version: 3.10
- GPU: A100 (40 GB)
- RAM: 83.5 GB
- CUDA version: 12.0
Who can help?
No response
Information
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
Reproduction
Expected behavior
Expecting the model to load properly, after which the following code is to be used for translation:
hi_text='''La vie est comme une boîte de chocolat.'''
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100-12B-last-ckpt")
encoded_hi = tokenizer(hi_text, return_tensors="pt").to('cuda')
generated_tokens = model.generate(**encoded_hi, forced_bos_token_id=tokenizer.get_lang_id("en"))
print(tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0])