artidoro / qlora

QLoRA: Efficient Finetuning of Quantized LLMs
https://arxiv.org/abs/2305.14314
MIT License

OverflowError: out of range integral type conversion attempted while running python qlora.py #18

Open amdnsr opened 1 year ago

amdnsr commented 1 year ago

python qlora.py --model_name_or_path decapoda-research/llama-13b-hf

(I have updated tokenizer_config.json and config.json as per the various discussions here, setting tokenizer_class: LlamaTokenizer and architectures: LlamaForCausalLM.)

==================================================================================

adding LoRA modules...
trainable params: 125173760.0 || all params: 6922327040 || trainable: 1.8082612866554193
loaded model

Using pad_token, but it is not set yet.
Traceback (most recent call last):
  File "qlora.py", line 758, in <module>
    train()
  File "qlora.py", line 620, in train
    "unk_token": tokenizer.convert_ids_to_tokens(model.config.pad_token_id),
  File "/home/envs/qlora_env/lib/python3.8/site-packages/transformers/tokenization_utils_fast.py", line 307, in convert_ids_to_tokens
    return self._tokenizer.id_to_token(ids)
OverflowError: out of range integral type conversion attempted

amdnsr commented 1 year ago

Running on Tesla V100 32GB GPU

MaticsL commented 1 year ago

same issue

Update: replacing model.config.pad_token_id with 0 in the call below should fix this problem, but it may harm training.

        tokenizer.add_special_tokens(
            {
                "eos_token": tokenizer.convert_ids_to_tokens(model.config.eos_token_id),
                "bos_token": tokenizer.convert_ids_to_tokens(model.config.bos_token_id),
                "unk_token": tokenizer.convert_ids_to_tokens(0),
            }
        )
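
A possible refinement of this workaround, sketched under assumptions: the traceback above fails when model.config.pad_token_id is passed to tokenizer.convert_ids_to_tokens, which suggests the configured id lies outside the range the tokenizer accepts (the thread does not state its actual value). The helper below, pick_valid_token_id, is a hypothetical name and is not part of qlora.py or transformers. It falls back to another id only when the configured one is missing or out of range, instead of hard-coding 0 in every case.

```python
from typing import Optional


def pick_valid_token_id(token_id: Optional[int], vocab_size: int, fallback: int = 0) -> int:
    """Return token_id if it indexes the vocabulary, otherwise the fallback id."""
    if token_id is None or not 0 <= token_id < vocab_size:
        return fallback
    return token_id


# Illustrative values only: a negative id, a missing id, and a valid id.
print(pick_valid_token_id(-1, vocab_size=32000))    # out of range, uses the fallback
print(pick_valid_token_id(None, vocab_size=32000))  # not set, uses the fallback
print(pick_valid_token_id(2, vocab_size=32000))     # already valid, kept as-is
```

In the snippet above, the result could be passed to tokenizer.convert_ids_to_tokens in place of the literal 0, with len(tokenizer) as vocab_size. Whether id 0 is a sensible fallback depends on the tokenizer's vocabulary.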

LIO-H-ZEN commented 1 year ago

same issue

Qubitium commented 1 year ago

Please check whether this PR fixes your issue. The PR was designed to fix something else, but it also bypasses token conversion if the tokenizer already contains the special tokens.

https://github.com/artidoro/qlora/pull/20

ricksun2023 commented 1 year ago

same issue

atillabasaran commented 1 year ago

In reply to Qubitium's suggestion above to try PR #20:

This solved my issue, but I am now getting a maximum recursion depth error.

  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 257, in _convert_token_to_id_with_added_voc
    return self.unk_token_id
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1142, in unk_token_id
    return self.convert_tokens_to_ids(self.unk_token)
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 250, in convert_tokens_to_ids
    return self._convert_token_to_id_with_added_voc(tokens)
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 257, in _convert_token_to_id_with_added_voc
    return self.unk_token_id
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1142, in unk_token_id
    return self.convert_tokens_to_ids(self.unk_token)
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 250, in convert_tokens_to_ids
    return self._convert_token_to_id_with_added_voc(tokens)
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 257, in _convert_token_to_id_with_added_voc
    return self.unk_token_id
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1142, in unk_token_id
    return self.convert_tokens_to_ids(self.unk_token)
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 250, in convert_tokens_to_ids
    return self._convert_token_to_id_with_added_voc(tokens)
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 257, in _convert_token_to_id_with_added_voc
    return self.unk_token_id
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1142, in unk_token_id
    return self.convert_tokens_to_ids(self.unk_token)
RecursionError: maximum recursion depth exceeded

LIO-H-ZEN commented 1 year ago

same...
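
The recursion traceback above alternates between unk_token_id and convert_tokens_to_ids, which points to a lookup that falls back to the unknown token while the unknown token itself cannot be found. The toy class below is a simplified stand-in, not the transformers implementation; ToyTokenizer and its vocabularies are made up for illustration. It reproduces the same loop and shows that it only terminates when unk_token is actually in the vocabulary.

```python
class ToyTokenizer:
    """Minimal stand-in that mimics the fallback pattern seen in the traceback."""

    def __init__(self, vocab, unk_token):
        self.vocab = vocab
        self.unk_token = unk_token

    @property
    def unk_token_id(self):
        # Resolving the unknown token goes through the normal lookup.
        return self.convert_tokens_to_ids(self.unk_token)

    def convert_tokens_to_ids(self, token):
        index = self.vocab.get(token)
        if index is None:
            # A missing token falls back to the unknown token's id.
            return self.unk_token_id
        return index


ok = ToyTokenizer({"<unk>": 0, "a": 1}, unk_token="<unk>")
print(ok.convert_tokens_to_ids("zzz"))  # falls back to the id of "<unk>"

broken = ToyTokenizer({"a": 1}, unk_token="<unk>")
try:
    broken.convert_tokens_to_ids("zzz")
except RecursionError:
    print("unk_token is missing from the vocabulary, so the fallback never ends")
```

If this is what happens here, checking that the tokenizer's unk_token maps to a real vocabulary entry before training would be a reasonable first diagnostic.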

ghtaro commented 1 year ago

Hi, I switched to huggyllama/llama-7b and applied the change from #20. That avoided the errors above, but now I get the one below:

Traceback (most recent call last):
  File "/Workspace/Repos/toshiya.imoto@japan-d2.com/qlora/qlora.py", line 853, in <module>
    train()
  File "/Workspace/Repos/toshiya.imoto@japan-d2.com/qlora/qlora.py", line 824, in train
    metrics = trainer.evaluate(metric_key_prefix="eval")
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-7966afd1-6600-45a5-a135-50b716d0345e/lib/python3.10/site-packages/transformers/trainer_seq2seq.py", line 159, in evaluate
    return super().evaluate(eval_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-7966afd1-6600-45a5-a135-50b716d0345e/lib/python3.10/site-packages/transformers/trainer.py", line 3108, in evaluate
    self.control = self.callback_handler.on_evaluate(self.args, self.state, self.control, output.metrics)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-7966afd1-6600-45a5-a135-50b716d0345e/lib/python3.10/site-packages/transformers/trainer_callback.py", line 379, in on_evaluate
    return self.call_event("on_evaluate", args, state, control, metrics=metrics)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-7966afd1-6600-45a5-a135-50b716d0345e/lib/python3.10/site-packages/transformers/trainer_callback.py", line 397, in call_event
    result = getattr(callback, event)(
  File "/Workspace/Repos/toshiya.imoto@japan-d2.com/qlora/qlora.py", line 751, in on_evaluate
    refs += [abcd_idx.index(label) for label in labels.tolist()]
  File "/Workspace/Repos/toshiya.imoto@japan-d2.com/qlora/qlora.py", line 751, in <listcomp>
    refs += [abcd_idx.index(label) for label in labels.tolist()]
ValueError: 29879 is not in list

I found that:

abcd_idx:  [319, 350, 315, 360]

labels tensor([  319, 29879,   350, 29879,   319, 29879,   315, 29879,   315, 29879,
          360, 29879,   350, 29879,   360, 29879], device='cuda:0')

Does anyone have an idea how to sort this out?
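
The values above are enough to reproduce the failure without a GPU: list.index raises ValueError for any label that is not one of the four ids in abcd_idx, and 29879 appears at every second position. The sketch below uses plain lists instead of tensors and simply skips labels outside abcd_idx. This is a diagnostic aid under that assumption, not a confirmed fix; the thread does not establish why 29879 ends up in the labels, and dropping it may hide a preprocessing problem.

```python
# Values copied from the report above.
abcd_idx = [319, 350, 315, 360]
labels = [319, 29879, 350, 29879, 319, 29879, 315, 29879,
          315, 29879, 360, 29879, 350, 29879, 360, 29879]

answer_ids = set(abcd_idx)

# Keep only labels that are answer-choice ids, then map them to positions 0..3.
refs = [abcd_idx.index(label) for label in labels if label in answer_ids]

# Collect the ids that would have triggered the ValueError.
skipped = sorted({label for label in labels if label not in answer_ids})

print(refs)
print(skipped)
```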

atillabasaran commented 1 year ago

Regarding MaticsL's update above (passing 0 instead of model.config.pad_token_id when setting "unk_token"):

What exactly does this change do?

mofanv commented 1 year ago

same issue

hemangjoshi37a commented 1 year ago

@amdnsr,

Based on the error message you shared, it appears that there is an "OverflowError: out of range integral type conversion attempted" when converting token IDs during tokenization. To address this issue, we recommend the following solution:

  1. Update your code as follows:

        from transformers import LlamaTokenizer

        tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-13b-hf")
        tokenizer.add_special_tokens({"pad_token": "[PAD]"})

        # Rest of your code...

By using the LlamaTokenizer from the transformers library and adding the [PAD] token as a special token, you can resolve the "out of range integral type conversion" error.

Best regards, @hemangjoshi37a
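
A note on the suggestion above, with a hedged sketch: adding "[PAD]" as a new special token can grow the vocabulary, so the model's embedding matrix usually has to be resized to match, and the model config should point at the new id. The helper below, ensure_pad_token, is a hypothetical name and not part of qlora.py. It assumes tokenizer and model are a Hugging Face tokenizer and model that have already been loaded, and it only uses add_special_tokens, resize_token_embeddings, and the pad_token / pad_token_id attributes.

```python
def ensure_pad_token(tokenizer, model, pad_token="[PAD]"):
    """Add a pad token if none is set and keep the model in sync with the tokenizer."""
    if tokenizer.pad_token is None:
        # add_special_tokens returns how many tokens were newly added to the vocabulary.
        num_added = tokenizer.add_special_tokens({"pad_token": pad_token})
        if num_added > 0:
            # New vocabulary entries need matching rows in the embedding matrix.
            model.resize_token_embeddings(len(tokenizer))
    # Point the model config at the tokenizer's pad id.
    model.config.pad_token_id = tokenizer.pad_token_id
    return tokenizer.pad_token_id
```

It would be called once after loading, for example ensure_pad_token(tokenizer, model), before building the trainer.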

saxenarohit commented 1 year ago

Setting pad_token_id in the model config worked for me.

For example, for Vicuna: model.config.pad_token_id = tokenizer.eos_token_id