AI4Finance-Foundation / FinGPT

FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.
https://ai4finance.org
MIT License

It does not work when I try to convert the default model "chatglm2" to "llama2" #64

Open maywind23 opened 1 year ago

maywind23 commented 1 year ago

Thanks for your awesome project. I reproduced FinGPT v3.1.2 (4-bit QLoRA). It works with the default LLM "chatglm2" on Colab, but it fails with the error below when I try to get better results with Llama2.

```
6 frames

in data_collator(features)
     37         ids = ids + [tokenizer.pad_token_id] * (longest - ids_l)
     38         _ids = torch.LongTensor(ids)
---> 39         labels_list.append(torch.LongTensor(labels))
     40         input_ids.append(_ids)
     41     input_ids = torch.stack(input_ids)

TypeError: 'NoneType' object cannot be interpreted as an integer
```

Could you please help me resolve this issue? Looking forward to your reply! (Platform: A100 on Google Colab)
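For reference, the traceback can be reproduced with plain torch and made-up token ids; the None values come from the tokenizer's missing pad token, as the replies below explain:

```python
import torch

# The padding value comes from tokenizer.pad_token_id, which the Llama tokenizer leaves unset (None).
pad_token_id = None
labels = [101, 102, 103] + [pad_token_id] * 2   # same padding pattern as the collator
torch.LongTensor(labels)  # TypeError: 'NoneType' object cannot be interpreted as an integer
```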
rajendrac3 commented 1 year ago

I am also facing the same issue. I am running on AWS g5.8xlarge. Were you able to solve it?

rajendrac3 commented 1 year ago

Since LlamaTokenizer does not set tokenizer.pad_token_id, its value is None. When this tokenizer is used to build the padded 'labels', it raises TypeError: 'NoneType' object cannot be interpreted as an integer.
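A quick way to confirm this (the checkpoint id below is a placeholder, not necessarily the one used in the notebook):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder checkpoint
print(tokenizer.pad_token_id)   # None -> padding the label lists inserts None values
print(tokenizer.unk_token_id)   # 0
```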

I assigned the value 0 to tokenizer.pad_token_id and the above error was resolved, but then got another error: Llama.forward() got an unexpected keyword argument 'labels'.

To resolve this, I replaced

```python
model = AutoModel.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True
)
```

with

```python
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True,
    device_map="auto"
)
```

and it worked fine.
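Putting the two changes together, a minimal sketch of the working setup (the checkpoint id is a placeholder and q_config mirrors the notebook's 4-bit BitsAndBytesConfig; adjust both to your run):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_name = "meta-llama/Llama-2-7b-hf"   # placeholder checkpoint id

# 4-bit quantization config, in the spirit of the notebook's q_config
q_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token_id = 0   # Llama ships without a pad id; reuse the <unk> id

# AutoModelForCausalLM returns a model whose forward() accepts `labels`,
# unlike the bare AutoModel used for chatglm2.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True,
    device_map="auto",
)
```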

IshchenkoRoman commented 7 months ago

TL;DR: add tokenizer.pad_token_id = 0 in your code

The main problem is that the FinGPT code, written for chatglm2, relies on pad_token_id, which the Meta Llama model does not set. But if we look a little closer at special_tokens_map.json, we can see the following lines:

  "pad_token": "<unk>",
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }

So as the pad token we can use the id of unk_token, which is equal to 0. To solve the None problem, initialise the tokenizer's pad_token_id field with the value 0: tokenizer.pad_token_id = 0
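If you prefer not to hard-code the id, pointing pad_token at the unk_token entry from special_tokens_map.json gives the same result (placeholder checkpoint id below):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder checkpoint
tokenizer.pad_token = tokenizer.unk_token   # "<unk>", per special_tokens_map.json
assert tokenizer.pad_token_id == 0          # same effect as tokenizer.pad_token_id = 0
```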