maywind23 opened this issue 1 year ago
I am also facing the same issue; I am running on an AWS g5.8xlarge. Were you able to solve it?
Since LlamaTokenizer does not define a value for `tokenizer.pad_token_id`, it is `None`. When this tokenizer is used to compute the `labels`, it raises `TypeError: 'NoneType' object cannot be interpreted as an integer`.
I assigned the value 0 to `tokenizer.pad_token_id` and that error was resolved, but then I got another error: `Llama.forward() got an unexpected keyword argument 'labels'`.
To resolve this, I replaced

```python
model = AutoModel.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True,
)
```

with

```python
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True,
    device_map="auto",
)
```

and it worked fine.
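For completeness, here is a minimal sketch putting both fixes together (the checkpoint name and the 4-bit `q_config` are assumptions mirroring this thread's setup, not the project's exact code):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "daryl149/llama-2-7b-chat-hf"  # assumed checkpoint from this thread
q_config = BitsAndBytesConfig(              # illustrative 4-bit QLoRA config
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token_id = 0  # Llama ships without a pad token; id 0 is <unk>

# AutoModelForCausalLM attaches the LM head whose forward() accepts `labels`
# and computes the loss; the bare AutoModel (LlamaModel) does not.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True,
    device_map="auto",
)
```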
TL;DR: add `tokenizer.pad_token_id = 0` in your code.
The main problem is that the code (as suggested by ChatGPT) relies on `pad_token_id`, which the Meta LLaMA model doesn't use. But if we look a little closer into `special_tokens_map.json`, we can see the following lines:
"pad_token": "<unk>",
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
So as the `pad_token` we can use the id of the `unk_token`, which is equal to 0. To solve the problem with `None`, initialise the `pad_token_id` field in the tokenizer with the value 0:

```python
tokenizer.pad_token_id = 0
```
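Equivalently, a small sketch that derives the id instead of hard-coding it (assuming a standard Hugging Face tokenizer; the guard makes it a no-op for models that already define a pad token):

```python
# Sketch: reuse <unk> as the padding token when none is defined.
if tokenizer.pad_token_id is None:
    # special_tokens_map.json maps pad_token to <unk>, and unk_token_id
    # is 0 for the Llama tokenizer, so this equals pad_token_id = 0.
    tokenizer.pad_token_id = tokenizer.unk_token_id
```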
Thanks for your awesome project. I reproduced FinGPT v3.1.2 (4-bit QLoRA). It does work with the default LLM "chatglm2" on Colab, but it comes to a halt when I want to get better results with Llama2.
I have changed the model as per your instructions, modifying `model_name = "THUDM/chatglm2-6b"` to `model_name = "daryl149/llama-2-7b-chat-hf"`.
Then I removed the device setting due to a runtime error.
Changed the `target_modules` to llama:

```python
target_modules = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING['llama']
```
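For context, a minimal sketch of how that mapping plugs into a LoRA setup (the `r`/`lora_alpha`/`lora_dropout` values are illustrative, not the notebook's exact hyperparameters):

```python
from peft import LoraConfig, TaskType, get_peft_model
from peft.utils import TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING

# For 'llama' the mapping resolves to ["q_proj", "v_proj"] in recent peft releases.
target_modules = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING['llama']

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                 # illustrative values
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=target_modules,
)
model = get_peft_model(model, lora_config)  # `model` loaded as above
```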
Unfortunately, the final step raised `TypeError: 'NoneType' object cannot be interpreted as an integer`.
The detailed error is as follows: