AI4Finance-Foundation / FinGPT

FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.
https://ai4finance.org
MIT License

It does not work when I try to convert the default model "chatglm2" to "llama2" #64

Open maywind23 opened 1 year ago

maywind23 commented 1 year ago

Thanks for your awesome project. I reproduced FinGPT v3.1.2 (4-bit QLoRA). It works with the default LLM "chatglm2" on Colab, but it fails with the error below when I try to get better results with Llama2.

```
6 frames

in data_collator(features)
     37         ids = ids + [tokenizer.pad_token_id] * (longest - ids_l)
     38         _ids = torch.LongTensor(ids)
---> 39         labels_list.append(torch.LongTensor(labels))
     40         input_ids.append(_ids)
     41     input_ids = torch.stack(input_ids)

TypeError: 'NoneType' object cannot be interpreted as an integer
```

Could you please help me resolve this issue? Looking forward to your reply! (Platform: A100 on Google Colab)
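For reference, the traceback can be reproduced with plain torch and made-up token ids; the None values come from the tokenizer's missing pad token, as the replies below explain:

```python
import torch

# The padding value comes from tokenizer.pad_token_id, which the Llama tokenizer leaves unset (None).
pad_token_id = None
labels = [101, 102, 103] + [pad_token_id] * 2   # same padding pattern as the collator
torch.LongTensor(labels)  # TypeError: 'NoneType' object cannot be interpreted as an integer
```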
rajendrac3 commented 1 year ago

I am also facing the same issue. I am running on AWS g5.8xlarge. Were you able to solve it?

rajendrac3 commented 1 year ago

Since LlamaTokenizer does not set tokenizer.pad_token_id, its value is None. When this tokenizer is used to build the padded 'labels', it raises TypeError: 'NoneType' object cannot be interpreted as an integer.
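A quick way to confirm this (the checkpoint id below is a placeholder, not necessarily the one used in the notebook):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder checkpoint
print(tokenizer.pad_token_id)   # None -> padding the label lists inserts None values
print(tokenizer.unk_token_id)   # 0
```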

I assigned the value 0 to tokenizer.pad_token_id and the above error was resolved, but then got another error: Llama.forward() got an unexpected keyword argument 'labels'.

To resolve this, I replaced

```python
model = AutoModel.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True
)
```

with

```python
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True,
    device_map="auto"
)
```

and it worked fine.
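Putting the two changes together, a minimal sketch of the working setup (the checkpoint id is a placeholder and q_config mirrors the notebook's 4-bit BitsAndBytesConfig; adjust both to your run):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_name = "meta-llama/Llama-2-7b-hf"   # placeholder checkpoint id

# 4-bit quantization config, in the spirit of the notebook's q_config
q_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token_id = 0   # Llama ships without a pad id; reuse the <unk> id

# AutoModelForCausalLM returns a model whose forward() accepts `labels`,
# unlike the bare AutoModel used for chatglm2.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True,
    device_map="auto",
)
```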

IshchenkoRoman commented 7 months ago

TL;DR: add tokenizer.pad_token_id = 0 in your code

The main problem is that the FinGPT code, written for chatglm2, relies on pad_token_id, which the Meta Llama model does not set. But if we look a little closer at special_tokens_map.json, we can see the following lines:

  "pad_token": "<unk>",
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }

So as the pad token we can use the id of unk_token, which is equal to 0. To solve the None problem, initialise the tokenizer's pad_token_id field with the value 0: tokenizer.pad_token_id = 0
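If you prefer not to hard-code the id, pointing pad_token at the unk_token entry from special_tokens_map.json gives the same result (placeholder checkpoint id below):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder checkpoint
tokenizer.pad_token = tokenizer.unk_token   # "<unk>", per special_tokens_map.json
assert tokenizer.pad_token_id == 0          # same effect as tokenizer.pad_token_id = 0
```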