Open sudusuperman opened 1 year ago
"The dtype of attention mask (torch.int64) is not bool" is a warning and can be ignored. The error may be caused by something else, such as running out of memory. Add --debug to show the actual error. If it is an out-of-memory error, set --max_new_tokens to a smaller value, such as 64 or 32:
python inference.py --model_type chatglm --instruction "Who are you?" --model_path "LLMs/chatglm/chatglm-6b/" --adapter_weights "output/chatglm" --max_new_tokens 64 --debug
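The advice above (retry with a smaller token budget when memory runs out) can be sketched in Python. This is only an illustration under assumptions: generate_with_fallback and generate_fn are hypothetical names, not part of inference.py, and the sketch assumes the out-of-memory failure surfaces as a RuntimeError whose message contains "out of memory".

```python
def generate_with_fallback(generate_fn, budgets=(256, 64, 32)):
    """Call generate_fn(max_new_tokens=b) for each budget, largest first.

    generate_fn is a hypothetical callable wrapping the real generation call.
    Returns (result, budget_used). Errors that are not out-of-memory errors
    are re-raised unchanged so the real cause stays visible.
    """
    last_error = None
    for budget in budgets:
        try:
            return generate_fn(max_new_tokens=budget), budget
        except RuntimeError as err:
            # Assumption: OOM failures mention "out of memory" in the message.
            if "out of memory" not in str(err).lower():
                raise
            last_error = err
    raise RuntimeError(
        f"out of memory even with max_new_tokens={budgets[-1]}"
    ) from last_error
```

Any other exception, such as a device mismatch, passes through untouched, which matches the intent of running with --debug first to see what actually failed.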
You may want to check how the official example calls model.generate(). I hit the same problem with Qwen; after I replaced the call with the one from the official guidance, it generated the output I needed.
I hit the same inference error after fine-tuning Qwen-7B. The inference output is:
LLM says:
Eval Error
When I add --debug to the command,
python3 inference.py --model_type qwen --instruction "Who are you?" --input "" --model_path $model_path --adapter_weights $output_dir --max_new_tokens 10 --debug
the error message says: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
I am using 8 V100 GPUs.
The error is raised in the model.generate call.
I think the model's device does not match the device of input_ids, but I could not fix it.
As a workaround I have to run export CUDA_VISIBLE_DEVICES=0 so that only one GPU is used. How can I run inference on multiple GPUs?
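One common cause of this message is that the input tensors sit on a different device from the model's first layer. A minimal sketch of moving the inputs before generation follows. It assumes the model exposes parameters() and the inputs expose .to(device), as PyTorch modules and tensors do; first_param_device and move_batch are hypothetical helper names, and this may not be the cause in every setup (for example, if adapter weights were loaded onto a different GPU than the base model).

```python
def first_param_device(model):
    """Return the device of the model's first parameter.

    Assumption: the first parameter belongs to the input embeddings,
    which is where input_ids need to be placed.
    """
    return next(iter(model.parameters())).device


def move_batch(batch, device):
    """Move every tensor-like value in a dict of inputs to the given device."""
    return {
        key: (value.to(device) if hasattr(value, "to") else value)
        for key, value in batch.items()
    }


# Hypothetical usage with a tokenizer and model already loaded:
# inputs = tokenizer(prompt, return_tensors="pt")
# inputs = move_batch(inputs, first_param_device(model))
# output = model.generate(**inputs, max_new_tokens=10)
```

The helpers are duck-typed on purpose, so they make no claim about how inference.py loads the model or shards it across GPUs.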
I'm testing ChatGLM. After following the instructions in README.md, I ran:
python finetune.py --model_type chatglm --data "data/train/" --model_path "LLMs/chatglm/chatglm-6b/" --adapter "lora" --output_dir "output/chatglm"
And then
python inference.py --model_type chatglm --instruction "Who are you?" --model_path "LLMs/chatglm/chatglm-6b/" --adapter_weights "output/chatglm" --max_new_tokens 256
I get:
...
Loading checkpoint shards: 100%|██████████████████| 8/8 [00:04<00:00, 1.73it/s]
Find 1 cases
The dtype of attention mask (torch.int64) is not bool
LLM says: Eval Error
Reproduce:
1 Training data that I'm using:
{"id": "seed_task_0", "name": "breakfast_suggestion", "instruction": "who are you??", "input": "Who are you?", "output": "Yes, you can have 1 oatmeal banana protein shake and 4 strips of bacon. The oatmeal banana protein shake may contain 1/2 cup oatmeal, 60 grams whey protein powder, 1/2 medium banana, 1tbsp flaxseed oil and 1/2 cup watter, totalling about 550 calories. The 4 strips of bacon contains about 200 calories.", "is_classification": false}
{"id": "seed_task_1", "name": "breakfast_suggestion", "instruction": "who are you??", "input": "Who are you?", "output": "Yes, you can have 1 oatmeal banana protein shake and 4 strips of bacon. The oatmeal banana protein shake may contain 1/2 cup oatmeal, 60 grams whey protein powder, 1/2 medium banana, 1tbsp flaxseed oil and 1/2 cup watter, totalling about 550 calories. The 4 strips of bacon contains about 200 calories.", "is_classification": false}
{"id": "seed_task_2", "name": "breakfast_suggestion", "instruction": "who are you??", "input": "Who are you?", "output": "Yes, you can have 1 oatmeal banana protein shake and 4 strips of bacon. The oatmeal banana protein shake may contain 1/2 cup oatmeal, 60 grams whey protein powder, 1/2 medium banana, 1tbsp flaxseed oil and 1/2 cup watter, totalling about 550 calories. The 4 strips of bacon contains about 200 calories.", "is_classification": false}
{"id": "seed_task_3", "name": "breakfast_suggestion", "instruction": "who are you??", "input": "Who are you?", "output": "Yes, you can have 1 oatmeal banana protein shake and 4 strips of bacon. The oatmeal banana protein shake may contain 1/2 cup oatmeal, 60 grams whey protein powder, 1/2 medium banana, 1tbsp flaxseed oil and 1/2 cup watter, totalling about 550 calories. The 4 strips of bacon contains about 200 calories.", "is_classification": false}
2 Some dependency versions differ from requirement.txt:
torch==2.1.0.dev20230830
torchaudio==2.1.0.dev20230830
torchvision==0.16.0.dev20230830
icetk==0.0.4