bin /usr/local/lib/python3.8/dist-packages/bitsandbytes/libbitsandbytes_cuda117.so
/usr/local/lib/python3.8/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/nvidia/lib'), PosixPath('/usr/local/nvidia/lib64')}
warn(msg)
/usr/local/lib/python3.8/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: /usr/local/nvidia/lib:/usr/local/nvidia/lib64 did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
/usr/local/lib/python3.8/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: Found duplicate ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] files: {PosixPath('/usr/local/cuda/lib64/libcudart.so'), PosixPath('/usr/local/cuda/lib64/libcudart.so.11.0')}.. We'll flip a coin and try one of these, in order to fail forward.
Either way, this might cause trouble in the future:
If you get CUDA error: invalid device function errors, the above might be the cause and the solution is to make sure only one ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] in the paths that we search based on your env.
warn(msg)
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.9
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /usr/local/lib/python3.8/dist-packages/bitsandbytes/libbitsandbytes_cuda117.so...
You are loading your model in 8bit or 4bit but no linear modules were found in your model. this can happen for some architectures such as gpt2 that uses Conv1D instead of Linear layers. Please double check your model architecture, or submit an issue on github if you think this is a bug.
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████| 7/7 [00:07<00:00, 1.05s/it]
trainable params: 974,848 || all params: 6,244,558,848 || trainable%: 0.01561115883009451
Found cached dataset json (/root/.cache/huggingface/datasets/json/default-d5629da83678d2e9/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4)
100%|█████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 16.56it/s]
Traceback (most recent call last):
File "train_qlora.py", line 203, in <module>
train(args)
File "train_qlora.py", line 181, in train
train_dataset = get_datset(global_args.train_data_path, tokenizer, global_args)
File "train_qlora.py", line 82, in get_datset
dataset = data['train'].map(lambda example: tokenize_func(example, tokenizer, global_args),
File "/usr/local/lib/python3.8/dist-packages/datasets/arrow_dataset.py", line 578, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/datasets/arrow_dataset.py", line 543, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/datasets/arrow_dataset.py", line 3073, in map
for rank, done, content in Dataset._map_single(**dataset_kwargs):
File "/usr/local/lib/python3.8/dist-packages/datasets/arrow_dataset.py", line 3427, in _map_single
example = apply_function_on_filtered_inputs(example, i, offset=offset)
File "/usr/local/lib/python3.8/dist-packages/datasets/arrow_dataset.py", line 3330, in apply_function_on_filtered_inputs
processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
File "train_qlora.py", line 82, in <lambda>
dataset = data['train'].map(lambda example: tokenize_func(example, tokenizer, global_args),
File "train_qlora.py", line 73, in tokenize_func
question_length = input_ids.index(tokenizer.bos_token_id)
ValueError: None is not in list
What is causing this error after I changed the model to chatGLM2-6B?
python3 train_qlora.py --train_args_json chatGLM_6B_QLoRA.json --model_name_or_path THUDM/chatglm2-6b --train_data_path data/train.jsonl --eval_data_path data/dev.jsonl --lora_rank 4 --lora_dropout 0.05 --compute_dtype fp32
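For reference, the ValueError comes from `tokenize_func` calling `input_ids.index(tokenizer.bos_token_id)`. A likely explanation is that the ChatGLM2-6B tokenizer reports `bos_token_id` as `None` (unlike the tokenizer the script was written for), and `list.index(None)` raises exactly the error shown in the traceback. The sketch below reproduces the failure with a hypothetical `FakeTokenizer` stand-in and shows one defensive rewrite; the names `FakeTokenizer` and `safe_question_length` are illustrative, not from the script:

```python
# Hypothetical stand-in for a tokenizer whose BOS token is undefined,
# as appears to be the case for chatGLM2-6B here.
class FakeTokenizer:
    bos_token_id = None

input_ids = [64790, 64792, 30910]  # example token ids
tok = FakeTokenizer()

# list.index(None) raises ValueError -- the exact error in the traceback.
try:
    question_length = input_ids.index(tok.bos_token_id)
except ValueError as e:
    print(e)  # -> None is not in list

# Defensive variant: fall back when no BOS id is defined or present.
def safe_question_length(input_ids, bos_id):
    if bos_id is not None and bos_id in input_ids:
        return input_ids.index(bos_id)
    return 0  # assumption: no BOS marker means no question prefix to skip

print(safe_question_length(input_ids, None))  # -> 0
```

Whether `0` is the right fallback depends on how the training data is formatted; the real fix may be to locate the question/answer boundary with whatever special tokens the chatGLM2-6B tokenizer actually defines, rather than assuming a BOS token exists.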