SCIR-HI / Huatuo-Llama-Med-Chinese

Repo for BenTsao (本草) [original name: HuaTuo (华驼)], Instruction-tuning Large Language Models with Chinese Medical Knowledge.
Apache License 2.0
4.31k stars 422 forks

Fine-tuning on an A40 fails with the error below, although inference works fine. Does anyone know the cause? #100

Open yihp opened 6 months ago

yihp commented 6 months ago

(huatuo) root@autodl-container-cec311b53c-c2dea304:~/Huatuo-Llama-Med-Chinese-main# bash ./scripts/finetune.sh

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /root/miniconda3/envs/huatuo did not contain libcudart.so as expected! Searching further paths...
  warn(msg)
/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/nvidia/lib'), PosixPath('/usr/local/nvidia/lib64')}
  warn(msg)
/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /usr/local/nvidia/lib:/usr/local/nvidia/lib64 did not contain libcudart.so as expected! Searching further paths...
  warn(msg)
/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('Asia/Shanghai')}
  warn(msg)
/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('http'), PosixPath('8888/jupyter'), PosixPath('//autodl-container-cec311b53c-c2dea304')}
  warn(msg)
/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('http'), PosixPath('37135'), PosixPath('//192.168.1.88')}
  warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 111
CUDA SETUP: Loading binary /root/miniconda3/envs/huatuo/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda111.so...
Training Alpaca-LoRA model with params:
base_model: /root/autodl-tmp/llama-7b
data_path: ./data/llama_data.json
output_dir: ./lora-llama-med-e1
batch_size: 128
micro_batch_size: 128
num_epochs: 10
learning_rate: 0.0003
cutoff_len: 256
val_set_size: 500
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules: ['q_proj', 'v_proj']
train_on_inputs: False
group_by_length: False
wandb_project: llama_med
wandb_run_name: e1
wandb_watch:
wandb_log_model:
resume_from_checkpoint: False
prompt template: med_template

The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.
Loading checkpoint shards: 100%|██████████| 33/33 [00:11<00:00, 2.86it/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Found cached dataset json (file:///root/.cache/huggingface/datasets/json/default-081224d68fbae22b/0.0.0/fe5dd6ea2639a6df622901539cb550cf8797e5a6b2dd7af1cf934bed8e233e6e)
Traceback (most recent call last):
  File "/root/Huatuo-Llama-Med-Chinese-main/finetune.py", line 289, in <module>
    fire.Fire(train)
  File "/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/root/Huatuo-Llama-Med-Chinese-main/finetune.py", line 184, in train
    data = load_dataset("json", data_files=data_path)
  File "/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/datasets/load.py", line 1804, in load_dataset
    ds = builder_instance.as_dataset(split=split, verification_mode=verification_mode, in_memory=keep_in_memory)
  File "/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/datasets/builder.py", line 1108, in as_dataset
    raise NotImplementedError(f"Loading a dataset cached in a {type(self._fs).__name__} is not supported.")
NotImplementedError: Loading a dataset cached in a LocalFileSystem is not supported.

yihp commented 6 months ago

Solved. I upgraded two packages: transformers and datasets.
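
For anyone hitting the same NotImplementedError, here is a minimal sketch for checking which versions are installed before and after upgrading. The thread does not say which versions were involved, so none are pinned here. The helper names (installed_versions, upgrade_command) are made up for illustration, and including fsspec in the check is an assumption: this error is often reported when an older datasets release is paired with a newer fsspec, but that was not confirmed in this issue.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: installed version, or None if missing}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def upgrade_command(packages):
    """Build (but do not run) a pip upgrade command for the current interpreter."""
    return [sys.executable, "-m", "pip", "install", "--upgrade", *packages]


if __name__ == "__main__":
    # transformers and datasets are the two packages upgraded in this thread;
    # fsspec is listed only as a suspected related dependency (assumption).
    for name, version in installed_versions(["transformers", "datasets", "fsspec"]).items():
        print(f"{name}: {version or 'not installed'}")
    print("To upgrade:", " ".join(upgrade_command(["transformers", "datasets"])))
```

Run it inside the same conda environment used for finetune.sh (huatuo in the log above), then run the printed pip command and check the versions again. The script only prints the command, so nothing in the environment changes until you run it yourself.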