Repo for BenTsao (original name: HuaTuo, 华驼): Instruction-tuning Large Language Models with Chinese Medical Knowledge.
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /root/miniconda3/envs/huatuo did not contain libcudart.so as expected! Searching further paths...
warn(msg)
/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/nvidia/lib'), PosixPath('/usr/local/nvidia/lib64')}
warn(msg)
/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /usr/local/nvidia/lib:/usr/local/nvidia/lib64 did not contain libcudart.so as expected! Searching further paths...
warn(msg)
/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('Asia/Shanghai')}
warn(msg)
/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('http'), PosixPath('8888/jupyter'), PosixPath('//autodl-container-cec311b53c-c2dea304')}
warn(msg)
/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('http'), PosixPath('37135'), PosixPath('//192.168.1.88')}
warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 111
CUDA SETUP: Loading binary /root/miniconda3/envs/huatuo/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda111.so...
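The "non-existent directories" warnings above are harmless here: bitsandbytes splits every PATH-like environment value on `:` while hunting for `libcudart.so`, so unrelated variables (a `TZ` value like `Asia/Shanghai`, Jupyter/proxy URLs) leak into its search list; the runtime was still found under `/usr/local/cuda/lib64`. Splitting a variable the same way makes bogus entries easy to spot (the value below is a stand-in for illustration, not taken from this machine):

```shell
# Split a PATH-like value on ':' — one entry per line; anything that is not a
# real directory (time zones, URLs, ports) explains the warnings above.
printf '%s\n' "Asia/Shanghai:/usr/local/cuda/lib64" | tr ':' '\n'
```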
Training Alpaca-LoRA model with params:
base_model: /root/autodl-tmp/llama-7b
data_path: ./data/llama_data.json
output_dir: ./lora-llama-med-e1
batch_size: 128
micro_batch_size: 128
num_epochs: 10
learning_rate: 0.0003
cutoff_len: 256
val_set_size: 500
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules: ['q_proj', 'v_proj']
train_on_inputs: False
group_by_length: False
wandb_project: llama_med
wandb_run_name: e1
wandb_watch:
wandb_log_model:
resume_from_checkpoint: False
prompt template: med_template
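For reference, Alpaca-LoRA-style training scripts typically derive gradient accumulation from the two batch-size parameters above, so with `batch_size` and `micro_batch_size` both 128 each optimizer step is a single micro-batch (a sketch of the convention, not the exact `finetune.py` source):

```python
# Gradient accumulation as Alpaca-LoRA-style scripts usually compute it:
# each optimizer step accumulates batch_size / micro_batch_size micro-batches.
batch_size = 128        # effective (global) batch size from the params above
micro_batch_size = 128  # examples processed per forward/backward pass
gradient_accumulation_steps = batch_size // micro_batch_size
print(gradient_accumulation_steps)  # -> 1, i.e. no gradient accumulation
```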
The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 33/33 [00:11<00:00, 2.86it/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Found cached dataset json (file:///root/.cache/huggingface/datasets/json/default-081224d68fbae22b/0.0.0/fe5dd6ea2639a6df622901539cb550cf8797e5a6b2dd7af1cf934bed8e233e6e)
Traceback (most recent call last):
  File "/root/Huatuo-Llama-Med-Chinese-main/finetune.py", line 289, in <module>
    fire.Fire(train)
  File "/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/root/Huatuo-Llama-Med-Chinese-main/finetune.py", line 184, in train
    data = load_dataset("json", data_files=data_path)
  File "/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/datasets/load.py", line 1804, in load_dataset
    ds = builder_instance.as_dataset(split=split, verification_mode=verification_mode, in_memory=keep_in_memory)
  File "/root/miniconda3/envs/huatuo/lib/python3.10/site-packages/datasets/builder.py", line 1108, in as_dataset
    raise NotImplementedError(f"Loading a dataset cached in a {type(self._fs).__name__} is not supported.")
NotImplementedError: Loading a dataset cached in a LocalFileSystem is not supported.
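This `NotImplementedError` raised in `datasets/builder.py` is widely reported as a version mismatch between the installed `datasets` and `fsspec` packages rather than a problem with `./data/llama_data.json` itself; the fix most often suggested is `pip install -U datasets` (or pinning `fsspec` to a release the installed `datasets` supports). A small stdlib sketch for checking which versions are present (assumes the PyPI package names `datasets` and `fsspec`):

```python
# Print the installed versions of the two packages involved in this error;
# incompatible datasets/fsspec releases are the commonly reported cause of
# "Loading a dataset cached in a LocalFileSystem is not supported".
import importlib.metadata as md

for pkg in ("datasets", "fsspec"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "not installed")
```

If the versions do conflict, upgrading `datasets` and retrying (optionally after clearing the `~/.cache/huggingface/datasets` cache entry mentioned in the log) is the workaround most often reported for this exact traceback.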
(huatuo) root@autodl-container-cec311b53c-c2dea304:~/Huatuo-Llama-Med-Chinese-main# bash ./scripts/finetune.sh
(Re-running the script prints the same bitsandbytes warnings, training parameters, and NotImplementedError traceback as above.)