(lmflow_train) root@duxact:/data/projects/lmflow/LMFlow# ./scripts/run_finetune_with_lisa.sh \
--model_name_or_path /data/guihunmodel8.8B \
--dataset_path /data/projects/lmflow/case_report_data \
--output_model_path /data/projects/lmflow/guihun_fintune_model \
--lisa_activated_layers 1 \
--lisa_interval_steps 20
[2024-05-22 14:32:20,602] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/root/anaconda3/envs/lmflow_train/lib/python3.9/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
Traceback (most recent call last):
File "/data/projects/lmflow/LMFlow/examples/finetune.py", line 61, in
main()
File "/data/projects/lmflow/LMFlow/examples/finetune.py", line 44, in main
model_args, data_args, pipeline_args = parser.parse_args_into_dataclasses()
File "/root/anaconda3/envs/lmflow_train/lib/python3.9/site-packages/transformers/hf_argparser.py", line 339, in parse_args_into_dataclasses
obj = dtype(**inputs)
File "", line 135, in init
File "/root/anaconda3/envs/lmflow_train/lib/python3.9/site-packages/transformers/training_args.py", line 1641, in __post_init
and (self.device.type == "cpu" and not is_torch_greater_or_equal_than_2_3)
File "/root/anaconda3/envs/lmflow_train/lib/python3.9/site-packages/transformers/training_args.py", line 2149, in device
return self._setup_devices
File "/root/anaconda3/envs/lmflow_train/lib/python3.9/site-packages/transformers/utils/generic.py", line 59, in get
cached = self.fget(obj)
File "/root/anaconda3/envs/lmflow_train/lib/python3.9/site-packages/transformers/training_args.py", line 2081, in _setup_devices
self.distributed_state = PartialState(
File "/root/anaconda3/envs/lmflow_train/lib/python3.9/site-packages/accelerate/state.py", line 293, in init__
raise NotImplementedError(
NotImplementedError: Using RTX 4000 series doesn't support faster communication broadband via P2P or IB. Please set NCCL_P2P_DISABLE="1" and NCCL_IB_DISABLE="1" or useaccelerate launch` which will do this automatically.
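As the error message itself suggests, disabling NCCL P2P and IB should get past this check. A minimal sketch, reusing the exact command from the failing run above (paths and flags unchanged):

# Disable P2P/IB communication, which accelerate requires on RTX 4000 series GPUs
export NCCL_P2P_DISABLE=1
export NCCL_IB_DISABLE=1
./scripts/run_finetune_with_lisa.sh \
  --model_name_or_path /data/guihunmodel8.8B \
  --dataset_path /data/projects/lmflow/case_report_data \
  --output_model_path /data/projects/lmflow/guihun_fintune_model \
  --lisa_activated_layers 1 \
  --lisa_interval_steps 20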
Thanks for your interest in LMFlow! We are currently working on full multi-GPU support for LISA. Please stay tuned for our latest updates, and thanks for your understanding 🙏
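Until multi-GPU LISA support lands, one possible interim workaround (my own assumption, not an official LMFlow recommendation) is to pin the run to a single GPU so cross-GPU NCCL communication is never exercised:

# Hypothetical single-GPU workaround: expose only GPU 0 to the training process
export CUDA_VISIBLE_DEVICES=0
export NCCL_P2P_DISABLE=1
export NCCL_IB_DISABLE=1
# Then rerun the same run_finetune_with_lisa.sh command shown above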