modelscope / ms-swift

Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
https://swift.readthedocs.io/zh-cn/latest/Instruction/index.html
Apache License 2.0
4.44k stars, 390 forks

Failed to import swift.llm.sft because of the following error #1898

Closed orzgugu closed 3 months ago

orzgugu commented 3 months ago

Hello, the following error occurs when I try to fine-tune the internvl2-7b-instruct model.

RuntimeError: Failed to import swift.llm.sft because of the following error (look up to see its traceback): Failed to import swift.trainers.trainers because of the following error (look up to see its traceback): the first argument must be callable.

The error is raised by the line from swift.llm import sft_main. What causes this problem? My ms-swift version is 2.5.0.dev0, my transformers version is 4.45.0.dev0, and my torch version is 2.4.0. I look forward to your reply; this is very important to me.

bonre commented 3 months ago

I ran into the same problem. It seems to be caused by pulling the latest version. A script that ran fine last week no longer works either.

tastelikefeet commented 3 months ago

Can you share the command with me?

orzgugu commented 3 months ago

CUDA_VISIBLE_DEVICES=0 swift sft --model_type qwen2-vl-7b-instruct --model_id_or_path /XXX/model/qwen/Qwen2-VL-7B-Instruct --dataset /XXX/pic_text_2w_swift-train-shuff-local.json --output_dir ./

tastelikefeet commented 3 months ago

Weird, I cannot reproduce this. Can you share the full error stack, please?

bonre commented 3 months ago

@tastelikefeet Hi, I am hitting the same error. Here is my script:

NPROC_PER_NODE=8 \
    swift sft \
        --model_type internvl2-8b \
        --model_id_or_path /SWIFT/InternVL/experience_2/checkpoint-16-merged \
        --output_dir /SWIFT/InternVL/experience_2/BestLR_E3_Pissa \
        --add_output_dir_suffix False \
        --dtype 'bf16' \
        --gradient_checkpointing true \
        --num_train_epochs 3 \
        --batch_size 1 \
        --eval_batch_size 1 \
        --gradient_accumulation_steps 16 \
        --sft_type 'lora' \
        --init_lora_weights 'pissa' \
        --lora_rank 128 \
        --lora_alpha 256 \
        --lora_dropout_p 0.1 \
        --use_flash_attn true \
        --dataset /data/0830/internvl2_20240830.json  \
        --val_dataset /data/val/internvl2_20240830_valid.json#-1 \
        --check_dataset_strategy 'warning' \
        --lr_scheduler_type 'cosine' \
        --learning_rate 1e-5 \
        --weight_decay 0.01 \
        --warmup_ratio 0.0075 \
        --max_grad_norm 1 \
        --optim 'adamw_torch' \
        --adam_beta1 0.9 \
        --adam_beta2 0.999 \
        --adam_epsilon 1e-8 \
        --evaluation_strategy 'steps' \
        --eval_steps 64 \
        --save_strategy 'epoch' \
        --max_length -1 \
        --deepspeed zero2-offload \
        --logging_steps 5

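This script still ran fine last week. I just pulled the latest main branch and installed from source, and now the script no longer runs. The error message is as follows: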

Traceback (most recent call last):
  File "/home/workspace/ms-swift/swift/utils/import_utils.py", line 64, in _get_module
    return importlib.import_module('.' + module_name, self.__name__)
  File "/home/anaconda3/envs/ixc2/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 790, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/home/workspace/ms-swift/swift/trainers/trainers.py", line 23, in <module>
    from .push_to_ms import PushToMsHubMixin
  File "/home/workspace/ms-swift/swift/trainers/push_to_ms.py", line 16, in <module>
    class PushToMsHubMixin:
  File "/home/workspace/ms-swift/swift/trainers/push_to_ms.py", line 91, in PushToMsHubMixin
    huggingface_hub.upload_folder = partial(upload_folder, api)
TypeError: the first argument must be callable

There are other similar errors as well; as far as I can tell they are all of this type.

orzgugu commented 3 months ago

[INFO:swift] Successfully registered XX/conda_env/swift/swift/llm/data/dataset_info.json
[INFO:swift] No vLLM installed, if you are using vLLM, you will get ImportError: cannot import name 'get_vllm_engine' from 'swift.llm'
[INFO:swift] No LMDeploy installed, if you are using LMDeploy, you will get ImportError: cannot import name 'prepare_lmdeploy_engine_template' from 'swift.llm'
Traceback (most recent call last):
  File "/XXX/swift/swift/utils/import_utils.py", line 64, in _get_module
    return importlib.import_module('.' + module_name, self.__name__)
  File "/home/pai/envs/XX_swift/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/XXX/conda_env/swift/swift/trainers/trainers.py", line 23, in <module>
    from .push_to_ms import PushToMsHubMixin
  File "/XXX/conda_env/swift/swift/trainers/push_to_ms.py", line 16, in <module>
    class PushToMsHubMixin:
  File "/XXX/conda_env/swift/swift/trainers/push_to_ms.py", line 91, in PushToMsHubMixin
    huggingface_hub.upload_folder = partial(upload_folder, api)
TypeError: the first argument must be callable

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/XXX/conda_env/swift/swift/utils/import_utils.py", line 64, in _get_module
    return importlib.import_module('.' + module_name, self.__name__)
  File "/home/pai/envs/XXX_swift/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/XXX/conda_env/swift/swift/llm/sft.py", line 17, in <module>
    from swift.trainers import Seq2SeqTrainer
  File "<frozen importlib._bootstrap>", line 1055, in _handle_fromlist
  File "/XXX/conda_env/swift/swift/utils/import_utils.py", line 54, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/XXX/conda_env/swift/swift/utils/import_utils.py", line 66, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import swift.trainers.trainers because of the following error (look up to see its traceback): the first argument must be callable

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/XXX/conda_env/swift/swift/cli/sft.py", line 2, in <module>
    from swift.llm import sft_main
  File "<frozen importlib._bootstrap>", line 1055, in _handle_fromlist
  File "/XXX/conda_env/swift/swift/utils/import_utils.py", line 54, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/XXX/conda_env/swift/swift/utils/import_utils.py", line 66, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import swift.llm.sft because of the following error (look up to see its traceback): Failed to import swift.trainers.trainers because of the following error (look up to see its traceback): the first argument must be callable

Jintao-Huang commented 3 months ago

Which versions of Python and transformers are you using?

orzgugu commented 3 months ago

Python 3.9, transformers 4.45.0.dev0
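
For anyone else reporting this error, the versions asked about here can be collected in one go. The following is only a sketch using the standard library's importlib.metadata; the package list is an assumption (huggingface_hub is included because the failing line in the traceback references it), and the helper name collect_versions is hypothetical.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def collect_versions(packages):
    """Map each distribution name to its installed version, or None if it is not installed."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Assumed list of packages relevant to this issue; adjust as needed.
    packages = ["ms-swift", "transformers", "torch", "huggingface_hub"]
    print("python", sys.version.split()[0])
    for name, ver in collect_versions(packages).items():
        print(name, ver or "not installed")
```

Pasting this output alongside the full traceback gives the maintainers the environment details in a single reply.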

tastelikefeet commented 3 months ago

This seems to be a Python version problem. Let me check.

orzgugu commented 3 months ago

May I ask which version of Python you are using?

tastelikefeet commented 3 months ago

Please fetch the latest main branch. I think this problem has been fixed.

bonre commented 3 months ago

Thank you very much for your work! The problem is solved now! :)

TayeeChang commented 3 months ago

I encountered the same problem on the latest main branch. I ran just this single line: from swift.llm import sft_main

The error was: huggingface_hub.upload_folder = partial(upload_folder, api) TypeError: the first argument must be callable

The error occurs because upload_folder is not callable at that point. It may be a bug.
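
The message itself comes from functools.partial, which checks its first argument as soon as it is constructed. That is why the failure surfaces at import time, while the class body is being executed. The sketch below reproduces the message outside of swift; fake_upload_folder and the None placeholder are hypothetical stand-ins, since the thread does not show what upload_folder was actually bound to.

```python
from functools import partial


def make_bound_upload(upload_folder, api):
    """Bind `api` as the first argument, mirroring the shape of the failing line."""
    return partial(upload_folder, api)


def describe_failure(upload_folder, api="api-placeholder"):
    """Return the TypeError message raised by partial(), or None if binding succeeds."""
    try:
        make_bound_upload(upload_folder, api)
    except TypeError as exc:
        return str(exc)
    return None


# Hypothetical stand-in; not the real huggingface_hub function.
def fake_upload_folder(api, folder_path):
    return f"{api} uploads {folder_path}"


# A non-callable first argument fails immediately, before anything is called.
print(describe_failure(None))
# A callable first argument binds without error.
print(describe_failure(fake_upload_folder))
```

Checking callable(upload_folder) just before the partial(...) line in the local checkout is a quick way to confirm whether this is what is happening in a given environment.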

TayeeChang commented 3 months ago

How did you solve it?

bonre commented 3 months ago

Just pull the latest main branch.