command error: ''Adafactor is already registered in optimizer at torch.optim''

monteir03 commented 4 weeks ago

xtuner train cfg-gamma7b/gemma_7b_it_qlora_alpaca_e3_copy.py --deepspeed deepspeed_zero2 10/28 18:19:05 - mmengine - WARNING - WARNING: command error: ''Adafactor is already registered in optimizer at torch.optim''! 10/28 18:19:05 - mmengine - WARNING - Arguments received: ['xtuner', 'train', 'cfg-gamma7b/gemma_7b_it_qlora_alpaca_e3_copy.py', '--deepspeed', 'deepspeed_zero2']. xtuner commands use the following syntax:

    xtuner MODE MODE_ARGS ARGS

    Where   MODE (required) is one of ('list-cfg', 'copy-cfg', 'log-dataset', 'check-custom-dataset', 'train', 'test', 'chat', 'convert', 'preprocess', 'mmbench', 'eval_refcoco')
            MODE_ARG (optional) is the argument for specific mode
            ARGS (optional) are the arguments for specific command

Some usages for xtuner commands: (See more by using -h for specific command!)

    1. List all predefined configs:
        xtuner list-cfg
    2. Copy a predefined config to a given path:
        xtuner copy-cfg $CONFIG $SAVE_FILE
    3-1. Fine-tune LLMs by a single GPU:
        xtuner train $CONFIG
    3-2. Fine-tune LLMs by multiple GPUs:
        NPROC_PER_NODE=$NGPUS NNODES=$NNODES NODE_RANK=$NODE_RANK PORT=$PORT ADDR=$ADDR xtuner dist_train $CONFIG $GPUS
    4-1. Convert the pth model to HuggingFace's model:
        xtuner convert pth_to_hf $CONFIG $PATH_TO_PTH_MODEL $SAVE_PATH_TO_HF_MODEL
    4-2. Merge the HuggingFace's adapter to the pretrained base model:
        xtuner convert merge $LLM $ADAPTER $SAVE_PATH
        xtuner convert merge $CLIP $ADAPTER $SAVE_PATH --is-clip
    4-3. Split HuggingFace's LLM to the smallest sharded one:
        xtuner convert split $LLM $SAVE_PATH
    5-1. Chat with LLMs with HuggingFace's model and adapter:
        xtuner chat $LLM --adapter $ADAPTER --prompt-template $PROMPT_TEMPLATE --system-template $SYSTEM_TEMPLATE
    5-2. Chat with VLMs with HuggingFace's model and LLaVA:
        xtuner chat $LLM --llava $LLAVA --visual-encoder $VISUAL_ENCODER --image $IMAGE --prompt-template $PROMPT_TEMPLATE --system-template $SYSTEM_TEMPLATE
    6-1. Preprocess arxiv dataset:
        xtuner preprocess arxiv $SRC_FILE $DST_FILE --start-date $START_DATE --categories $CATEGORIES
    6-2. Preprocess refcoco dataset:
        xtuner preprocess refcoco --ann-path $RefCOCO_ANN_PATH --image-path $COCO_IMAGE_PATH --save-path $SAVE_PATH
    7-1. Log processed dataset:
        xtuner log-dataset $CONFIG
    7-2. Verify the correctness of the config file for the custom dataset:
        xtuner check-custom-dataset $CONFIG
    8. MMBench evaluation:
        xtuner mmbench $LLM --llava $LLAVA --visual-encoder $VISUAL_ENCODER --prompt-template $PROMPT_TEMPLATE --data-path $MMBENCH_DATA_PATH
    9. Refcoco evaluation:
        xtuner eval_refcoco $LLM --llava $LLAVA --visual-encoder $VISUAL_ENCODER --prompt-template $PROMPT_TEMPLATE --data-path $REFCOCO_DATA_PATH
    10. List all dataset formats which are supported in XTuner

Run special commands:

    xtuner help
    xtuner version

GitHub: https://github.com/InternLM/xtuner

libs used:

datasets 3.0.2 torch 2.5.0 torchvision 0.20.0 transformers 4.46.0 xtuner 0.1.23 peft 0.13.2 numpy 2.1.2 nvidia-cublas-cu12 12.4.5.8 nvidia-cuda-cupti-cu12 12.4.127 nvidia-cuda-nvrtc-cu12 12.4.127 nvidia-cuda-runtime-cu12 12.4.127 nvidia-cudnn-cu12 9.1.0.70 nvidia-cufft-cu12 11.2.1.3 nvidia-curand-cu12 10.3.5.147 nvidia-cusolver-cu12 11.6.1.9 nvidia-cusparse-cu12 12.3.1.170 nvidia-ml-py 12.560.30 nvidia-nccl-cu12 2.21.5 nvidia-nvjitlink-cu12 12.4.127 nvidia-nvtx-cu12 12.4.127

Cy6er7um commented 3 weeks ago

Same.

eliasyin commented 3 weeks ago

downgrage pytorch version to 2.4.1

pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1

reference: https://github.com/InternLM/xtuner/issues/952#issuecomment-2434473868

monteir03 commented 3 weeks ago

downgrage pytorch version to 2.4.1
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1
reference: #952 (comment)

Fixed! Thx

minimum-generated-pig commented 2 days ago

按照 pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 修改后还是一样的报错怎么回事

InternLM / xtuner

command error: ''Adafactor is already registered in optimizer at torch.optim'' #957