Open Dominic23331 opened 4 months ago
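Has this been resolved? You posted quite a lot of information. Could you say which config file you used, what the launch command was, and whether you modified anything?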
07/29 22:44:31 - mmengine - WARNING - Failed to search registry with scope "mmengine" in the "builder" registry tree. As a workaround, the current "builder" registry in "xtuner" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmengine" is a correct scope, or whether the registry is initialized.
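It even hung at this point for 50 minutes without any progress, and with no GPU or VRAM usage. In a conda environment I ran pip install xtuner[deepspeed]
which automatically installed version 0.1.23. I modified the config file, downloaded and extracted the corresponding models, and then ran the training command from Jupyter in VS Code: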
!/home/aistudio/.local/bin/xtuner train \
/home/aistudio/llava/llava_internlm2_chat_1_8b_qlora_clip_vit_large_p14_336_lora_e1_gpu8_finetune_copy.py \
--deepspeed deepspeed_zero2
After that, the warning shown in the first line appeared. This is my output:
/home/aistudio/llava
[2024-07-29 22:44:15,456] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-dev package with apt
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] NVIDIA Inference is only supported on Ampere and newer architectures
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.0
[WARNING] using untested triton version (2.0.0), only 1.0.0 is known to be compatible
[2024-07-29 22:44:23,984] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-dev package with apt
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] NVIDIA Inference is only supported on Ampere and newer architectures
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.0
[WARNING] using untested triton version (2.0.0), only 1.0.0 is known to be compatible
07/29 22:44:28 - mmengine - INFO -
------------------------------------------------------------
System environment:
sys.platform: linux
Python: 3.10.10 (main, Mar 21 2023, 18:45:11) [GCC 11.2.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 1052795933
GPU 0: Tesla V100-SXM2-32GB
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.8, V11.8.89
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
PyTorch: 2.0.1+cu118
PyTorch compiling details: PyTorch built with:
- GCC 9.3
- C++ Version: 201703
- Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- LAPACK is enabled (usually provided by MKL)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 11.8
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
- CuDNN 8.9.6
- Built with CuDNN 8.7
- Magma 2.6.1
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.8, CUDNN_VERSION=8.7.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,
TorchVision: 0.15.2+cu118
OpenCV: 4.10.0
MMEngine: 0.10.4
07/29 22:44:30 - mmengine - INFO - Config: SYSTEM = '' accumulative_counts = 1 batch_size = 1 betas = ( 0.9, 0.999, ) custom_hooks = [ dict( tokenizer=dict( padding_side='right', pretrained_model_name_or_path='/home/aistudio/data/internlm2-1_8b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.engine.hooks.DatasetInfoHook'), dict( evaluation_images='https://llava-vl.github.io/static/images/view.jpg', evaluation_inputs=[ 'Please describe this picture', 'What is the equipment in the image?', ], every_n_iters=500, image_processor=dict( pretrained_model_name_or_path= '/home/aistudio/.cache/modelscope/hub/AI-ModelScope/clip-vit-large-patch14-336', trust_remote_code=True, type='transformers.CLIPImageProcessor.from_pretrained'), prompt_template='xtuner.utils.PROMPT_TEMPLATE.internlm2_chat', system='', tokenizer=dict( padding_side='right', pretrained_model_name_or_path='/home/aistudio/data/internlm2-1_8b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.engine.hooks.EvaluateChatHook'), ] data_path = '/home/aistudio/llava/llava_data/repeated_data.json' data_root = '/home/aistudio/llava/llava_data/' dataloader_num_workers = 4 default_hooks = dict( checkpoint=dict( by_epoch=False, interval=500, max_keep_ckpts=2, type='mmengine.hooks.CheckpointHook'), logger=dict( interval=10, log_metric_by_epoch=False, type='mmengine.hooks.LoggerHook'), param_scheduler=dict(type='mmengine.hooks.ParamSchedulerHook'), sampler_seed=dict(type='mmengine.hooks.DistSamplerSeedHook'), timer=dict(type='mmengine.hooks.IterTimerHook')) env_cfg = dict( cudnn_benchmark=False, dist_cfg=dict(backend='nccl'), mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0)) evaluation_freq = 500 evaluation_images = 'https://llava-vl.github.io/static/images/view.jpg' evaluation_inputs = [ 'Please describe this picture', 'What is the equipment in the image?', ] image_folder = '/home/aistudio/llava/llava_data/' image_processor = dict( 
pretrained_model_name_or_path= '/home/aistudio/.cache/modelscope/hub/AI-ModelScope/clip-vit-large-patch14-336', trust_remote_code=True, type='transformers.CLIPImageProcessor.from_pretrained') launcher = 'none' llava_dataset = dict( data_path='/home/aistudio/llava/llava_data/repeated_data.json', dataset_map_fn='xtuner.dataset.map_fns.llava_map_fn', image_folder='/home/aistudio/llava/llava_data/', image_processor=dict( pretrained_model_name_or_path= '/home/aistudio/.cache/modelscope/hub/AI-ModelScope/clip-vit-large-patch14-336', trust_remote_code=True, type='transformers.CLIPImageProcessor.from_pretrained'), max_length=1472, pad_image_to_square=True, template_map_fn=dict( template='xtuner.utils.PROMPT_TEMPLATE.internlm2_chat', type='xtuner.dataset.map_fns.template_map_fn_factory'), tokenizer=dict( padding_side='right', pretrained_model_name_or_path='/home/aistudio/data/internlm2-1_8b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.dataset.LLaVADataset') llm_name_or_path = '/home/aistudio/data/internlm2-1_8b' load_from = None log_level = 'INFO' log_processor = dict(by_epoch=False) lr = 0.0002 max_epochs = 1 max_length = 1472 max_norm = 1 model = dict( freeze_llm=True, freeze_visual_encoder=True, llm=dict( pretrained_model_name_or_path='/home/aistudio/data/internlm2-1_8b', quantization_config=dict( bnb_4bit_compute_dtype='torch.float16', bnb_4bit_quant_type='nf4', bnb_4bit_use_double_quant=True, llm_int8_has_fp16_weight=False, llm_int8_threshold=6.0, load_in_4bit=True, load_in_8bit=False, type='transformers.BitsAndBytesConfig'), torch_dtype='torch.float16', trust_remote_code=True, type='transformers.AutoModelForCausalLM.from_pretrained'), llm_lora=dict( bias='none', lora_alpha=256, lora_dropout=0.05, r=512, task_type='CAUSAL_LM', type='peft.LoraConfig'), pretrained_pth='/home/aistudio/llava/iter_2181.pth', type='xtuner.model.LLaVAModel', visual_encoder=dict( pretrained_model_name_or_path= 
'/home/aistudio/.cache/modelscope/hub/AI-ModelScope/clip-vit-large-patch14-336', type='transformers.CLIPVisionModel.from_pretrained'), visual_encoder_lora=dict( bias='none', lora_alpha=16, lora_dropout=0.05, r=64, type='peft.LoraConfig')) optim_type = 'torch.optim.AdamW' optim_wrapper = dict( optimizer=dict( betas=( 0.9, 0.999, ), lr=0.0002, type='torch.optim.AdamW', weight_decay=0), type='DeepSpeedOptimWrapper') param_scheduler = [ dict( begin=0, by_epoch=True, convert_to_iter_based=True, end=0.03, start_factor=1e-05, type='mmengine.optim.LinearLR'), dict( begin=0.03, by_epoch=True, convert_to_iter_based=True, end=1, eta_min=0.0, type='mmengine.optim.CosineAnnealingLR'), ] pretrained_pth = '/home/aistudio/llava/iter_2181.pth' prompt_template = 'xtuner.utils.PROMPT_TEMPLATE.internlm2_chat' randomness = dict(deterministic=False, seed=None) resume = False runner_type = 'FlexibleRunner' save_steps = 500 save_total_limit = 2 strategy = dict( config=dict( bf16=dict(enabled=False), fp16=dict(enabled=True, initial_scale_power=16), gradient_accumulation_steps='auto', gradient_clipping='auto', train_micro_batch_size_per_gpu='auto', zero_allow_untested_optimizer=True, zero_force_ds_cpu_optimizer=False, zero_optimization=dict(overlap_comm=True, stage=2)), exclude_frozen_parameters=True, gradient_accumulation_steps=1, gradient_clipping=1, sequence_parallel_size=1, train_micro_batch_size_per_gpu=1, type='xtuner.engine.DeepSpeedStrategy') tokenizer = dict( padding_side='right', pretrained_model_name_or_path='/home/aistudio/data/internlm2-1_8b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained') train_cfg = dict(max_epochs=1, type='xtuner.engine.runner.TrainLoop') train_dataloader = dict( batch_size=1, collate_fn=dict(type='xtuner.dataset.collate_fns.default_collate_fn'), dataset=dict( data_path='/home/aistudio/llava/llava_data/repeated_data.json', dataset_map_fn='xtuner.dataset.map_fns.llava_map_fn', image_folder='/home/aistudio/llava/llava_data/', 
image_processor=dict( pretrained_model_name_or_path= '/home/aistudio/.cache/modelscope/hub/AI-ModelScope/clip-vit-large-patch14-336', trust_remote_code=True, type='transformers.CLIPImageProcessor.from_pretrained'), max_length=1472, pad_image_to_square=True, template_map_fn=dict( template='xtuner.utils.PROMPT_TEMPLATE.internlm2_chat', type='xtuner.dataset.map_fns.template_map_fn_factory'), tokenizer=dict( padding_side='right', pretrained_model_name_or_path='/home/aistudio/data/internlm2-1_8b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.dataset.LLaVADataset'), num_workers=4, pin_memory=True, sampler=dict( length_property='modality_length', per_device_batch_size=1, type='xtuner.dataset.samplers.LengthGroupedSampler')) visual_encoder_name_or_path = '/home/aistudio/.cache/modelscope/hub/AI-ModelScope/clip-vit-large-patch14-336' visualizer = None warmup_ratio = 0.03 weight_decay = 0 work_dir = './work_dirs/llava_internlm2_chat_1_8b_qlora_clip_vit_large_p14_336_lora_e1_gpu8_finetune_copy'
07/29 22:44:31 - mmengine - WARNING - Failed to search registry with scope "mmengine" in the "builder" registry tree. As a workaround, the current "builder" registry in "xtuner" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmengine" is a correct scope, or whether the registry is initialized.
^C
Traceback (most recent call last):
File "/home/aistudio/.local/lib/python3.10/site-packages/xtuner/tools/train.py", line 360, in
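- I tried the following in Jupyter:
- reinstalling xtuner;
- installing openmim;
- re-running xtuner using its absolute path;
- running xtuner with a %% magic command;
- using `import os
os.environ['PATH'] += ':/home/aistudio/.local/bin'`
- None of these had any effect.
- Finally, I removed the `deepspeed` option from the command, that is, I ran `!/home/aistudio/.local/bin/xtuner train /home/aistudio/llava/llava_internlm2_chat_1_8b_qlora_clip_vit_large_p14_336_lora_e1_gpu8_finetune_copy.py`, and training ran normally!
I ran into the same problem and solved it by changing the training launch command.
- I ran into the same error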
07/29 22:44:31 - mmengine - WARNING - Failed to search registry with scope "mmengine" in the "builder" registry tree. As a workaround, the current "builder" registry in "xtuner" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmengine" is a correct scope, or whether the registry is initialized.
It even hung at this point for 50 minutes without any progress, and with no GPU or VRAM usage.
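interlm2-1_8b
whereas the template was the chat template. I have since corrected that, but it is unrelated to this bug. deepspeed
. To discuss whether xtuner
's deepspeed
is at fault, or whether related configuration or other dependencies caused this bug. deepspeed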
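, which completely solved the problem.
So far this does not look like a case of insufficient VRAM, and I can start DeepSpeed training normally on two T4s. I suspect a problem with the DeepSpeed installation. Could you try running the ds_report command to check for errors? If that command reports nothing wrong, you could try running one of the official DeepSpeed example scripts, such as DeepSpeed_CIFAR, to verify that DeepSpeed starts correctly.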
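The check suggested above can also be run from a notebook cell. The following is only a minimal sketch: the helper name `run_if_available` is hypothetical, and it assumes `ds_report` is the console command installed with DeepSpeed. It first checks whether the executable is visible on `PATH`, which is a separate failure mode from the report itself showing errors.

```python
import shutil
import subprocess


def run_if_available(command):
    """Run `command` (a list of strings) if its executable is on PATH.

    Returns the process exit code, or None when the executable is not found.
    `run_if_available` is a hypothetical helper, not part of xtuner or DeepSpeed.
    """
    executable = shutil.which(command[0])
    if executable is None:
        print(f"{command[0]} not found on PATH")
        return None
    completed = subprocess.run([executable, *command[1:]], check=False)
    return completed.returncode


if __name__ == "__main__":
    # ds_report prints DeepSpeed's environment and op compatibility report.
    run_if_available(["ds_report"])
```

A `None` result means the notebook kernel cannot see the command at all, in which case the `PATH` of the kernel process is worth inspecting before reinstalling anything.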
- I ran this official check script and found that ninja was not being detected. After I added the following in Jupyter:
import os
os.environ['PATH'] += ':/home/aistudio/.local/bin'
# for ninja
os.environ['PATH'] += ':/home/aistudio/.local/lib/python3.10/site-packages/ninja/data/bin'
training with deepspeed ran completely normally! Here is the command:
!/home/aistudio/.local/bin/xtuner train \
/home/aistudio/llava/llava_internlm2_chat_1_8b_qlora_clip_vit_large_p14_336_lora_e1_gpu8_finetune_copy.py \
--deepspeed deepspeed_zero2
Attached are the lines printed just before and after the ETA output appears:
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore `gradient_checkpointing_kwargs` in case you passed it).Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method `_set_gradient_checkpointing` in your model.
07/30 13:58:09 - mmengine - WARNING - "FileClient" will be deprecated in future. Please use io functions in https://mmengine.readthedocs.io/en/latest/api/fileio.html#file-io
07/30 13:58:09 - mmengine - WARNING - "HardDiskBackend" is the alias of "LocalBackend" and the former will be deprecated in future.
07/30 13:58:09 - mmengine - INFO - Checkpoints will be saved to /home/aistudio/llava/work_dirs/llava_internlm2_chat_1_8b_qlora_clip_vit_large_p14_336_lora_e1_gpu8_finetune_copy.
[2024-07-30 13:58:10,244] [INFO] [loss_scaler.py:190:update_scale] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 65536, but hysteresis is 2. Reducing hysteresis to 1
[2024-07-30 13:58:10,904] [INFO] [loss_scaler.py:183:update_scale] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 65536, reducing to 32768
07/30 13:58:17 - mmengine - INFO - Iter(train) [ 10/1200] lr: 5.1430e-05 eta: 0:15:16 time: 0.7705 data_time: 0.0138 memory: 18114 loss: 1.3578
07/30 13:58:24 - mmengine - INFO - Iter(train) [ 20/1200] lr: 1.0857e-04 eta: 0:15:06 time: 0.7654 data_time: 0.0144 memory: 18113 loss: 0.5315
07/30 13:58:32 - mmengine - INFO - Iter(train) [ 30/1200] lr: 1.6571e-04 eta: 0:15:00 time: 0.7736 data_time: 0.0147 memory: 18113 loss: 0.3846
07/30 13:58:39 - mmengine - INFO - Iter(train) [ 40/1200] lr: 2.0000e-04 eta: 0:14:42 time: 0.7338 data_time: 0.0147 memory: 18113 loss: 0.5993
07/30 13:58:47 - mmengine - INFO - Iter(train) [ 50/1200] lr: 1.9994e-04 eta: 0:14:22 time: 0.7087 data_time: 0.0144 memory: 18113 loss: 0.5120
07/30 13:58:54 - mmengine - INFO - Iter(train) [ 60/1200] lr: 1.9981e-04 eta: 0:14:20 time: 0.7770 data_time: 0.0145 memory: 18113 loss: 0.2459
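The PATH fix described in this comment can be sketched as a small helper for a notebook cell. This is an illustration only: `add_to_path` is a hypothetical name, and the two directories are the ones quoted in the comment, so they will differ on other machines. Like the original snippet it appends to `PATH`, but it skips directories that are already listed, so re-running the cell does not keep growing the variable.

```python
import os
import shutil


def add_to_path(directory, env=None):
    """Append `directory` to PATH unless it is already listed.

    `add_to_path` is a hypothetical helper; `env` defaults to os.environ.
    Returns the resulting PATH string.
    """
    env = os.environ if env is None else env
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        parts.append(directory)
    env["PATH"] = os.pathsep.join(parts)
    return env["PATH"]


if __name__ == "__main__":
    # Directories taken from the comment above; adjust for your environment.
    for d in (
        "/home/aistudio/.local/bin",
        "/home/aistudio/.local/lib/python3.10/site-packages/ninja/data/bin",
    ):
        add_to_path(d)
    # shutil.which returns None when the executable cannot be found on PATH.
    print("ninja:", shutil.which("ninja"))
```

Because `!command` cells inherit the kernel's environment, changing `os.environ['PATH']` before launching `xtuner train` is what makes the change visible to the training process.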
During training, the program stopped after printing the following output. How can I resolve this?
`2024-05-15 09:29:44.939294: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-05-15 09:29:44.939347: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-05-15 09:29:44.940554: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
[2024-05-15 09:29:49,373] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
2024-05-15 09:30:12.273661: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-05-15 09:30:12.273709: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-05-15 09:30:12.274819: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
[2024-05-15 09:30:16,168] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
05/15 09:30:19 - mmengine - INFO -
System environment: sys.platform: linux Python: 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] CUDA available: True MUSA available: False numpy_random_seed: 1102040617 GPU 0: B1.gpu.medium CUDA_HOME: /usr/local/cuda NVCC: Cuda compilation tools, release 12.2, V12.2.140 GCC: x86_64-linux-gnu-gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 PyTorch: 2.1.0a0+32f93b1 PyTorch compiling details: PyTorch built with:
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.2, CUDNN_VERSION=8.9.5, CXX_COMPILER=/opt/rh/gcc-toolset-11/root/usr/bin/c++, CXX_FLAGS=-fno-gnu-unique -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-invalid-partial-specialization -Wno-unused-private-field -Wno-aligned-allocation-unavailable -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.1.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,
TorchVision: 0.16.0a0 OpenCV: 4.7.0 MMEngine: 0.10.4
Runtime environment: cudnn_benchmark: False mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0} dist_cfg: {'backend': 'nccl'} seed: 1102040617 deterministic: False Distributed launcher: none Distributed training: False GPU number: 1
05/15 09:30:19 - mmengine - INFO - Config: SYSTEM = 'xtuner.utils.SYSTEM_TEMPLATE.alpaca' accumulative_counts = 16 alpaca_en = dict( dataset=dict(path='./alpaca', type='datasets.load_dataset'), dataset_map_fn='xtuner.dataset.map_fns.alpaca_map_fn', max_length=2048, pack_to_max_length=True, remove_unused_columns=True, shuffle_before_pack=True, template_map_fn=dict( template='xtuner.utils.PROMPT_TEMPLATE.chatglm3', type='xtuner.dataset.map_fns.template_map_fn_factory'), tokenizer=dict( encode_special_tokens=True, padding_side='left', pretrained_model_name_or_path='/gemini/pretrain', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.dataset.process_hf_dataset', use_varlen_attn=False) alpaca_en_path = './alpaca' batch_size = 1 betas = ( 0.9, 0.999, ) custom_hooks = [ dict( tokenizer=dict( encode_special_tokens=True, padding_side='left', pretrained_model_name_or_path='/gemini/pretrain', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.engine.hooks.DatasetInfoHook'), dict( evaluation_inputs=[ '请给我介绍五个上海的景点', 'Please tell me five scenic spots in Shanghai', ], every_n_iters=500, prompt_template='xtuner.utils.PROMPT_TEMPLATE.chatglm3', system='xtuner.utils.SYSTEM_TEMPLATE.alpaca', tokenizer=dict( encode_special_tokens=True, padding_side='left', pretrained_model_name_or_path='/gemini/pretrain', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.engine.hooks.EvaluateChatHook'), ] dataloader_num_workers = 0 default_hooks = dict( checkpoint=dict( by_epoch=False, interval=500, max_keep_ckpts=2, type='mmengine.hooks.CheckpointHook'), logger=dict( interval=10, log_metric_by_epoch=False, type='mmengine.hooks.LoggerHook'), param_scheduler=dict(type='mmengine.hooks.ParamSchedulerHook'), sampler_seed=dict(type='mmengine.hooks.DistSamplerSeedHook'), timer=dict(type='mmengine.hooks.IterTimerHook')) env_cfg = dict( cudnn_benchmark=False, dist_cfg=dict(backend='nccl'), 
mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0)) evaluation_freq = 500 evaluation_inputs = [ '请给我介绍五个上海的景点', 'Please tell me five scenic spots in Shanghai', ] launcher = 'none' load_from = None log_level = 'INFO' log_processor = dict(by_epoch=False) lr = 0.0002 max_epochs = 3 max_length = 2048 max_norm = 1 model = dict( llm=dict( pretrained_model_name_or_path='/gemini/pretrain', quantization_config=dict( bnb_4bit_compute_dtype='torch.float16', bnb_4bit_quant_type='nf4', bnb_4bit_use_double_quant=True, llm_int8_has_fp16_weight=False, llm_int8_threshold=6.0, load_in_4bit=True, load_in_8bit=False, type='transformers.BitsAndBytesConfig'), torch_dtype='torch.float16', trust_remote_code=True, type='transformers.AutoModelForCausalLM.from_pretrained'), lora=dict( bias='none', lora_alpha=16, lora_dropout=0.1, r=64, task_type='CAUSAL_LM', type='peft.LoraConfig'), type='xtuner.model.SupervisedFinetune', use_varlen_attn=False) optim_type = 'torch.optim.AdamW' optim_wrapper = dict( accumulative_counts=16, clip_grad=dict(error_if_nonfinite=False, max_norm=1), dtype='float16', loss_scale='dynamic', optimizer=dict( betas=( 0.9, 0.999, ), lr=0.0002, type='torch.optim.AdamW', weight_decay=0), type='mmengine.optim.AmpOptimWrapper') pack_to_max_length = True param_scheduler = [ dict( begin=0, by_epoch=True, convert_to_iter_based=True, end=0.09, start_factor=1e-05, type='mmengine.optim.LinearLR'), dict( begin=0.09, by_epoch=True, convert_to_iter_based=True, end=3, eta_min=0.0, type='mmengine.optim.CosineAnnealingLR'), ] pretrained_model_name_or_path = '/gemini/pretrain' prompt_template = 'xtuner.utils.PROMPT_TEMPLATE.chatglm3' randomness = dict(deterministic=False, seed=None) resume = False save_steps = 500 save_total_limit = 2 tokenizer = dict( encode_special_tokens=True, padding_side='left', pretrained_model_name_or_path='/gemini/pretrain', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained') train_cfg = dict(max_epochs=3, 
type='xtuner.engine.runner.TrainLoop') train_dataloader = dict( batch_size=1, collate_fn=dict( type='xtuner.dataset.collate_fns.default_collate_fn', use_varlen_attn=False), dataset=dict( dataset=dict(path='./alpaca', type='datasets.load_dataset'), dataset_map_fn='xtuner.dataset.map_fns.alpaca_map_fn', max_length=2048, pack_to_max_length=True, remove_unused_columns=True, shuffle_before_pack=True, template_map_fn=dict( template='xtuner.utils.PROMPT_TEMPLATE.chatglm3', type='xtuner.dataset.map_fns.template_map_fn_factory'), tokenizer=dict( encode_special_tokens=True, padding_side='left', pretrained_model_name_or_path='/gemini/pretrain', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.dataset.process_hf_dataset', use_varlen_attn=False), num_workers=0, sampler=dict(shuffle=True, type='mmengine.dataset.DefaultSampler')) use_varlen_attn = False visualizer = None warmup_ratio = 0.03 weight_decay = 0 work_dir = './work_dirs/chatglm3_6b_base_qlora_alpaca_e3_copy'
quantization_config convert to <class 'transformers.utils.quantization_config.BitsAndBytesConfig'>
05/15 09:30:19 - mmengine - WARNING - Failed to search registry with scope "mmengine" in the "builder" registry tree. As a workaround, the current "builder" registry in "xtuner" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmengine" is a correct scope, or whether the registry is initialized.
low_cpu_mem_usage was None, now set to True since model is quantized.`