wsl ubuntu 22.04：xtuner train : Error out of memory at line 380 in file /mmfs1/gscratch/zlab/timdettmers/git/bitsandbytes/csrc/pythonInterface.c

WSL Ubuntu 22.04

内存：56G 显卡：RTX 3080Ti 16G+RTX 4090 24G

conda activate xtuner0.1.9

cd /home/zhanghui/ft-oasst1

xtuner train ./internlm_chat_7b_qlora_oasst1_e3_copy.py --deepspeed deepspeed_zero2

(base) zhanghui@zhanghui:~$ conda activate xtuner0.1.9 cd ~/ft-oasst1 (xtuner0.1.9) zhanghui@zhanghui:~/ft-oasst1$ xtuner train ./internlm_chat_7b_qlora_oasst1_e3_copy.py --deepspeed deepspeed_zero2 [2024-01-16 02:14:21,512] [INFO] [real_accelerator.py:161:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2024-01-16 02:14:23,833] [INFO] [real_accelerator.py:161:get_accelerator] Setting ds_accelerator to cuda (auto detect) 01/16 02:14:24 - mmengine - INFO -

System environment: sys.platform: linux Python: 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0] CUDA available: True numpy_random_seed: 1005804663 GPU 0: NVIDIA GeForce RTX 3080 Laptop GPU CUDA_HOME: /usr/local/cuda NVCC: Cuda compilation tools, release 11.6, V11.6.124 GCC: gcc (Ubuntu 9.5.0-1ubuntu1~22.04) 9.5.0 PyTorch: 2.1.2+cu121 PyTorch compiling details: PyTorch built with:

GCC 9.3
C++ Version: 201703
Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v3.1.1 (Git Hash 64f6bcbcbab628e96f33a62c3e975f8535a7bde4)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX512
CUDA Runtime 12.1
NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
CuDNN 8.5 (built against CUDA 11.7)
- Built with CuDNN 8.9.2
Magma 2.6.1
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-invalid-partial-specialization -Wno-unused-private-field -Wno-aligned-allocation-unavailable -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.1.2, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

OpenCV: 4.9.0 MMEngine: 0.10.2

Runtime environment: launcher: none randomness: {'seed': None, 'deterministic': False} cudnn_benchmark: False mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0} dist_cfg: {'backend': 'nccl'} seed: None deterministic: False Distributed launcher: none Distributed training: False GPU number: 1

01/16 02:14:24 - mmengine - INFO - Config: SYSTEM = '' accumulative_counts = 16 batch_size = 1 betas = ( 0.9, 0.999, ) custom_hooks = [ dict( tokenizer=dict( padding_side='right', pretrained_model_name_or_path='./internlm-chat-7b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.engine.DatasetInfoHook'), dict( evaluation_inputs=[ '请给我介绍五个上海的景点', 'Please tell me five scenic spots in Shanghai', ], every_n_iters=500, prompt_template='xtuner.utils.PROMPT_TEMPLATE.internlm_chat', system='', tokenizer=dict( padding_side='right', pretrained_model_name_or_path='./internlm-chat-7b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.engine.EvaluateChatHook'), ] data_path = './openassistant-guanaco' dataloader_num_workers = 0 default_hooks = dict( checkpoint=dict(interval=1, type='mmengine.hooks.CheckpointHook'), logger=dict(interval=10, type='mmengine.hooks.LoggerHook'), param_scheduler=dict(type='mmengine.hooks.ParamSchedulerHook'), sampler_seed=dict(type='mmengine.hooks.DistSamplerSeedHook'), timer=dict(type='mmengine.hooks.IterTimerHook')) env_cfg = dict( cudnn_benchmark=False, dist_cfg=dict(backend='nccl'), mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0)) evaluation_freq = 500 evaluation_inputs = [ '请给我介绍五个上海的景点', 'Please tell me five scenic spots in Shanghai', ] launcher = 'none' load_from = None log_level = 'INFO' lr = 0.0002 max_epochs = 3 max_length = 2048 max_norm = 1 model = dict( llm=dict( pretrained_model_name_or_path='./internlm-chat-7b', quantization_config=dict( bnb_4bit_compute_dtype='torch.float16', bnb_4bit_quant_type='nf4', bnb_4bit_use_double_quant=True, llm_int8_has_fp16_weight=False, llm_int8_threshold=6.0, load_in_4bit=True, load_in_8bit=False, type='transformers.BitsAndBytesConfig'), torch_dtype='torch.float16', trust_remote_code=True, type='transformers.AutoModelForCausalLM.from_pretrained'), lora=dict( bias='none', lora_alpha=16, lora_dropout=0.1, r=64, task_type='CAUSAL_LM', type='peft.LoraConfig'), type='xtuner.model.SupervisedFinetune') optim_type = 'bitsandbytes.optim.PagedAdamW32bit' optim_wrapper = dict( optimizer=dict( betas=( 0.9, 0.999, ), lr=0.0002, type='bitsandbytes.optim.PagedAdamW32bit', weight_decay=0), type='DeepSpeedOptimWrapper') pack_to_max_length = True param_scheduler = dict( T_max=3, by_epoch=True, convert_to_iter_based=True, eta_min=0.0, type='mmengine.optim.CosineAnnealingLR') pretrained_model_name_or_path = './internlm-chat-7b' prompt_template = 'xtuner.utils.PROMPT_TEMPLATE.internlm_chat' randomness = dict(deterministic=False, seed=None) resume = False runner_type = 'FlexibleRunner' strategy = dict( config=dict( bf16=dict(enabled=True), fp16=dict(enabled=False, initial_scale_power=16), gradient_accumulation_steps='auto', gradient_clipping='auto', train_micro_batch_size_per_gpu='auto', zero_allow_untested_optimizer=True, zero_force_ds_cpu_optimizer=False, zero_optimization=dict(overlap_comm=True, stage=2)), exclude_frozen_parameters=True, gradient_accumulation_steps=16, gradient_clipping=1, train_micro_batch_size_per_gpu=1, type='DeepSpeedStrategy') tokenizer = dict( padding_side='right', pretrained_model_name_or_path='./internlm-chat-7b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained') train_cfg = dict(by_epoch=True, max_epochs=3, val_interval=1) train_dataloader = dict( batch_size=1, collate_fn=dict(type='xtuner.dataset.collate_fns.default_collate_fn'), dataset=dict( dataset=dict( path='./openassistant-guanaco', type='datasets.load_dataset'), dataset_map_fn='xtuner.dataset.map_fns.oasst1_map_fn', max_length=2048, pack_to_max_length=True, remove_unused_columns=True, shuffle_before_pack=True, template_map_fn=dict( template='xtuner.utils.PROMPT_TEMPLATE.internlm_chat', type='xtuner.dataset.map_fns.template_map_fn_factory'), tokenizer=dict( padding_side='right', pretrained_model_name_or_path='./internlm-chat-7b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.dataset.process_hf_dataset'), num_workers=0, sampler=dict(shuffle=True, type='mmengine.dataset.DefaultSampler')) train_dataset = dict( dataset=dict(path='./openassistant-guanaco', type='datasets.load_dataset'), dataset_map_fn='xtuner.dataset.map_fns.oasst1_map_fn', max_length=2048, pack_to_max_length=True, remove_unused_columns=True, shuffle_before_pack=True, template_map_fn=dict( template='xtuner.utils.PROMPT_TEMPLATE.internlm_chat', type='xtuner.dataset.map_fns.template_map_fn_factory'), tokenizer=dict( padding_side='right', pretrained_model_name_or_path='./internlm-chat-7b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.dataset.process_hf_dataset') visualizer = None weight_decay = 0 work_dir = './work_dirs/internlm_chat_7b_qlora_oasst1_e3_copy'

01/16 02:14:25 - mmengine - WARNING - Failed to search registry with scope "mmengine" in the "builder" registry tree. As a workaround, the current "builder" registry in "xtuner" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmengine" is a correct scope, or whether the registry is initialized. 01/16 02:14:25 - mmengine - INFO - Hooks will be executed in the following order: before_run: (VERY_HIGH ) RuntimeInfoHook (BELOW_NORMAL) LoggerHook

before_train: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook (NORMAL ) DatasetInfoHook (NORMAL ) EvaluateChatHook (VERY_LOW ) CheckpointHook

before_train_epoch: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook (NORMAL ) DistSamplerSeedHook

before_train_iter: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook

after_train_iter: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook (NORMAL ) EvaluateChatHook (BELOW_NORMAL) LoggerHook (LOW ) ParamSchedulerHook (VERY_LOW ) CheckpointHook

after_train_epoch: (NORMAL ) IterTimerHook (LOW ) ParamSchedulerHook (VERY_LOW ) CheckpointHook

before_val: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) DatasetInfoHook

before_val_epoch: (NORMAL ) IterTimerHook

before_val_iter: (NORMAL ) IterTimerHook

after_val_iter: (NORMAL ) IterTimerHook (BELOW_NORMAL) LoggerHook

after_val_epoch: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook (BELOW_NORMAL) LoggerHook (LOW ) ParamSchedulerHook (VERY_LOW ) CheckpointHook

after_val: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) EvaluateChatHook

after_train: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) EvaluateChatHook (VERY_LOW ) CheckpointHook

before_test: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) DatasetInfoHook

before_test_epoch: (NORMAL ) IterTimerHook

before_test_iter: (NORMAL ) IterTimerHook

after_test_iter: (NORMAL ) IterTimerHook (BELOW_NORMAL) LoggerHook

after_test_epoch: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook (BELOW_NORMAL) LoggerHook

after_test: (VERY_HIGH ) RuntimeInfoHook

after_run: (BELOW_NORMAL) LoggerHook

Repo card metadata block was not found. Setting CardData to empty. Flattening the indices: 100%|████████████████████████████████████████████████████████████████| 9846/9846 [00:00<00:00, 32808.04 examples/s]Map: Map: 100%|████████████████████████████████████████████████████████████████████████████████████| 9846/9846 [00:02<00:00, 3370.91 examples/s] 01/16 02:14:29 - mmengine - WARNING - Dataset Dataset has no metainfo. dataset_meta in visualizer will be None. quantization_config convert to <class 'transformers.utils.quantization_config.BitsAndBytesConfig'> Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████| 8/8 [00:11<00:00, 1.43s/it] 01/16 02:14:41 - mmengine - INFO - dispatch internlm attn forward 01/16 02:14:41 - mmengine - WARNING - Due to the implementation of the PyTorch version of flash attention, even when the output_attentions flag is set to True, it is not possible to return the attn_weights. [2024-01-16 02:14:42,820] [INFO] [logging.py:96:log_dist] [Rank -1] DeepSpeed info: version=0.12.6, git-hash=unknown, git-branch=unknown [2024-01-16 02:14:42,820] [INFO] [comm.py:637:init_distributed] cdb=None [2024-01-16 02:14:42,820] [INFO] [comm.py:652:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2024-01-16 02:14:42,839] [INFO] [comm.py:702:mpi_discovery] Discovered MPI settings of world_rank=0, local_rank=0, world_size=1, master_addr=172.21.53.212, master_port=29500 [2024-01-16 02:14:42,839] [INFO] [comm.py:668:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2024-01-16 02:14:43,535] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False [2024-01-16 02:14:43,538] [INFO] [logging.py:96:log_dist] [Rank 0] Using client Optimizer as basic optimizer [2024-01-16 02:14:43,538] [INFO] [logging.py:96:log_dist] [Rank 0] Removing param_group that has no 'params' in the basic Optimizer [2024-01-16 02:14:43,598] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Basic Optimizer = PagedAdamW32bit [2024-01-16 02:14:43,599] [INFO] [utils.py:56:is_zero_supported_optimizer] Checking ZeRO support for optimizer=PagedAdamW32bit type=<class 'bitsandbytes.optim.adamw.PagedAdamW32bit'> [2024-01-16 02:14:43,599] [WARNING] [engine.py:1166:_do_optimizer_sanity_check] ** You are using ZeRO with an untested optimizer, proceed with caution *** [2024-01-16 02:14:43,599] [INFO] [logging.py:96:log_dist] [Rank 0] Creating torch.bfloat16 ZeRO stage 2 optimizer [2024-01-16 02:14:43,599] [INFO] [stage_1_and_2.py:148:init] Reduce bucket size 500,000,000 [2024-01-16 02:14:43,599] [INFO] [stage_1_and_2.py:149:init] Allgather bucket size 500,000,000 [2024-01-16 02:14:43,599] [INFO] [stage_1_and_2.py:150:init] CPU Offload: False [2024-01-16 02:14:43,599] [INFO] [stage_1_and_2.py:151:init] Round robin gradient partitioning: False [2024-01-16 02:14:43,936] [INFO] [utils.py:791:see_memory_usage] Before initializing optimizer states [2024-01-16 02:14:43,936] [INFO] [utils.py:792:see_memory_usage] MA 5.63 GB Max_MA 5.93 GB CA 6.32 GB Max_CA 6 GB [2024-01-16 02:14:43,937] [INFO] [utils.py:799:see_memory_usage] CPU Virtual Memory: used = 2.49 GB, percent = 4.5% Error out of memory at line 380 in file /mmfs1/gscratch/zlab/timdettmers/git/bitsandbytes/csrc/pythonInterface.c

同样的代码在裸机ubuntu 22.04上即便是32G内存也能正常运行。

如果去掉deepspeed加速：

会在跑了10个后也报内存不足： xtuner train ./internlm_chat_7b_qlora_oasst1_e3_copy.py

(xtuner0.1.9) zhanghui@zhanghui:~/ft-oasst1$ xtuner train ./internlm_chat_7b_qlora_oasst1_e3_copy.py [2024-01-16 02:25:22,354] [INFO] [real_accelerator.py:161:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2024-01-16 02:25:24,512] [INFO] [real_accelerator.py:161:get_accelerator] Setting ds_accelerator to cuda (auto detect) 01/16 02:25:25 - mmengine - INFO -

System environment: sys.platform: linux Python: 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0] CUDA available: True numpy_random_seed: 346509095 GPU 0: NVIDIA GeForce RTX 3080 Laptop GPU CUDA_HOME: /usr/local/cuda NVCC: Cuda compilation tools, release 11.6, V11.6.124 GCC: gcc (Ubuntu 9.5.0-1ubuntu1~22.04) 9.5.0 PyTorch: 2.1.2+cu121 PyTorch compiling details: PyTorch built with:

GCC 9.3
C++ Version: 201703
Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v3.1.1 (Git Hash 64f6bcbcbab628e96f33a62c3e975f8535a7bde4)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX512
CUDA Runtime 12.1
NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
CuDNN 8.5 (built against CUDA 11.7)
- Built with CuDNN 8.9.2
Magma 2.6.1
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-invalid-partial-specialization -Wno-unused-private-field -Wno-aligned-allocation-unavailable -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.1.2, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

OpenCV: 4.9.0 MMEngine: 0.10.2

Runtime environment: cudnn_benchmark: False mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0} dist_cfg: {'backend': 'nccl'} seed: 346509095 deterministic: False Distributed launcher: none Distributed training: False GPU number: 1

01/16 02:25:25 - mmengine - INFO - Config: SYSTEM = '' accumulative_counts = 16 batch_size = 1 betas = ( 0.9, 0.999, ) custom_hooks = [ dict( tokenizer=dict( padding_side='right', pretrained_model_name_or_path='./internlm-chat-7b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.engine.DatasetInfoHook'), dict( evaluation_inputs=[ '请给我介绍五个上海的景点', 'Please tell me five scenic spots in Shanghai', ], every_n_iters=500, prompt_template='xtuner.utils.PROMPT_TEMPLATE.internlm_chat', system='', tokenizer=dict( padding_side='right', pretrained_model_name_or_path='./internlm-chat-7b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.engine.EvaluateChatHook'), ] data_path = './openassistant-guanaco' dataloader_num_workers = 0 default_hooks = dict( checkpoint=dict(interval=1, type='mmengine.hooks.CheckpointHook'), logger=dict(interval=10, type='mmengine.hooks.LoggerHook'), param_scheduler=dict(type='mmengine.hooks.ParamSchedulerHook'), sampler_seed=dict(type='mmengine.hooks.DistSamplerSeedHook'), timer=dict(type='mmengine.hooks.IterTimerHook')) env_cfg = dict( cudnn_benchmark=False, dist_cfg=dict(backend='nccl'), mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0)) evaluation_freq = 500 evaluation_inputs = [ '请给我介绍五个上海的景点', 'Please tell me five scenic spots in Shanghai', ] launcher = 'none' load_from = None log_level = 'INFO' lr = 0.0002 max_epochs = 3 max_length = 2048 max_norm = 1 model = dict( llm=dict( pretrained_model_name_or_path='./internlm-chat-7b', quantization_config=dict( bnb_4bit_compute_dtype='torch.float16', bnb_4bit_quant_type='nf4', bnb_4bit_use_double_quant=True, llm_int8_has_fp16_weight=False, llm_int8_threshold=6.0, load_in_4bit=True, load_in_8bit=False, type='transformers.BitsAndBytesConfig'), torch_dtype='torch.float16', trust_remote_code=True, type='transformers.AutoModelForCausalLM.from_pretrained'), lora=dict( bias='none', lora_alpha=16, lora_dropout=0.1, r=64, task_type='CAUSAL_LM', type='peft.LoraConfig'), type='xtuner.model.SupervisedFinetune') optim_type = 'bitsandbytes.optim.PagedAdamW32bit' optim_wrapper = dict( accumulative_counts=16, clip_grad=dict(error_if_nonfinite=False, max_norm=1), dtype='float16', loss_scale='dynamic', optimizer=dict( betas=( 0.9, 0.999, ), lr=0.0002, type='bitsandbytes.optim.PagedAdamW32bit', weight_decay=0), type='mmengine.optim.AmpOptimWrapper') pack_to_max_length = True param_scheduler = dict( T_max=3, by_epoch=True, convert_to_iter_based=True, eta_min=0.0, type='mmengine.optim.CosineAnnealingLR') pretrained_model_name_or_path = './internlm-chat-7b' prompt_template = 'xtuner.utils.PROMPT_TEMPLATE.internlm_chat' randomness = dict(deterministic=False, seed=None) resume = False tokenizer = dict( padding_side='right', pretrained_model_name_or_path='./internlm-chat-7b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained') train_cfg = dict(by_epoch=True, max_epochs=3, val_interval=1) train_dataloader = dict( batch_size=1, collate_fn=dict(type='xtuner.dataset.collate_fns.default_collate_fn'), dataset=dict( dataset=dict( path='./openassistant-guanaco', type='datasets.load_dataset'), dataset_map_fn='xtuner.dataset.map_fns.oasst1_map_fn', max_length=2048, pack_to_max_length=True, remove_unused_columns=True, shuffle_before_pack=True, template_map_fn=dict( template='xtuner.utils.PROMPT_TEMPLATE.internlm_chat', type='xtuner.dataset.map_fns.template_map_fn_factory'), tokenizer=dict( padding_side='right', pretrained_model_name_or_path='./internlm-chat-7b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.dataset.process_hf_dataset'), num_workers=0, sampler=dict(shuffle=True, type='mmengine.dataset.DefaultSampler')) train_dataset = dict( dataset=dict(path='./openassistant-guanaco', type='datasets.load_dataset'), dataset_map_fn='xtuner.dataset.map_fns.oasst1_map_fn', max_length=2048, pack_to_max_length=True, remove_unused_columns=True, shuffle_before_pack=True, template_map_fn=dict( template='xtuner.utils.PROMPT_TEMPLATE.internlm_chat', type='xtuner.dataset.map_fns.template_map_fn_factory'), tokenizer=dict( padding_side='right', pretrained_model_name_or_path='./internlm-chat-7b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.dataset.process_hf_dataset') visualizer = None weight_decay = 0 work_dir = './work_dirs/internlm_chat_7b_qlora_oasst1_e3_copy'

quantization_config convert to <class 'transformers.utils.quantization_config.BitsAndBytesConfig'> 01/16 02:25:25 - mmengine - WARNING - Failed to search registry with scope "mmengine" in the "builder" registry tree. As a workaround, the current "builder" registry in "xtuner" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmengine" is a correct scope, or whether the registry is initialized. Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████| 8/8 [00:05<00:00, 1.44it/s] 01/16 02:25:31 - mmengine - INFO - dispatch internlm attn forward 01/16 02:25:31 - mmengine - WARNING - Due to the implementation of the PyTorch version of flash attention, even when the `output_attentions` flag is set to True, it is not possible to return the `attn_weights`. 01/16 02:25:33 - mmengine - INFO - Distributed training is not used, all SyncBatchNorm (SyncBN) layers in the model will be automatically reverted to BatchNormXd layers if they are used. 01/16 02:25:33 - mmengine - INFO - Hooks will be executed in the following order: before_run: (VERY_HIGH ) RuntimeInfoHook (BELOW_NORMAL) LoggerHook

before_train: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook (NORMAL ) DatasetInfoHook (NORMAL ) EvaluateChatHook (VERY_LOW ) CheckpointHook

before_train_epoch: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook (NORMAL ) DistSamplerSeedHook

before_train_iter: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook

after_train_iter: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook (NORMAL ) EvaluateChatHook (BELOW_NORMAL) LoggerHook (LOW ) ParamSchedulerHook (VERY_LOW ) CheckpointHook

after_train_epoch: (NORMAL ) IterTimerHook (LOW ) ParamSchedulerHook (VERY_LOW ) CheckpointHook

before_val: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) DatasetInfoHook

before_val_epoch: (NORMAL ) IterTimerHook

before_val_iter: (NORMAL ) IterTimerHook

after_val_iter: (NORMAL ) IterTimerHook (BELOW_NORMAL) LoggerHook

after_val_epoch: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook (BELOW_NORMAL) LoggerHook (LOW ) ParamSchedulerHook (VERY_LOW ) CheckpointHook

after_val: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) EvaluateChatHook

after_train: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) EvaluateChatHook (VERY_LOW ) CheckpointHook

before_test: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) DatasetInfoHook

before_test_epoch: (NORMAL ) IterTimerHook

before_test_iter: (NORMAL ) IterTimerHook

after_test_iter: (NORMAL ) IterTimerHook (BELOW_NORMAL) LoggerHook

after_test_epoch: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook (BELOW_NORMAL) LoggerHook

after_test: (VERY_HIGH ) RuntimeInfoHook

after_run: (BELOW_NORMAL) LoggerHook

Repo card metadata block was not found. Setting CardData to empty. Flattening the indices: 100%|█████████████████████████████████████████████████████████████████████| 9846/9846 [00:00<00:00, 32582.97 examples/s] Map: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 9846/9846 [00:02<00:00, 3398.94 examples/s] 01/16 02:25:37 - mmengine - WARNING - Dataset Dataset has no metainfo. dataset_meta in visualizer will be None. 01/16 02:25:37 - mmengine - INFO - Num train samples 2180 01/16 02:25:37 - mmengine - INFO - train example: 01/16 02:25:37 - mmengine - INFO - |User|:¿Quién es Ada Lovelace? |Bot|:Ada Lovelace (1815-1852) fue una matemática y escritora británica, considerada la primera programadora de la historia. Fue hija del poeta Lord Byron (1788-1824) y de la matemática Anna Isabella Noel Byron (1792-1860)

Desde temprana edad, Ada Lovelace mostró una gran habilidad para las matemáticas, y su madre la educó en este campo para alejarla de la influencia de su padre, que había abandonado a la familia cuando ella tenía solo un mes de edad.

En 1843, Ada Lovelace trabajó con el matemático y científico Charles Babbage en el diseño de la "Máquina Analítica", una máquina capaz de realizar cálculos matemáticos complejos mediante la programación de tarjetas perforadas. Ada Lovelace fue la encargada de escribir un algoritmo para calcular los números de Bernoulli utilizando la Máquina Analítica, lo que la convierte en la primera programadora de la historia.

La importancia de Ada Lovelace radica en que fue la primera persona en darse cuenta de que la Máquina Analítica no solo podía realizar cálculos matemáticos, sino que también podía procesar información no numérica, lo que la convertiría en una herramienta de gran utilidad para la creación de todo tipo de programas y aplicaciones.

Además de su trabajo en matemáticas y programación, Ada Lovelace también fue una destacada escritora y poeta, y se interesó por temas como la filosofía, la ciencia y la tecnología. Falleció a los 36 años debido a un cáncer uterino. |User|:What is Fulgurite and where can it be found? Is it valuable? |Bot|:Fulgurites are a rare type of natural glass that forms from the fusion of silica sand or soil with high heat generated by lightning strikes. It is essentially a tube-like structure that has a glassy exterior and a porous interior. Fulgurites can be found in areas with sandy soils and can occur anywhere in the world where lightning strikes. They are most commonly found in deserts and beach areas.

In terms of value, fulgurites can be considered valuable to mineral collectors and enthusiasts due to their unique formation and rarity. Their value can range widely depending on factors such as size, shape, and the overall quality of the specimen. However, fulgurites are not considered valuable in terms of their monetary value, as they are not commonly used in industrial or commercial applications. |User|:Compose a professional email with the following points:

Me chinese cook 10 years Good good cook People love my food Can you hire me? <|Bot|>:Thanks for the information. Unfortunately, your initial message does not contain enough details or structure to compose a compelling professional email to a potential employer. Here are some suggestions for how we can improve the message:

1) Let's first clarify the type of cooking position you are interested in. Are you looking for a job as a line cook, sous chef, executive chef, or something else? This will help determine how to emphasize relevant experience and skills.

2) What specific experience do you have? For example, "Over 10 years of experience preparing authentic Sichuan and Cantonese cuisine." Highlight any relevant skills, like knife skills, cooking techniques, menu development, etc. Mention the types of cuisine you specialize in.

3) Have you worked in a professional kitchen before? If so, mention the name and type of restaurant (upscale, casual dining, etc.). Emphasize relevant responsibilities and accomplishments. If not, that's fine too, just be clear that you're looking to transition into a professional kitchen and highlight related skills and passion for cooking.

4) Do you have any certifications or training? Note any culinary degrees, certificates, or other credentials. Even informal training can show your dedication to the craft.

5) Are you looking for full-time, part-time, temporary, or contract work? Be specific about the type of opportunity you're seeking.

With the answers to these questions, we can craft an email in a standard business letter format. The opening can express your interest in the position and summarize your relevant experience and qualifications. The body can go into more depth on your experience, skills, and goals. The closing can reiterate your interest and mention that you'd welcome the opportunity to further discuss the role.

Does this help? Let me know if you would like help crafting the full email. I'm happy to assist you in composing a professional and compelling message to potential employers.|User|:Me podrías indicar los habitantes que tiene España en el año 2023? |Bot|:La población de España al 12 de febrero de 2023 contabilizaba 46444832. Es decir, casi 46 millones y medio de habitantes. Sin embargo esto es un dato aproximado. |User|:¿Cuantos días tiene un año bisiesto? |Bot|:Un año bisiesto posee 366 días, esto un día más que un año regular, ya que en los años bisiestos el mes de febrero posee un día más, dando como resultado esta diferencia.|User|:Necesito crear en javascript un algoritmo y lo embeba en un html llamado index.html que emule el juego tamagochi. Que el mismo tenga imágenes del personaje. Se le pueda poner un nombre y que permita darle de comer y cuidarlo. |Bot|:Supongo que lo que quieres es que te explique cómo hacerlo, así que aquí tienes una explicación de cómo podrías hacerlo:

1.-Si lo que quieres es hacer una mascota virtual, a la que haya que dar de comer, deberías crear distintas variables que representen distintos aspectos de ésta, como su sueño, su hambre, y hacer una distinta para cada propiedad que tenga tu imitación de tamagochi. 2.-Para empezar, deberás hacer que con el tiempo se vaya sumando o restando una constante a las variables que definan el estado de la mascota, como por ejemplo, que cada minuto aumente su hambre en un 1%. 3.-También deberás hacer que el jugador pueda realizar distintas acciones, que fijen una variable o le añadan un valor. Un ejemplo sería, hacer que se le pueda poner a dormir para que cada minuto que pase se le baje el sueño en un 2%.

Si necesitas que te ayude con cualquier otra cosa, pídemelo y te responderé lo mejor que pueda. |User|:Quais são as características de um bom jogador de poker |Bot|:Um bom jogador de poker possui diversas habilidades e características que o ajudam a ter sucesso no jogo. Algumas das mais importantes incluem:

Paciência: Um bom jogador sabe esperar pelas melhores oportunidades e não se precipita em jogadas arriscadas sem necessidade.
Disciplina: Seguir uma estratégia consistente e saber quando desistir de uma mão é crucial para o sucesso a longo prazo.
Habilidade matemática: Entender as probabilidades e a matemática envolvida no poker é essencial para tomar decisões informadas.
Leitura de adversários: Ser capaz de avaliar as ações e reações dos outros jogadores na mesa, prever suas jogadas e identificar possíveis blefes.
Gerenciamento de banca: Um bom jogador sabe gerenciar seu dinheiro e não se arrisca a perder tudo em uma única sessão de jogo.
Controle emocional: Manter a calma e o equilíbrio emocional em situações adversas é fundamental para tomar decisões racionais e não se deixar levar pela frustração ou euforia.
Adaptabilidade: Um jogador de poker bem-sucedido deve ser capaz de ajustar sua estrat 01/16 02:25:37 - mmengine - INFO - before_train in EvaluateChatHook. 01/16 02:25:42 - mmengine - INFO - Sample output: |User|:请给我介绍五个上海的景点 |Bot|:1. 上海博物馆：这是一座集收藏、展示、研究为一体的综合性博物馆，展示了上海的历史、文化和艺术品。
上海外滩：这是上海的标志性景点，可以欣赏到黄浦江两岸的美丽景色，

01/16 02:25:46 - mmengine - INFO - Sample output: |User|:Please tell me five scenic spots in Shanghai |Bot:1. The Bund: A famous waterfront promenade that offers stunning views of the city's skyline and the Huangpu River.

Yu Garden: A traditional Chinese garden that dates back to the Ming Dynasty, featuring beautiful pavil

01/16 02:25:46 - mmengine - WARNING - "FileClient" will be deprecated in future. Please use io functions in https://mmengine.readthedocs.io/en/latest/api/fileio.html#file-io 01/16 02:25:46 - mmengine - WARNING - "HardDiskBackend" is the alias of "LocalBackend" and the former will be deprecated in future. 01/16 02:25:46 - mmengine - INFO - Checkpoints will be saved to /home/zhanghui/ft-oasst1/work_dirs/internlm_chat_7b_qlora_oasst1_e3_copy. /home/zhanghui/anaconda3/envs/xtuner0.1.9/lib/python3.10/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants. warnings.warn( /home/zhanghui/anaconda3/envs/xtuner0.1.9/lib/python3.10/site-packages/mmengine/optim/scheduler/param_scheduler.py:198: UserWarning: Detected call of scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the parameter value schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate warnings.warn( 01/16 02:26:44 - mmengine - INFO - Epoch(train) [1][ 10/2180] lr: 2.0000e-04 eta: 10:31:38 time: 5.8037 data_time: 0.0043 memory: 11667 loss: 1.4122 Error out of memory at line 380 in file /mmfs1/gscratch/zlab/timdettmers/git/bitsandbytes/csrc/pythonInterface.c (xtuner0.1.9) zhanghui@zhanghui:~/ft-oasst1$

InternLM / xtuner