Hello, I was trying to run instruction tuning of CodeT5+ and ran into this issue. The error message is:
(cjy_ct5) nlpir@nlpir-SYS-4028GR-TR:~/cjy/CodeT5/CodeT5+$ sh instruct_finetune.sh
Using CUDA version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_18:49:52_PDT_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0
[2024-05-28 20:40:23,112] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-dev package with apt
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] NVIDIA Inference is only supported on Ampere and newer architectures
[WARNING] please install triton==1.0.0 if you want to use sparse attention
[2024-05-28 20:40:25,339] [WARNING] [runner.py:202:fetch_hostfile] Unable to find hostfile, will proceed with training with local resources only.
[2024-05-28 20:40:25,339] [INFO] [runner.py:568:main] cmd = /home/nlpir/miniconda3/envs/cjy_ct5/bin/python -u -m deepspeed.launcher.launch --world_info=eyJsb2NhbGhvc3QiOiBbNiwgN119 --master_addr=127.0.0.1 --master_port=29500 --enable_each_rank_log=None instruct_tune_codet5p.py --load baselines/codet5p-220m --save-dir saved_models/instructcodet5p-220m --instruct-data-path datasets/code_alpaca_20k.json --fp16 --deepspeed deepspeed_config.json
[2024-05-28 20:40:26,653] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-dev package with apt
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] NVIDIA Inference is only supported on Ampere and newer architectures
[WARNING] please install triton==1.0.0 if you want to use sparse attention
[2024-05-28 20:40:28,850] [INFO] [launch.py:146:main] WORLD INFO DICT: {'localhost': [6, 7]}
[2024-05-28 20:40:28,850] [INFO] [launch.py:152:main] nnodes=1, num_local_procs=2, node_rank=0
[2024-05-28 20:40:28,850] [INFO] [launch.py:163:main] global_rank_mapping=defaultdict(<class 'list'>, {'localhost': [0, 1]})
[2024-05-28 20:40:28,850] [INFO] [launch.py:164:main] dist_world_size=2
[2024-05-28 20:40:28,850] [INFO] [launch.py:168:main] Setting CUDA_VISIBLE_DEVICES=6,7
[2024-05-28 20:40:28,860] [INFO] [launch.py:256:main] process 2307758 spawned with command: ['/home/nlpir/miniconda3/envs/cjy_ct5/bin/python', '-u', 'instruct_tune_codet5p.py', '--local_rank=0', '--load', 'baselines/codet5p-220m', '--save-dir', 'saved_models/instructcodet5p-220m', '--instruct-data-path', 'datasets/code_alpaca_20k.json', '--fp16', '--deepspeed', 'deepspeed_config.json']
[2024-05-28 20:40:28,867] [INFO] [launch.py:256:main] process 2307759 spawned with command: ['/home/nlpir/miniconda3/envs/cjy_ct5/bin/python', '-u', 'instruct_tune_codet5p.py', '--local_rank=1', '--load', 'baselines/codet5p-220m', '--save-dir', 'saved_models/instructcodet5p-220m', '--instruct-data-path', 'datasets/code_alpaca_20k.json', '--fp16', '--deepspeed', 'deepspeed_config.json']
{'batch_size_per_replica': 1,
'cache_data': 'cache_data/instructions',
'data_num': -1,
'deepspeed': 'deepspeed_config.json',
'epochs': 3,
'fp16': True,
'grad_acc_steps': 16,
'instruct_data_path': 'datasets/code_alpaca_20k.json',
'load': 'baselines/codet5p-220m',
'local_rank': 1,
'log_freq': 10,
'lr': 2e-05,
'lr_warmup_steps': 30,
'max_len': 512,
'save_dir': 'saved_models/instructcodet5p-220m',
'save_freq': 500}
==> Loaded 20022 samples
{'batch_size_per_replica': 1,
'cache_data': 'cache_data/instructions',
'data_num': -1,
'deepspeed': 'deepspeed_config.json',
'epochs': 3,
'fp16': True,
'grad_acc_steps': 16,
'instruct_data_path': 'datasets/code_alpaca_20k.json',
'load': 'baselines/codet5p-220m',
'local_rank': 0,
'log_freq': 10,
'lr': 2e-05,
'lr_warmup_steps': 30,
'max_len': 512,
'save_dir': 'saved_models/instructcodet5p-220m',
'save_freq': 500}
==> Loaded 20022 samples
==> Loaded model from baselines/codet5p-220m, model size 222882048
Para before freezing: 222882048, trainable para: 223M
Traceback (most recent call last):
File "/home/nlpir/cjy/CodeT5/CodeT5+/instruct_tune_codet5p.py", line 210, in <module>
main(args)
File "/home/nlpir/cjy/CodeT5/CodeT5+/instruct_tune_codet5p.py", line 177, in main
freeze_decoder_except_xattn_codegen(model)
File "/home/nlpir/cjy/CodeT5/CodeT5+/instruct_tune_codet5p.py", line 42, in freeze_decoder_except_xattn_codegen
num_decoder_layers = model.decoder.config.n_layer
File "/home/nlpir/miniconda3/envs/cjy_ct5/lib/python3.9/site-packages/transformers/configuration_utils.py", line 257, in __getattribute__
return super().__getattribute__(key)
AttributeError: 'T5Config' object has no attribute 'n_layer'
==> Loaded model from baselines/codet5p-220m, model size 222882048
Para before freezing: 222882048, trainable para: 223M
Traceback (most recent call last):
File "/home/nlpir/cjy/CodeT5/CodeT5+/instruct_tune_codet5p.py", line 210, in <module>
main(args)
File "/home/nlpir/cjy/CodeT5/CodeT5+/instruct_tune_codet5p.py", line 177, in main
freeze_decoder_except_xattn_codegen(model)
File "/home/nlpir/cjy/CodeT5/CodeT5+/instruct_tune_codet5p.py", line 42, in freeze_decoder_except_xattn_codegen
num_decoder_layers = model.decoder.config.n_layer
File "/home/nlpir/miniconda3/envs/cjy_ct5/lib/python3.9/site-packages/transformers/configuration_utils.py", line 257, in __getattribute__
return super().__getattribute__(key)
AttributeError: 'T5Config' object has no attribute 'n_layer'
[2024-05-28 20:40:32,872] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 2307758
[2024-05-28 20:40:32,873] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 2307759
[2024-05-28 20:40:32,905] [ERROR] [launch.py:325:sigkill_handler] ['/home/nlpir/miniconda3/envs/cjy_ct5/bin/python', '-u', 'instruct_tune_codet5p.py', '--local_rank=1', '--load', 'baselines/codet5p-220m', '--save-dir', 'saved_models/instructcodet5p-220m', '--instruct-data-path', 'datasets/code_alpaca_20k.json', '--fp16', '--deepspeed', 'deepspeed_config.json'] exits with return code = 1
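For context on where the crash happens: line 42 of `instruct_tune_codet5p.py` reads `model.decoder.config.n_layer`, which is a CodeGen-style config attribute, but the 220m checkpoint loads a `T5Config`, which names its decoder depth `num_decoder_layers` instead. A minimal sketch of a workaround, assuming this is the only mismatch (the helper name `get_num_decoder_layers` is my own, not from the repo):

```python
# Hypothetical helper (not part of the CodeT5+ repo): read the decoder
# depth from whichever attribute the loaded config actually defines.
def get_num_decoder_layers(config):
    # CodeGen-style configs (the 2B+ checkpoints) expose `n_layer`;
    # T5-style configs (codet5p-220m/770m) expose `num_decoder_layers`.
    for attr in ("n_layer", "num_decoder_layers", "num_layers"):
        if hasattr(config, attr):
            return getattr(config, attr)
    raise AttributeError("config defines no known decoder-depth attribute")
```

With something like this, `freeze_decoder_except_xattn_codegen` could call `get_num_decoder_layers(model.decoder.config)` instead of accessing `n_layer` directly; treat it as a sketch to check against the actual config of your checkpoint, not a confirmed fix.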
The content of my "instruct_finetune.sh" file is:

#!/bin/bash

export PATH=/usr/local/cuda-11.7/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64:$LD_LIBRARY_PATH

echo "Using CUDA version:"
nvcc --version

MODEL_PATH="baselines/codet5p-220m"
SAVE_DIR="saved_models/instructcodet5p-220m"
DATA_PATH="datasets/code_alpaca_20k.json"

deepspeed --include localhost:6,7 instruct_tune_codet5p.py \
    --load $MODEL_PATH --save-dir $SAVE_DIR --instruct-data-path $DATA_PATH \
    --fp16 --deepspeed deepspeed_config.json
Could you please tell me what the problem is and how to solve it? Thank you!