Hi,
I got the following error:
FileNotFoundError: [Errno 2] No such file or directory: '/data/qbao775/PrefixTuning/gpt2/e2e_results_conv2/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1_test_gold'
when I run this command:
CUDA_VISIBLE_DEVICES=0 python train_e2e.py --optim_prefix yes --preseqlen 5 --epoch 5 --learning_rate 0.00008 --mode data2text --bsz 10 --seed 101 --tuning_mode prefixtune --cache_dir ./cache
Has anyone run into this issue, or does anyone know how to deal with it? Thank you so much. The full log is below:
Training completed. Do not forget to share your model on huggingface.co/models =)
10/15/2021 20:14:10 - INFO - trainer_prefix - Saving model checkpoint to save_e2e_models_convcheck/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1
10/15/2021 20:14:11 - INFO - __main__ - *** Evaluate ***
10/15/2021 20:14:11 - INFO - trainer_prefix - ***** Running Evaluation *****
10/15/2021 20:14:11 - INFO - trainer_prefix - Num examples = 42061
10/15/2021 20:14:11 - INFO - trainer_prefix - Batch size = 10
False
False
{'eval_loss': 25.165123616772462, 'epoch': 5.0, 'total_flos': 2514722051589120, 'step': 21035}
10/15/2021 20:18:41 - INFO - __main__ - ***** Eval results *****
10/15/2021 20:18:41 - INFO - __main__ - perplexity = 25.165123616772462
running evaluation on /data/qbao775/PrefixTuning/gpt2/save_e2e_models_convcheck/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1
suggested code:
python gen.py data2text yes valid /data/qbao775/PrefixTuning/gpt2/save_e2e_models_convcheck/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1 no
python gen.py data2text yes test /data/qbao775/PrefixTuning/gpt2/save_e2e_models_convcheck/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1 no
python run_generation.py --model_type=gpt2 --length 100 --model_name_or_path=gpt2-medium --num_return_sequences 5 --stop_token [EOS] --tokenizer_name=/data/qbao775/PrefixTuning/gpt2/save_e2e_models_convcheck/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1 --task_mode=data2text --control_mode=yes --tuning_mode prefixtune --gen_dir e2e_results_conv2 --eval_dataset valid --optim_prefix no --preseqlen 20 --prefix_mode activation --format_mode cat --prefixModel_name_or_path /data/qbao775/PrefixTuning/gpt2/save_e2e_models_convcheck/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1 --cache_dir ./cache/gpt2-medium-s3
10/15/2021 20:18:42 - WARNING - __main__ - device: cuda, n_gpu: 1, 16-bits training: False
loading from PrefixTuning. /data/qbao775/PrefixTuning/gpt2/save_e2e_models_convcheck/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1
loading the trained tokenizer
Using pad_token, but it is not set yet.
50257 <|endoftext|> <|endoftext|> None
50256
<|endoftext|>
None
<|endoftext|> 50256
50257 <|endoftext|> <|endoftext|> <|endoftext|>
GPT2Config {
"_my_arg_task_mode": "data2text",
"_my_arg_tune_mode": "prefixtune",
"_objective_mode": 2,
"activation_function": "gelu_new",
"architectures": [
"GPT2LMHeadModel"
],
"attn_pdrop": 0.1,
"bos_token_id": 50256,
"embd_pdrop": 0.1,
"eos_token_id": 50256,
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"model_type": "gpt2",
"n_ctx": 1024,
"n_embd": 1024,
"n_head": 16,
"n_inner": null,
"n_layer": 24,
"n_positions": 1024,
"n_special": 0,
"predict_special_tokens": true,
"resid_pdrop": 0.1,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"task_specific_params": {
"text-generation": {
"do_sample": true,
"max_length": 50
}
},
"vocab_size": 50257
}
GPT2Config {
"_my_arg_control": true,
"_my_arg_task_mode": "data2text",
"_my_arg_tune_mode": "prefixtune",
"activation_function": "gelu_new",
"architectures": [
"PrefixTuning"
],
"attn_pdrop": 0.1,
"bos_token_id": 50256,
"embd_pdrop": 0.1,
"eos_token_id": 50256,
"format_mode": "cat",
"init_random": "no",
"init_shallow": "no",
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"lowdata": false,
"mid_dim": 512,
"model_type": "gpt2",
"n_ctx": 1024,
"n_embd": 1024,
"n_head": 16,
"n_inner": null,
"n_layer": 24,
"n_positions": 1024,
"n_special": 0,
"optim_prefix": true,
"predict_special_tokens": true,
"prefix_dropout": 0.0,
"preseqlen": 5,
"resid_pdrop": 0.1,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"task_specific_params": {
"text-generation": {
"do_sample": true,
"max_length": 50
}
},
"train_weights": "no",
"use_infix": false,
"vocab_size": 50258
}
under the PrefixTuning model
PrefixTuning
preseqlen is 5, optimizing the prefix directly
[Full prefix-tuning Setting :) ]
torch.Size([5, 1024])
torch.Size([512, 1024])
torch.Size([512])
torch.Size([49152, 512])
torch.Size([49152])
total param is 25744896
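As a sanity check on my end (my own arithmetic, not output from the repo), the tensor shapes printed above account exactly for the reported parameter total; 49152 appears to be 2 (key/value) x 24 layers x 1024 hidden size:

# Shapes printed by the model: prefix embedding (5 x 1024), MLP hidden
# layer (512 x 1024 weight + 512 bias), and the projection to per-layer
# key/value activations (49152 x 512 weight + 49152 bias).
total = 5 * 1024 + 512 * 1024 + 512 + 49152 * 512 + 49152
print(total)  # 25744896, matching "total param is 25744896"

So the prefix model itself seems to load fine; the failure only happens at generation time.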
10/15/2021 20:19:00 - INFO - __main__ - Namespace(cache_dir='./cache/gpt2-medium-s3', control_dataless='no', control_mode='yes', device=device(type='cuda'), eval_dataset='valid', format_mode='cat', fp16=False, gen_dir='e2e_results_conv2', k=0, length=100, model_name_or_path='gpt2-medium', model_type='gpt2', n_gpu=1, no_cuda=False, num_return_sequences=5, objective_mode=2, optim_prefix='no', p=0.9, padding_text='', prefix='', prefixModel_name_or_path='/data/qbao775/PrefixTuning/gpt2/save_e2e_models_convcheck/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1', prefix_mode='activation', preseqlen=20, prompt='', repetition_penalty=1.0, seed=42, stop_token='[EOS]', task_mode='data2text', temperature=1.0, tokenizer_name='/data/qbao775/PrefixTuning/gpt2/save_e2e_models_convcheck/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1', tuning_mode='prefixtune', xlm_language='')
using the test path /data/qbao775/PrefixTuning/data/e2e_data/src1_valid.txt
/data/qbao775/PrefixTuning/gpt2/e2e_results_conv2/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1_valid_beam
/data/qbao775/PrefixTuning/gpt2/e2e_results_conv2/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1_valid_gold
547
Traceback (most recent call last):
File "run_generation.py", line 1356, in <module>
main()
File "run_generation.py", line 825, in main
write_e2e_corr(prompt_text_lst, prompt_text_dict, gold_dir)
File "run_generation.py", line 360, in write_e2e_corr
with open(corr_path, 'w') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/data/qbao775/PrefixTuning/gpt2/e2e_results_conv2/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1_valid_gold'
python run_generation.py --model_type=gpt2 --length 100 --model_name_or_path=gpt2-medium --num_return_sequences 5 --stop_token [EOS] --tokenizer_name=/data/qbao775/PrefixTuning/gpt2/save_e2e_models_convcheck/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1 --task_mode=data2text --control_mode=yes --tuning_mode prefixtune --gen_dir e2e_results_conv2 --eval_dataset test --optim_prefix no --preseqlen 20 --prefix_mode activation --format_mode cat --prefixModel_name_or_path /data/qbao775/PrefixTuning/gpt2/save_e2e_models_convcheck/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1 --cache_dir ./cache/gpt2-medium-s3
10/15/2021 20:19:02 - WARNING - __main__ - device: cuda, n_gpu: 1, 16-bits training: False
(model loading, GPT2Config, and parameter-count output identical to the valid run above; omitted for brevity)
10/15/2021 20:19:20 - INFO - __main__ - Namespace(cache_dir='./cache/gpt2-medium-s3', control_dataless='no', control_mode='yes', device=device(type='cuda'), eval_dataset='test', format_mode='cat', fp16=False, gen_dir='e2e_results_conv2', k=0, length=100, model_name_or_path='gpt2-medium', model_type='gpt2', n_gpu=1, no_cuda=False, num_return_sequences=5, objective_mode=2, optim_prefix='no', p=0.9, padding_text='', prefix='', prefixModel_name_or_path='/data/qbao775/PrefixTuning/gpt2/save_e2e_models_convcheck/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1', prefix_mode='activation', preseqlen=20, prompt='', repetition_penalty=1.0, seed=42, stop_token='[EOS]', task_mode='data2text', temperature=1.0, tokenizer_name='/data/qbao775/PrefixTuning/gpt2/save_e2e_models_convcheck/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1', tuning_mode='prefixtune', xlm_language='')
using the test path /data/qbao775/PrefixTuning/data/e2e_data/src1_test.txt
/data/qbao775/PrefixTuning/gpt2/e2e_results_conv2/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1_test_beam
/data/qbao775/PrefixTuning/gpt2/e2e_results_conv2/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1_test_gold
630
Traceback (most recent call last):
File "run_generation.py", line 1356, in <module>
main()
File "run_generation.py", line 825, in main
write_e2e_corr(prompt_text_lst, prompt_text_dict, gold_dir)
File "run_generation.py", line 360, in write_e2e_corr
with open(corr_path, 'w') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/data/qbao775/PrefixTuning/gpt2/e2e_results_conv2/data2textprefixtune_y_5_act_cat_b=10-e=5_d=0.0_u=no_lr=8e-05_w=0.0_s=101_r=n_m=512_o=1_o=1_test_gold'
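From the traceback, write_e2e_corr fails at open(corr_path, 'w'), and since open with 'w' creates the file itself, the only part of that path that can be missing is the output directory /data/qbao775/PrefixTuning/gpt2/e2e_results_conv2 (the gen_dir passed to run_generation.py). So my guess is that the script expects this directory to already exist. A minimal sketch of the workaround I plan to try, creating the directory before rerunning the generation commands (the makedirs call is my assumption, not something train_e2e.py does itself):

import os

# Directory run_generation.py tries to write the *_gold / *_beam files
# into; per the FileNotFoundError above, it does not exist yet.
gen_dir = "/data/qbao775/PrefixTuning/gpt2/e2e_results_conv2"

# Create it (and any missing parents); exist_ok makes this a no-op if
# the directory is already there.
os.makedirs(gen_dir, exist_ok=True)

(An equivalent one-liner would be mkdir -p on that path before rerunning.)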
Here is my environment configuration: