wasiahmad / PLBART

Official code of our work, Unified Pre-training for Program Understanding and Generation [NAACL 2021].
https://arxiv.org/abs/2103.06333
MIT License
186 stars 35 forks source link

Fine-tuning Error : " AttributeError: module 'sacrebleu' has no attribute '__version__' " #34

Closed openRiemann closed 2 years ago

openRiemann commented 2 years ago

Issue

I followed the Fine-tuning steps given here. But when I run the instructions given by step4, I get the following error:

/content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code
2022-03-29 10:12:08 | INFO | numexpr.utils | NumExpr defaulting to 2 threads.
2022-03-29 10:12:09 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
Traceback (most recent call last):
  File "/usr/local/bin/fairseq-train", line 5, in <module>
    from fairseq_cli.train import cli_main
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/train.py", line 30, in <module>
    from fairseq import checkpoint_utils, options, quantization_utils, tasks, utils
  File "/usr/local/lib/python3.7/dist-packages/fairseq/__init__.py", line 40, in <module>
    import fairseq.scoring  # noqa
  File "/usr/local/lib/python3.7/dist-packages/fairseq/scoring/__init__.py", line 55, in <module>
    importlib.import_module("fairseq.scoring." + module)
  File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/usr/local/lib/python3.7/dist-packages/fairseq/scoring/bleu.py", line 14, in <module>
    from fairseq.scoring.tokenizer import EvaluationTokenizer
  File "/usr/local/lib/python3.7/dist-packages/fairseq/scoring/tokenizer.py", line 12, in <module>
    SACREBLEU_V2_ABOVE = int(sb.__version__[0]) >= 2
AttributeError: module 'sacrebleu' has no attribute '__version__'
2022-03-29 10:12:12 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
Traceback (most recent call last):
  File "/usr/local/bin/fairseq-generate", line 5, in <module>
    from fairseq_cli.generate import cli_main
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 22, in <module>
    from fairseq import checkpoint_utils, options, scoring, tasks, utils
  File "/usr/local/lib/python3.7/dist-packages/fairseq/__init__.py", line 40, in <module>
    import fairseq.scoring  # noqa
  File "/usr/local/lib/python3.7/dist-packages/fairseq/scoring/__init__.py", line 55, in <module>
    importlib.import_module("fairseq.scoring." + module)
  File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/usr/local/lib/python3.7/dist-packages/fairseq/scoring/bleu.py", line 14, in <module>
    from fairseq.scoring.tokenizer import EvaluationTokenizer
  File "/usr/local/lib/python3.7/dist-packages/fairseq/scoring/tokenizer.py", line 12, in <module>
    SACREBLEU_V2_ABOVE = int(sb.__version__[0]) >= 2
AttributeError: module 'sacrebleu' has no attribute '__version__'
Traceback (most recent call last):
  File "evaluator.py", line 50, in <module>
    main()
  File "evaluator.py", line 26, in main
    assert len(preds) == len(gts), f"Samples of predictions and answers are not equal, {len(preds)}: {len(gts)}"
AssertionError: Samples of predictions and answers are not equal, 0: 2000
Traceback (most recent call last):
  File "/content/drive/MyDrive/nl2code/projects/plbart/PLBART/evaluation/CodeBLEU/calc_code_bleu.py", line 110, in <module>
    main()
  File "/content/drive/MyDrive/nl2code/projects/plbart/PLBART/evaluation/CodeBLEU/calc_code_bleu.py", line 85, in main
    assert len(hypothesis) == len(pre_references[i])
AssertionError
/content/drive/MyDrive/nl2code/projects/plbart/PLBART

Then I tried to reinstall sacrebleu

!pip uninstall -y sacrebleu
!pip install sacrebleu
import sacrebleu
sacrebleu.__version__
>>>'2.0.0'

Then I re-executed step4, but got new error:

/content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code
2022-03-29 10:28:29 | INFO | numexpr.utils | NumExpr defaulting to 2 threads.
2022-03-29 10:28:29 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
usage: fairseq-train [-h] [--no-progress-bar] [--log-interval LOG_INTERVAL]
                     [--log-format {json,none,simple,tqdm}]
                     [--log-file LOG_FILE]
                     [--tensorboard-logdir TENSORBOARD_LOGDIR]
                     [--wandb-project WANDB_PROJECT] [--azureml-logging]
                     [--seed SEED] [--cpu] [--tpu] [--bf16]
                     [--memory-efficient-bf16] [--fp16]
                     [--memory-efficient-fp16] [--fp16-no-flatten-grads]
                     [--fp16-init-scale FP16_INIT_SCALE]
                     [--fp16-scale-window FP16_SCALE_WINDOW]
                     [--fp16-scale-tolerance FP16_SCALE_TOLERANCE]
                     [--on-cpu-convert-precision]
                     [--min-loss-scale MIN_LOSS_SCALE]
                     [--threshold-loss-scale THRESHOLD_LOSS_SCALE] [--amp]
                     [--amp-batch-retries AMP_BATCH_RETRIES]
                     [--amp-init-scale AMP_INIT_SCALE]
                     [--amp-scale-window AMP_SCALE_WINDOW]
                     [--user-dir USER_DIR]
                     [--empty-cache-freq EMPTY_CACHE_FREQ]
                     [--all-gather-list-size ALL_GATHER_LIST_SIZE]
                     [--model-parallel-size MODEL_PARALLEL_SIZE]
                     [--quantization-config-path QUANTIZATION_CONFIG_PATH]
                     [--profile] [--reset-logging] [--suppress-crashes]
                     [--use-plasma-view] [--plasma-path PLASMA_PATH]
                     [--criterion {adaptive_loss,composite_loss,cross_entropy,ctc,fastspeech2,hubert,label_smoothed_cross_entropy,latency_augmented_label_smoothed_cross_entropy,label_smoothed_cross_entropy_with_alignment,label_smoothed_cross_entropy_with_ctc,legacy_masked_lm_loss,masked_lm,model,nat_loss,sentence_prediction,sentence_ranking,tacotron2,speech_to_unit,speech_to_spectrogram,speech_unit_lm_criterion,wav2vec,vocab_parallel_cross_entropy}]
                     [--tokenizer {moses,nltk,space}]
                     [--bpe {byte_bpe,bytes,characters,fastbpe,gpt2,bert,hf_byte_bpe,sentencepiece,subword_nmt}]
                     [--optimizer {adadelta,adafactor,adagrad,adam,adamax,composite,cpu_adam,lamb,nag,sgd}]
                     [--lr-scheduler {cosine,fixed,inverse_sqrt,manual,pass_through,polynomial_decay,reduce_lr_on_plateau,step,tri_stage,triangular}]
                     [--scoring {bert_score,sacrebleu,bleu,chrf,meteor,wer}]
                     [--task TASK] [--num-workers NUM_WORKERS]
                     [--skip-invalid-size-inputs-valid-test]
                     [--max-tokens MAX_TOKENS] [--batch-size BATCH_SIZE]
                     [--required-batch-size-multiple REQUIRED_BATCH_SIZE_MULTIPLE]
                     [--required-seq-len-multiple REQUIRED_SEQ_LEN_MULTIPLE]
                     [--dataset-impl {raw,lazy,cached,mmap,fasta,huffman}]
                     [--data-buffer-size DATA_BUFFER_SIZE]
                     [--train-subset TRAIN_SUBSET]
                     [--valid-subset VALID_SUBSET] [--combine-valid-subsets]
                     [--ignore-unused-valid-subsets]
                     [--validate-interval VALIDATE_INTERVAL]
                     [--validate-interval-updates VALIDATE_INTERVAL_UPDATES]
                     [--validate-after-updates VALIDATE_AFTER_UPDATES]
                     [--fixed-validation-seed FIXED_VALIDATION_SEED]
                     [--disable-validation]
                     [--max-tokens-valid MAX_TOKENS_VALID]
                     [--batch-size-valid BATCH_SIZE_VALID]
                     [--max-valid-steps MAX_VALID_STEPS]
                     [--curriculum CURRICULUM] [--gen-subset GEN_SUBSET]
                     [--num-shards NUM_SHARDS] [--shard-id SHARD_ID]
                     [--grouped-shuffling]
                     [--update-epoch-batch-itr UPDATE_EPOCH_BATCH_ITR]
                     [--update-ordered-indices-seed]
                     [--distributed-world-size DISTRIBUTED_WORLD_SIZE]
                     [--distributed-num-procs DISTRIBUTED_NUM_PROCS]
                     [--distributed-rank DISTRIBUTED_RANK]
                     [--distributed-backend DISTRIBUTED_BACKEND]
                     [--distributed-init-method DISTRIBUTED_INIT_METHOD]
                     [--distributed-port DISTRIBUTED_PORT]
                     [--device-id DEVICE_ID] [--distributed-no-spawn]
                     [--ddp-backend {c10d,fully_sharded,legacy_ddp,no_c10d,pytorch_ddp,slowmo}]
                     [--ddp-comm-hook {none,fp16}]
                     [--bucket-cap-mb BUCKET_CAP_MB] [--fix-batches-to-gpus]
                     [--find-unused-parameters] [--gradient-as-bucket-view]
                     [--fast-stat-sync]
                     [--heartbeat-timeout HEARTBEAT_TIMEOUT]
                     [--broadcast-buffers] [--slowmo-momentum SLOWMO_MOMENTUM]
                     [--slowmo-base-algorithm SLOWMO_BASE_ALGORITHM]
                     [--localsgd-frequency LOCALSGD_FREQUENCY]
                     [--nprocs-per-node NPROCS_PER_NODE]
                     [--pipeline-model-parallel]
                     [--pipeline-balance PIPELINE_BALANCE]
                     [--pipeline-devices PIPELINE_DEVICES]
                     [--pipeline-chunks PIPELINE_CHUNKS]
                     [--pipeline-encoder-balance PIPELINE_ENCODER_BALANCE]
                     [--pipeline-encoder-devices PIPELINE_ENCODER_DEVICES]
                     [--pipeline-decoder-balance PIPELINE_DECODER_BALANCE]
                     [--pipeline-decoder-devices PIPELINE_DECODER_DEVICES]
                     [--pipeline-checkpoint {always,never,except_last}]
                     [--zero-sharding {none,os}] [--no-reshard-after-forward]
                     [--fp32-reduce-scatter] [--cpu-offload]
                     [--use-sharded-state] [--not-fsdp-flatten-parameters]
                     [--arch ARCH] [--max-epoch MAX_EPOCH]
                     [--max-update MAX_UPDATE]
                     [--stop-time-hours STOP_TIME_HOURS]
                     [--clip-norm CLIP_NORM] [--sentence-avg]
                     [--update-freq UPDATE_FREQ] [--lr LR]
                     [--stop-min-lr STOP_MIN_LR] [--use-bmuf]
                     [--skip-remainder-batch] [--save-dir SAVE_DIR]
                     [--restore-file RESTORE_FILE]
                     [--continue-once CONTINUE_ONCE]
                     [--finetune-from-model FINETUNE_FROM_MODEL]
                     [--reset-dataloader] [--reset-lr-scheduler]
                     [--reset-meters] [--reset-optimizer]
                     [--optimizer-overrides OPTIMIZER_OVERRIDES]
                     [--save-interval SAVE_INTERVAL]
                     [--save-interval-updates SAVE_INTERVAL_UPDATES]
                     [--keep-interval-updates KEEP_INTERVAL_UPDATES]
                     [--keep-interval-updates-pattern KEEP_INTERVAL_UPDATES_PATTERN]
                     [--keep-last-epochs KEEP_LAST_EPOCHS]
                     [--keep-best-checkpoints KEEP_BEST_CHECKPOINTS]
                     [--no-save] [--no-epoch-checkpoints]
                     [--no-last-checkpoints] [--no-save-optimizer-state]
                     [--best-checkpoint-metric BEST_CHECKPOINT_METRIC]
                     [--maximize-best-checkpoint-metric] [--patience PATIENCE]
                     [--checkpoint-suffix CHECKPOINT_SUFFIX]
                     [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT]
                     [--load-checkpoint-on-all-dp-ranks]
                     [--write-checkpoints-asynchronously] [--store-ema]
                     [--ema-decay EMA_DECAY]
                     [--ema-start-update EMA_START_UPDATE]
                     [--ema-seed-model EMA_SEED_MODEL]
                     [--ema-update-freq EMA_UPDATE_FREQ] [--ema-fp32]
                     [--activation-fn {relu,gelu,gelu_fast,gelu_accurate,tanh,linear}]
                     [--dropout DROPOUT]
                     [--attention-dropout ATTENTION_DROPOUT]
                     [--activation-dropout ACTIVATION_DROPOUT]
                     [--adaptive-input]
                     [--encoder-embed-path ENCODER_EMBED_PATH]
                     [--encoder-embed-dim ENCODER_EMBED_DIM]
                     [--encoder-ffn-embed-dim ENCODER_FFN_EMBED_DIM]
                     [--encoder-layers ENCODER_LAYERS]
                     [--encoder-attention-heads ENCODER_ATTENTION_HEADS]
                     [--encoder-normalize-before] [--encoder-learned-pos]
                     [--encoder-layerdrop ENCODER_LAYERDROP]
                     [--encoder-layers-to-keep ENCODER_LAYERS_TO_KEEP]
                     [--max-source-positions MAX_SOURCE_POSITIONS]
                     [--decoder-embed-path DECODER_EMBED_PATH]
                     [--decoder-embed-dim DECODER_EMBED_DIM]
                     [--decoder-ffn-embed-dim DECODER_FFN_EMBED_DIM]
                     [--decoder-layers DECODER_LAYERS]
                     [--decoder-attention-heads DECODER_ATTENTION_HEADS]
                     [--decoder-normalize-before] [--decoder-learned-pos]
                     [--decoder-layerdrop DECODER_LAYERDROP]
                     [--decoder-layers-to-keep DECODER_LAYERS_TO_KEEP]
                     [--decoder-output-dim DECODER_OUTPUT_DIM]
                     [--max-target-positions MAX_TARGET_POSITIONS]
                     [--share-decoder-input-output-embed]
                     [--share-all-embeddings]
                     [--no-token-positional-embeddings]
                     [--adaptive-softmax-cutoff ADAPTIVE_SOFTMAX_CUTOFF]
                     [--adaptive-softmax-dropout ADAPTIVE_SOFTMAX_DROPOUT]
                     [--adaptive-softmax-factor ADAPTIVE_SOFTMAX_FACTOR]
                     [--layernorm-embedding] [--tie-adaptive-weights]
                     [--tie-adaptive-proj] [--no-scale-embedding]
                     [--checkpoint-activations] [--offload-activations]
                     [--no-cross-attention] [--cross-self-attention]
                     [--quant-noise-pq QUANT_NOISE_PQ]
                     [--quant-noise-pq-block-size QUANT_NOISE_PQ_BLOCK_SIZE]
                     [--quant-noise-scalar QUANT_NOISE_SCALAR]
                     [--min-params-to-wrap MIN_PARAMS_TO_WRAP] [--char-inputs]
                     [--relu-dropout RELU_DROPOUT] [--base-layers BASE_LAYERS]
                     [--base-sublayers BASE_SUBLAYERS]
                     [--base-shuffle BASE_SHUFFLE] [--export]
                     [--no-decoder-final-norm] [--pooler-dropout D]
                     [--pooler-activation-fn {relu,gelu,gelu_fast,gelu_accurate,tanh,linear}]
                     [--spectral-norm-classification-head]
                     [--source-lang SOURCE_LANG] [--target-lang TARGET_LANG]
                     [--load-alignments] [--left-pad-source]
                     [--left-pad-target] [--upsample-primary UPSAMPLE_PRIMARY]
                     [--truncate-source]
                     [--num-batch-buckets NUM_BATCH_BUCKETS] [--eval-bleu]
                     [--eval-bleu-args EVAL_BLEU_ARGS]
                     [--eval-bleu-detok EVAL_BLEU_DETOK]
                     [--eval-bleu-detok-args EVAL_BLEU_DETOK_ARGS]
                     [--eval-tokenized-bleu]
                     [--eval-bleu-remove-bpe [EVAL_BLEU_REMOVE_BPE]]
                     [--eval-bleu-print-samples] [--langs LANG]
                     [--prepend-bos] [--label-smoothing LABEL_SMOOTHING]
                     [--report-accuracy]
                     [--ignore-prefix-size IGNORE_PREFIX_SIZE]
                     [--sentencepiece-model SENTENCEPIECE_MODEL]
                     [--sentencepiece-enable-sampling]
                     [--sentencepiece-alpha SENTENCEPIECE_ALPHA]
                     [--adam-betas ADAM_BETAS] [--adam-eps ADAM_EPS]
                     [--weight-decay WEIGHT_DECAY] [--use-old-adam]
                     [--fp16-adam-stats] [--warmup-updates WARMUP_UPDATES]
                     [--force-anneal FORCE_ANNEAL]
                     [--end-learning-rate END_LEARNING_RATE] [--power POWER]
                     [--total-num-update TOTAL_NUM_UPDATE] [--pad PAD]
                     [--eos EOS] [--unk UNK]
                     data
fairseq-train: error: unrecognized arguments: --min-lr -1
2022-03-29 10:28:32 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
2022-03-29 10:28:35 | INFO | fairseq_cli.generate | {'_name': None, 'common': {'_name': None, 'no_progress_bar': False, 'log_interval': 100, 'log_format': None, 'log_file': None, 'tensorboard_logdir': None, 'wandb_project': None, 'azureml_logging': False, 'seed': 1, 'cpu': False, 'tpu': False, 'bf16': False, 'memory_efficient_bf16': False, 'fp16': False, 'memory_efficient_fp16': False, 'fp16_no_flatten_grads': False, 'fp16_init_scale': 128, 'fp16_scale_window': None, 'fp16_scale_tolerance': 0.0, 'on_cpu_convert_precision': False, 'min_loss_scale': 0.0001, 'threshold_loss_scale': None, 'amp': False, 'amp_batch_retries': 2, 'amp_init_scale': 128, 'amp_scale_window': None, 'user_dir': None, 'empty_cache_freq': 0, 'all_gather_list_size': 16384, 'model_parallel_size': 1, 'quantization_config_path': None, 'profile': False, 'reset_logging': False, 'suppress_crashes': False, 'use_plasma_view': False, 'plasma_path': '/tmp/plasma'}, 'common_eval': {'_name': None, 'path': '/content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode/checkpoint_best.pt', 'post_process': 'sentencepiece', 'quiet': False, 'model_overrides': '{}', 'results_path': None}, 'distributed_training': {'_name': None, 'distributed_world_size': 1, 'distributed_num_procs': 1, 'distributed_rank': 0, 'distributed_backend': 'nccl', 'distributed_init_method': None, 'distributed_port': -1, 'device_id': 0, 'distributed_no_spawn': False, 'ddp_backend': 'pytorch_ddp', 'ddp_comm_hook': 'none', 'bucket_cap_mb': 25, 'fix_batches_to_gpus': False, 'find_unused_parameters': False, 'gradient_as_bucket_view': False, 'fast_stat_sync': False, 'heartbeat_timeout': -1, 'broadcast_buffers': False, 'slowmo_momentum': None, 'slowmo_base_algorithm': 'localsgd', 'localsgd_frequency': 3, 'nprocs_per_node': 1, 'pipeline_model_parallel': False, 'pipeline_balance': None, 'pipeline_devices': None, 'pipeline_chunks': 0, 'pipeline_encoder_balance': None, 'pipeline_encoder_devices': None, 'pipeline_decoder_balance': None, 'pipeline_decoder_devices': None, 'pipeline_checkpoint': 'never', 'zero_sharding': 'none', 'fp16': False, 'memory_efficient_fp16': False, 'tpu': False, 'no_reshard_after_forward': False, 'fp32_reduce_scatter': False, 'cpu_offload': False, 'use_sharded_state': False, 'not_fsdp_flatten_parameters': False}, 'dataset': {'_name': None, 'num_workers': 1, 'skip_invalid_size_inputs_valid_test': False, 'max_tokens': None, 'batch_size': 4, 'required_batch_size_multiple': 8, 'required_seq_len_multiple': 1, 'dataset_impl': None, 'data_buffer_size': 10, 'train_subset': 'train', 'valid_subset': 'valid', 'combine_valid_subsets': None, 'ignore_unused_valid_subsets': False, 'validate_interval': 1, 'validate_interval_updates': 0, 'validate_after_updates': 0, 'fixed_validation_seed': None, 'disable_validation': False, 'max_tokens_valid': None, 'batch_size_valid': 4, 'max_valid_steps': None, 'curriculum': 0, 'gen_subset': 'test', 'num_shards': 1, 'shard_id': 0, 'grouped_shuffling': False, 'update_epoch_batch_itr': False, 'update_ordered_indices_seed': False}, 'optimization': {'_name': None, 'max_epoch': 0, 'max_update': 0, 'stop_time_hours': 0.0, 'clip_norm': 0.0, 'sentence_avg': False, 'update_freq': [1], 'lr': [0.25], 'stop_min_lr': -1.0, 'use_bmuf': False, 'skip_remainder_batch': False}, 'checkpoint': {'_name': None, 'save_dir': 'checkpoints', 'restore_file': 'checkpoint_last.pt', 'continue_once': None, 'finetune_from_model': None, 'reset_dataloader': False, 'reset_lr_scheduler': False, 'reset_meters': False, 'reset_optimizer': False, 'optimizer_overrides': '{}', 'save_interval': 1, 'save_interval_updates': 0, 'keep_interval_updates': -1, 'keep_interval_updates_pattern': -1, 'keep_last_epochs': -1, 'keep_best_checkpoints': -1, 'no_save': False, 'no_epoch_checkpoints': False, 'no_last_checkpoints': False, 'no_save_optimizer_state': False, 'best_checkpoint_metric': 'loss', 'maximize_best_checkpoint_metric': False, 'patience': -1, 'checkpoint_suffix': '', 'checkpoint_shard_count': 1, 'load_checkpoint_on_all_dp_ranks': False, 'write_checkpoints_asynchronously': False, 'model_parallel_size': 1}, 'bmuf': {'_name': None, 'block_lr': 1.0, 'block_momentum': 0.875, 'global_sync_iter': 50, 'warmup_iterations': 500, 'use_nbm': False, 'average_sync': False, 'distributed_world_size': 1}, 'generation': {'_name': None, 'beam': 10, 'nbest': 1, 'max_len_a': 0.0, 'max_len_b': 200, 'min_len': 1, 'match_source_len': False, 'unnormalized': False, 'no_early_stop': False, 'no_beamable_mm': False, 'lenpen': 1.0, 'unkpen': 0.0, 'replace_unk': None, 'sacrebleu': True, 'score_reference': False, 'prefix_size': 0, 'no_repeat_ngram_size': 0, 'sampling': False, 'sampling_topk': -1, 'sampling_topp': -1.0, 'constraints': None, 'temperature': 1.0, 'diverse_beam_groups': -1, 'diverse_beam_strength': 0.5, 'diversity_rate': -1.0, 'print_alignment': None, 'print_step': False, 'lm_path': None, 'lm_weight': 0.0, 'iter_decode_eos_penalty': 0.0, 'iter_decode_max_iter': 10, 'iter_decode_force_max_iter': False, 'iter_decode_with_beam': 1, 'iter_decode_with_external_reranker': False, 'retain_iter_history': False, 'retain_dropout': False, 'retain_dropout_modules': None, 'decoding_format': None, 'no_seed_provided': False}, 'eval_lm': {'_name': None, 'output_word_probs': False, 'output_word_stats': False, 'context_window': 0, 'softmax_batch': 9223372036854775807}, 'interactive': {'_name': None, 'buffer_size': 0, 'input': '-'}, 'model': {'_name': 'wav2vec2', 'extractor_mode': 'default', 'encoder_layers': 12, 'encoder_embed_dim': 768, 'encoder_ffn_embed_dim': 3072, 'encoder_attention_heads': 12, 'activation_fn': 'gelu', 'layer_type': 'transformer', 'dropout': 0.1, 'attention_dropout': 0.1, 'activation_dropout': 0.0, 'encoder_layerdrop': 0.0, 'dropout_input': 0.0, 'dropout_features': 0.0, 'final_dim': 0, 'layer_norm_first': False, 'conv_feature_layers': '[(512, 10, 5)] + [(512, 3, 2)] * 4 + [(512,2,2)] + [(512,2,2)]', 'conv_bias': False, 'logit_temp': 0.1, 'quantize_targets': False, 'quantize_input': False, 'same_quantizer': False, 'target_glu': False, 'feature_grad_mult': 1.0, 'quantizer_depth': 1, 'quantizer_factor': 3, 'latent_vars': 320, 'latent_groups': 2, 'latent_dim': 0, 'mask_length': 10, 'mask_prob': 0.65, 'mask_selection': 'static', 'mask_other': 0.0, 'no_mask_overlap': False, 'mask_min_space': 1, 'require_same_masks': True, 'mask_dropout': 0.0, 'mask_channel_length': 10, 'mask_channel_prob': 0.0, 'mask_channel_before': False, 'mask_channel_selection': 'static', 'mask_channel_other': 0.0, 'no_mask_channel_overlap': False, 'mask_channel_min_space': 1, 'num_negatives': 100, 'negatives_from_everywhere': False, 'cross_sample_negatives': 0, 'codebook_negatives': 0, 'conv_pos': 128, 'conv_pos_groups': 16, 'pos_conv_depth': 1, 'latent_temp': [2.0, 0.5, 0.999995], 'max_positions': 100000, 'checkpoint_activations': False, 'required_seq_len_multiple': 1, 'crop_seq_to_multiple': 1, 'depthwise_conv_kernel_size': 31, 'attn_type': '', 'pos_enc_type': 'abs', 'fp16': False}, 'task': Namespace(_name='translation_from_pretrained_bart', all_gather_list_size=16384, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, arch='wav2vec2', azureml_logging=False, batch_size=4, batch_size_valid=4, beam=10, best_checkpoint_metric='loss', bf16=False, bpe=None, broadcast_buffers=False, bucket_cap_mb=25, checkpoint_shard_count=1, checkpoint_suffix='', combine_valid_subsets=None, constraints=None, continue_once=None, cpu=False, cpu_offload=False, criterion='cross_entropy', curriculum=0, data='/content/drive/MyDrive/nl2code/projects/plbart/PLBART/data/codeXglue/text-to-code/concode/data-bin', data_buffer_size=10, dataset_impl=None, ddp_backend='pytorch_ddp', ddp_comm_hook='none', decoding_format=None, device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method=None, distributed_no_spawn=False, distributed_num_procs=1, distributed_port=-1, distributed_rank=0, distributed_world_size=1, diverse_beam_groups=-1, diverse_beam_strength=0.5, diversity_rate=-1.0, empty_cache_freq=0, eos=2, eval_bleu=False, eval_bleu_args='{}', eval_bleu_detok='space', eval_bleu_detok_args='{}', eval_bleu_print_samples=False, eval_bleu_remove_bpe=None, eval_tokenized_bleu=False, fast_stat_sync=False, find_unused_parameters=False, finetune_from_model=None, fix_batches_to_gpus=False, fixed_validation_seed=None, force_anneal=None, fp16=False, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, fp32_reduce_scatter=False, gen_subset='test', gradient_as_bucket_view=False, grouped_shuffling=False, heartbeat_timeout=-1, ignore_unused_valid_subsets=False, iter_decode_eos_penalty=0.0, iter_decode_force_max_iter=False, iter_decode_max_iter=10, iter_decode_with_beam=1, iter_decode_with_external_reranker=False, keep_best_checkpoints=-1, keep_interval_updates=-1, keep_interval_updates_pattern=-1, keep_last_epochs=-1, langs='java,python,en_XX', left_pad_source=True, left_pad_target=False, lenpen=1.0, lm_path=None, lm_weight=0.0, load_alignments=False, load_checkpoint_on_all_dp_ranks=False, localsgd_frequency=3, log_file=None, log_format=None, log_interval=100, lr_scheduler='fixed', lr_shrink=0.1, match_source_len=False, max_len_a=0, max_len_b=200, max_source_positions=1024, max_target_positions=1024, max_tokens=None, max_tokens_valid=None, max_valid_steps=None, maximize_best_checkpoint_metric=False, memory_efficient_bf16=False, memory_efficient_fp16=False, min_len=1, min_loss_scale=0.0001, model_overrides='{}', model_parallel_size=1, nbest=1, no_beamable_mm=False, no_early_stop=False, no_epoch_checkpoints=False, no_last_checkpoints=False, no_progress_bar=False, no_repeat_ngram_size=0, no_reshard_after_forward=False, no_save=False, no_save_optimizer_state=False, no_seed_provided=False, not_fsdp_flatten_parameters=False, nprocs_per_node=1, num_batch_buckets=0, num_shards=1, num_workers=1, on_cpu_convert_precision=False, optimizer=None, optimizer_overrides='{}', pad=1, path='/content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode/checkpoint_best.pt', patience=-1, pipeline_balance=None, pipeline_checkpoint='never', pipeline_chunks=0, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_devices=None, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_model_parallel=False, plasma_path='/tmp/plasma', post_process='sentencepiece', prefix_size=0, prepend_bos=False, print_alignment=None, print_step=False, profile=False, quantization_config_path=None, quiet=False, replace_unk=None, required_batch_size_multiple=8, required_seq_len_multiple=1, reset_dataloader=False, reset_logging=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoint_last.pt', results_path=None, retain_dropout=False, retain_dropout_modules=None, retain_iter_history=False, sacrebleu=True, sampling=False, sampling_topk=-1, sampling_topp=-1.0, save_dir='checkpoints', save_interval=1, save_interval_updates=0, score_reference=False, scoring='bleu', seed=1, shard_id=0, skip_invalid_size_inputs_valid_test=False, slowmo_base_algorithm='localsgd', slowmo_momentum=None, source_lang='en_XX', suppress_crashes=False, target_lang='java', task='translation_from_pretrained_bart', temperature=1.0, tensorboard_logdir=None, threshold_loss_scale=None, tokenizer=None, tpu=False, train_subset='train', truncate_source=False, unk=3, unkpen=0, unnormalized=False, update_epoch_batch_itr=False, update_ordered_indices_seed=False, upsample_primary=-1, use_plasma_view=False, use_sharded_state=False, user_dir=None, valid_subset='valid', validate_after_updates=0, validate_interval=1, validate_interval_updates=0, wandb_project=None, warmup_updates=0, write_checkpoints_asynchronously=False, zero_sharding='none'), 'criterion': {'_name': 'cross_entropy', 'sentence_avg': True}, 'optimizer': None, 'lr_scheduler': {'_name': 'fixed', 'force_anneal': None, 'lr_shrink': 0.1, 'warmup_updates': 0, 'lr': [0.25]}, 'scoring': {'_name': 'bleu', 'pad': 1, 'eos': 2, 'unk': 3}, 'bpe': None, 'tokenizer': None, 'ema': {'_name': None, 'store_ema': False, 'ema_decay': 0.9999, 'ema_start_update': 0, 'ema_seed_model': None, 'ema_update_freq': 1, 'ema_fp32': False}}
2022-03-29 10:28:36 | INFO | fairseq.tasks.translation | [en_XX] dictionary: 50001 types
2022-03-29 10:28:36 | INFO | fairseq.tasks.translation | [java] dictionary: 50001 types
2022-03-29 10:28:36 | INFO | fairseq_cli.generate | loading model(s) from /content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode/checkpoint_best.pt
Traceback (most recent call last):
  File "/usr/local/bin/fairseq-generate", line 8, in <module>
    sys.exit(cli_main())
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 413, in cli_main
    main(args)
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 50, in main
    return _main(cfg, sys.stdout)
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 102, in _main
    num_shards=cfg.checkpoint.checkpoint_shard_count,
  File "/usr/local/lib/python3.7/dist-packages/fairseq/checkpoint_utils.py", line 370, in load_model_ensemble
    state,
  File "/usr/local/lib/python3.7/dist-packages/fairseq/checkpoint_utils.py", line 419, in load_model_ensemble_and_task
    raise IOError("Model file not found: {}".format(filename))
OSError: Model file not found: /content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode/checkpoint_best.pt
Traceback (most recent call last):
  File "evaluator.py", line 50, in <module>
    main()
  File "evaluator.py", line 26, in main
    assert len(preds) == len(gts), f"Samples of predictions and answers are not equal, {len(preds)}: {len(gts)}"
AssertionError: Samples of predictions and answers are not equal, 0: 2000
Traceback (most recent call last):
  File "/content/drive/MyDrive/nl2code/projects/plbart/PLBART/evaluation/CodeBLEU/calc_code_bleu.py", line 110, in <module>
    main()
  File "/content/drive/MyDrive/nl2code/projects/plbart/PLBART/evaluation/CodeBLEU/calc_code_bleu.py", line 85, in main
    assert len(hypothesis) == len(pre_references[i])
AssertionError
/content/drive/MyDrive/nl2code/projects/plbart/PLBART

Then I change the arguments: --min-lr to --stop-min-lr according to Command-line Tools in FairSeq's documentation. Unfortunately, another error:

/content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code
2022-03-29 10:40:42 | INFO | numexpr.utils | NumExpr defaulting to 2 threads.
2022-03-29 10:40:42 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
2022-03-29 10:40:45 | ERROR | fairseq.dataclass.utils | Error when composing. Overrides: ['common.no_progress_bar=False', 'common.log_interval=100', "common.log_format='json'", 'common.log_file=null', 'common.tensorboard_logdir=null', 'common.wandb_project=null', 'common.azureml_logging=False', 'common.seed=1234', 'common.cpu=False', 'common.tpu=False', 'common.bf16=False', 'common.memory_efficient_bf16=False', 'common.fp16=False', 'common.memory_efficient_fp16=False', 'common.fp16_no_flatten_grads=False', 'common.fp16_init_scale=128', 'common.fp16_scale_window=null', 'common.fp16_scale_tolerance=0.0', 'common.on_cpu_convert_precision=False', 'common.min_loss_scale=0.0001', 'common.threshold_loss_scale=null', 'common.amp=False', 'common.amp_batch_retries=2', 'common.amp_init_scale=128', 'common.amp_scale_window=null', 'common.user_dir=null', 'common.empty_cache_freq=0', 'common.all_gather_list_size=16384', 'common.model_parallel_size=1', 'common.quantization_config_path=null', 'common.profile=False', 'common.reset_logging=False', 'common.suppress_crashes=False', 'common.use_plasma_view=False', "common.plasma_path='/tmp/plasma'", 'common_eval.path=null', 'common_eval.post_process=null', 'common_eval.quiet=False', "common_eval.model_overrides='{}'", 'common_eval.results_path=null', 'distributed_training.distributed_world_size=1', 'distributed_training.distributed_num_procs=1', 'distributed_training.distributed_rank=0', "distributed_training.distributed_backend='nccl'", 'distributed_training.distributed_init_method=null', 'distributed_training.distributed_port=-1', 'distributed_training.device_id=0', 'distributed_training.distributed_no_spawn=False', "distributed_training.ddp_backend='no_c10d'", "distributed_training.ddp_comm_hook='none'", 'distributed_training.bucket_cap_mb=25', 'distributed_training.fix_batches_to_gpus=False', 'distributed_training.find_unused_parameters=False', 'distributed_training.gradient_as_bucket_view=False', 'distributed_training.fast_stat_sync=False', 'distributed_training.heartbeat_timeout=-1', 'distributed_training.broadcast_buffers=False', 'distributed_training.slowmo_momentum=null', "distributed_training.slowmo_base_algorithm='localsgd'", 'distributed_training.localsgd_frequency=3', 'distributed_training.nprocs_per_node=1', 'distributed_training.pipeline_model_parallel=False', 'distributed_training.pipeline_balance=null', 'distributed_training.pipeline_devices=null', 'distributed_training.pipeline_chunks=0', 'distributed_training.pipeline_encoder_balance=null', 'distributed_training.pipeline_encoder_devices=null', 'distributed_training.pipeline_decoder_balance=null', 'distributed_training.pipeline_decoder_devices=null', "distributed_training.pipeline_checkpoint='never'", "distributed_training.zero_sharding='none'", 'distributed_training.fp16=False', 'distributed_training.memory_efficient_fp16=False', 'distributed_training.tpu=False', 'distributed_training.no_reshard_after_forward=False', 'distributed_training.fp32_reduce_scatter=False', 'distributed_training.cpu_offload=False', 'distributed_training.use_sharded_state=False', 'distributed_training.not_fsdp_flatten_parameters=False', 'dataset.num_workers=1', 'dataset.skip_invalid_size_inputs_valid_test=False', 'dataset.max_tokens=null', 'dataset.batch_size=8', 'dataset.required_batch_size_multiple=8', 'dataset.required_seq_len_multiple=1', 'dataset.dataset_impl=null', 'dataset.data_buffer_size=10', "dataset.train_subset='train'", "dataset.valid_subset='valid'", 'dataset.combine_valid_subsets=null', 'dataset.ignore_unused_valid_subsets=False', 'dataset.validate_interval=1', 'dataset.validate_interval_updates=0', 'dataset.validate_after_updates=0', 'dataset.fixed_validation_seed=null', 'dataset.disable_validation=False', 'dataset.max_tokens_valid=null', 'dataset.batch_size_valid=8', 'dataset.max_valid_steps=null', 'dataset.curriculum=0', "dataset.gen_subset='test'", 'dataset.num_shards=1', 'dataset.shard_id=0', 'dataset.grouped_shuffling=False', 'dataset.update_epoch_batch_itr=False', 'dataset.update_ordered_indices_seed=False', 'optimization.max_epoch=30', 'optimization.max_update=200000', 'optimization.stop_time_hours=0.0', 'optimization.clip_norm=0.0', 'optimization.sentence_avg=False', 'optimization.update_freq=[3]', 'optimization.lr=[5e-05]', 'optimization.stop_min_lr=-1.0', 'optimization.use_bmuf=False', 'optimization.skip_remainder_batch=False', "checkpoint.save_dir='/content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode'", "checkpoint.restore_file='/content/drive/MyDrive/nl2code/projects/plbart/PLBART/pretrain/plbart_base.pt'", 'checkpoint.continue_once=null', 'checkpoint.finetune_from_model=null', 'checkpoint.reset_dataloader=True', 'checkpoint.reset_lr_scheduler=True', 'checkpoint.reset_meters=True', 'checkpoint.reset_optimizer=True', "checkpoint.optimizer_overrides='{}'", 'checkpoint.save_interval=1', 'checkpoint.save_interval_updates=0', 'checkpoint.keep_interval_updates=-1', 'checkpoint.keep_interval_updates_pattern=-1', 'checkpoint.keep_last_epochs=-1', 'checkpoint.keep_best_checkpoints=-1', 'checkpoint.no_save=False', 'checkpoint.no_epoch_checkpoints=True', 'checkpoint.no_last_checkpoints=False', 'checkpoint.no_save_optimizer_state=False', "checkpoint.best_checkpoint_metric='bleu'", 'checkpoint.maximize_best_checkpoint_metric=True', 'checkpoint.patience=3', "checkpoint.checkpoint_suffix=''", 'checkpoint.checkpoint_shard_count=1', 'checkpoint.load_checkpoint_on_all_dp_ranks=False', 'checkpoint.write_checkpoints_asynchronously=False', 'checkpoint.model_parallel_size=1', 'bmuf.block_lr=1.0', 'bmuf.block_momentum=0.875', 'bmuf.global_sync_iter=50', 'bmuf.warmup_iterations=500', 'bmuf.use_nbm=False', 'bmuf.average_sync=False', 'bmuf.distributed_world_size=1', 'generation.beam=5', 'generation.nbest=1', 'generation.max_len_a=0.0', 'generation.max_len_b=200', 'generation.min_len=1', 'generation.match_source_len=False', 'generation.unnormalized=False', 'generation.no_early_stop=False', 'generation.no_beamable_mm=False', 'generation.lenpen=1.0', 'generation.unkpen=0.0', 'generation.replace_unk=null', 'generation.sacrebleu=False', 'generation.score_reference=False', 'generation.prefix_size=0', 'generation.no_repeat_ngram_size=0', 'generation.sampling=False', 'generation.sampling_topk=-1', 'generation.sampling_topp=-1.0', 'generation.constraints=null', 'generation.temperature=1.0', 'generation.diverse_beam_groups=-1', 'generation.diverse_beam_strength=0.5', 'generation.diversity_rate=-1.0', 'generation.print_alignment=null', 'generation.print_step=False', 'generation.lm_path=null', 'generation.lm_weight=0.0', 'generation.iter_decode_eos_penalty=0.0', 'generation.iter_decode_max_iter=10', 'generation.iter_decode_force_max_iter=False', 'generation.iter_decode_with_beam=1', 'generation.iter_decode_with_external_reranker=False', 'generation.retain_iter_history=False', 'generation.retain_dropout=False', 'generation.retain_dropout_modules=null', 'generation.decoding_format=null', 'generation.no_seed_provided=False', 'eval_lm.output_word_probs=False', 'eval_lm.output_word_stats=False', 'eval_lm.context_window=0', 'eval_lm.softmax_batch=9223372036854775807', 'interactive.buffer_size=0', "interactive.input='-'", 'ema.store_ema=False', 'ema.ema_decay=0.9999', 'ema.ema_start_update=0', 'ema.ema_seed_model=null', 'ema.ema_update_freq=1', 'ema.ema_fp32=False', 'criterion=label_smoothed_cross_entropy', 'criterion._name=label_smoothed_cross_entropy', 'criterion.label_smoothing=0.2', 'criterion.report_accuracy=False', 'criterion.ignore_prefix_size=0', 'criterion.sentence_avg=False', 'bpe=sentencepiece', 'bpe._name=sentencepiece', "bpe.sentencepiece_model='/content/drive/MyDrive/nl2code/projects/plbart/PLBART/sentencepiece/sentencepiece.bpe.model'", 'bpe.sentencepiece_enable_sampling=False', 'bpe.sentencepiece_alpha=null', 'optimizer=adam', 'optimizer._name=adam', "optimizer.adam_betas='(0.9, 0.98)'", 'optimizer.adam_eps=1e-06', 'optimizer.weight_decay=0.01', 'optimizer.use_old_adam=False', 'optimizer.fp16_adam_stats=False', 'optimizer.tpu=False', 'optimizer.lr=[5e-05]', 'lr_scheduler=polynomial_decay', 'lr_scheduler._name=polynomial_decay', 'lr_scheduler.warmup_updates=1000', 'lr_scheduler.force_anneal=null', 'lr_scheduler.end_learning_rate=0.0', 'lr_scheduler.power=1.0', 'lr_scheduler.total_num_update=null', 'lr_scheduler.lr=[5e-05]', 'scoring=bleu', 'scoring._name=bleu', 'scoring.pad=1', 'scoring.eos=2', 'scoring.unk=3']
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/hydra/_internal/config_loader_impl.py", line 513, in _apply_overrides_to_config
    OmegaConf.update(cfg, key, value, merge=True)
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/omegaconf.py", line 613, in update
    root.__setattr__(last_key, value)
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/dictconfig.py", line 285, in __setattr__
    raise e
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/dictconfig.py", line 282, in __setattr__
    self.__set_impl(key, value)
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/dictconfig.py", line 266, in __set_impl
    self._set_item_impl(key, value)
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/basecontainer.py", line 398, in _set_item_impl
    self._validate_set(key, value)
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/dictconfig.py", line 143, in _validate_set
    self._validate_set_merge_impl(key, value, is_assign=True)
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/dictconfig.py", line 159, in _validate_set_merge_impl
    cause=ValidationError("child '$FULL_KEY' is not Optional"),
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/base.py", line 101, in _format_and_raise
    type_override=type_override,
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/_utils.py", line 694, in format_and_raise
    _raise(ex, cause)
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/_utils.py", line 610, in _raise
    raise ex  # set end OC_CAUSE=1 for full backtrace
omegaconf.errors.ValidationError: child 'lr_scheduler.total_num_update' is not Optional
    full_key: lr_scheduler.total_num_update
    reference_type=Optional[PolynomialDecayLRScheduleConfig]
    object_type=PolynomialDecayLRScheduleConfig

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/fairseq-train", line 8, in <module>
    sys.exit(cli_main())
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/train.py", line 522, in cli_main
    cfg = convert_namespace_to_omegaconf(args)
  File "/usr/local/lib/python3.7/dist-packages/fairseq/dataclass/utils.py", line 389, in convert_namespace_to_omegaconf
    composed_cfg = compose("config", overrides=overrides, strict=False)
  File "/usr/local/lib/python3.7/dist-packages/hydra/experimental/compose.py", line 37, in compose
    with_log_configuration=False,
  File "/usr/local/lib/python3.7/dist-packages/hydra/_internal/hydra.py", line 512, in compose_config
    from_shell=from_shell,
  File "/usr/local/lib/python3.7/dist-packages/hydra/_internal/config_loader_impl.py", line 156, in load_configuration
    from_shell=from_shell,
  File "/usr/local/lib/python3.7/dist-packages/hydra/_internal/config_loader_impl.py", line 277, in _load_configuration
    ConfigLoaderImpl._apply_overrides_to_config(config_overrides, cfg)
  File "/usr/local/lib/python3.7/dist-packages/hydra/_internal/config_loader_impl.py", line 522, in _apply_overrides_to_config
    ) from ex
hydra.errors.ConfigCompositionException: Error merging override lr_scheduler.total_num_update=null
2022-03-29 10:40:47 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
2022-03-29 10:40:50 | INFO | fairseq_cli.generate | {'_name': None, 'common': {'_name': None, 'no_progress_bar': False, 'log_interval': 100, 'log_format': None, 'log_file': None, 'tensorboard_logdir': None, 'wandb_project': None, 'azureml_logging': False, 'seed': 1, 'cpu': False, 'tpu': False, 'bf16': False, 'memory_efficient_bf16': False, 'fp16': False, 'memory_efficient_fp16': False, 'fp16_no_flatten_grads': False, 'fp16_init_scale': 128, 'fp16_scale_window': None, 'fp16_scale_tolerance': 0.0, 'on_cpu_convert_precision': False, 'min_loss_scale': 0.0001, 'threshold_loss_scale': None, 'amp': False, 'amp_batch_retries': 2, 'amp_init_scale': 128, 'amp_scale_window': None, 'user_dir': None, 'empty_cache_freq': 0, 'all_gather_list_size': 16384, 'model_parallel_size': 1, 'quantization_config_path': None, 'profile': False, 'reset_logging': False, 'suppress_crashes': False, 'use_plasma_view': False, 'plasma_path': '/tmp/plasma'}, 'common_eval': {'_name': None, 'path': '/content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode/checkpoint_best.pt', 'post_process': 'sentencepiece', 'quiet': False, 'model_overrides': '{}', 'results_path': None}, 'distributed_training': {'_name': None, 'distributed_world_size': 1, 'distributed_num_procs': 1, 'distributed_rank': 0, 'distributed_backend': 'nccl', 'distributed_init_method': None, 'distributed_port': -1, 'device_id': 0, 'distributed_no_spawn': False, 'ddp_backend': 'pytorch_ddp', 'ddp_comm_hook': 'none', 'bucket_cap_mb': 25, 'fix_batches_to_gpus': False, 'find_unused_parameters': False, 'gradient_as_bucket_view': False, 'fast_stat_sync': False, 'heartbeat_timeout': -1, 'broadcast_buffers': False, 'slowmo_momentum': None, 'slowmo_base_algorithm': 'localsgd', 'localsgd_frequency': 3, 'nprocs_per_node': 1, 'pipeline_model_parallel': False, 'pipeline_balance': None, 'pipeline_devices': None, 'pipeline_chunks': 0, 'pipeline_encoder_balance': None, 'pipeline_encoder_devices': None, 'pipeline_decoder_balance': None, 'pipeline_decoder_devices': None, 'pipeline_checkpoint': 'never', 'zero_sharding': 'none', 'fp16': False, 'memory_efficient_fp16': False, 'tpu': False, 'no_reshard_after_forward': False, 'fp32_reduce_scatter': False, 'cpu_offload': False, 'use_sharded_state': False, 'not_fsdp_flatten_parameters': False}, 'dataset': {'_name': None, 'num_workers': 1, 'skip_invalid_size_inputs_valid_test': False, 'max_tokens': None, 'batch_size': 4, 'required_batch_size_multiple': 8, 'required_seq_len_multiple': 1, 'dataset_impl': None, 'data_buffer_size': 10, 'train_subset': 'train', 'valid_subset': 'valid', 'combine_valid_subsets': None, 'ignore_unused_valid_subsets': False, 'validate_interval': 1, 'validate_interval_updates': 0, 'validate_after_updates': 0, 'fixed_validation_seed': None, 'disable_validation': False, 'max_tokens_valid': None, 'batch_size_valid': 4, 'max_valid_steps': None, 'curriculum': 0, 'gen_subset': 'test', 'num_shards': 1, 'shard_id': 0, 'grouped_shuffling': False, 'update_epoch_batch_itr': False, 'update_ordered_indices_seed': False}, 'optimization': {'_name': None, 'max_epoch': 0, 'max_update': 0, 'stop_time_hours': 0.0, 'clip_norm': 0.0, 'sentence_avg': False, 'update_freq': [1], 'lr': [0.25], 'stop_min_lr': -1.0, 'use_bmuf': False, 'skip_remainder_batch': False}, 'checkpoint': {'_name': None, 'save_dir': 'checkpoints', 'restore_file': 'checkpoint_last.pt', 'continue_once': None, 'finetune_from_model': None, 'reset_dataloader': False, 'reset_lr_scheduler': False, 'reset_meters': False, 'reset_optimizer': False, 'optimizer_overrides': '{}', 'save_interval': 1, 'save_interval_updates': 0, 'keep_interval_updates': -1, 'keep_interval_updates_pattern': -1, 'keep_last_epochs': -1, 'keep_best_checkpoints': -1, 'no_save': False, 'no_epoch_checkpoints': False, 'no_last_checkpoints': False, 'no_save_optimizer_state': False, 'best_checkpoint_metric': 'loss', 'maximize_best_checkpoint_metric': False, 'patience': -1, 'checkpoint_suffix': '', 'checkpoint_shard_count': 1, 'load_checkpoint_on_all_dp_ranks': False, 'write_checkpoints_asynchronously': False, 'model_parallel_size': 1}, 'bmuf': {'_name': None, 'block_lr': 1.0, 'block_momentum': 0.875, 'global_sync_iter': 50, 'warmup_iterations': 500, 'use_nbm': False, 'average_sync': False, 'distributed_world_size': 1}, 'generation': {'_name': None, 'beam': 10, 'nbest': 1, 'max_len_a': 0.0, 'max_len_b': 200, 'min_len': 1, 'match_source_len': False, 'unnormalized': False, 'no_early_stop': False, 'no_beamable_mm': False, 'lenpen': 1.0, 'unkpen': 0.0, 'replace_unk': None, 'sacrebleu': True, 'score_reference': False, 'prefix_size': 0, 'no_repeat_ngram_size': 0, 'sampling': False, 'sampling_topk': -1, 'sampling_topp': -1.0, 'constraints': None, 'temperature': 1.0, 'diverse_beam_groups': -1, 'diverse_beam_strength': 0.5, 'diversity_rate': -1.0, 'print_alignment': None, 'print_step': False, 'lm_path': None, 'lm_weight': 0.0, 'iter_decode_eos_penalty': 0.0, 'iter_decode_max_iter': 10, 'iter_decode_force_max_iter': False, 'iter_decode_with_beam': 1, 'iter_decode_with_external_reranker': False, 'retain_iter_history': False, 'retain_dropout': False, 'retain_dropout_modules': None, 'decoding_format': None, 'no_seed_provided': False}, 'eval_lm': {'_name': None, 'output_word_probs': False, 'output_word_stats': False, 'context_window': 0, 'softmax_batch': 9223372036854775807}, 'interactive': {'_name': None, 'buffer_size': 0, 'input': '-'}, 'model': {'_name': 'wav2vec2', 'extractor_mode': 'default', 'encoder_layers': 12, 'encoder_embed_dim': 768, 'encoder_ffn_embed_dim': 3072, 'encoder_attention_heads': 12, 'activation_fn': 'gelu', 'layer_type': 'transformer', 'dropout': 0.1, 'attention_dropout': 0.1, 'activation_dropout': 0.0, 'encoder_layerdrop': 0.0, 'dropout_input': 0.0, 'dropout_features': 0.0, 'final_dim': 0, 'layer_norm_first': False, 'conv_feature_layers': '[(512, 10, 5)] + [(512, 3, 2)] * 4 + [(512,2,2)] + [(512,2,2)]', 'conv_bias': False, 'logit_temp': 0.1, 'quantize_targets': False, 'quantize_input': False, 'same_quantizer': False, 'target_glu': False, 'feature_grad_mult': 1.0, 'quantizer_depth': 1, 'quantizer_factor': 3, 'latent_vars': 320, 'latent_groups': 2, 'latent_dim': 0, 'mask_length': 10, 'mask_prob': 0.65, 'mask_selection': 'static', 'mask_other': 0.0, 'no_mask_overlap': False, 'mask_min_space': 1, 'require_same_masks': True, 'mask_dropout': 0.0, 'mask_channel_length': 10, 'mask_channel_prob': 0.0, 'mask_channel_before': False, 'mask_channel_selection': 'static', 'mask_channel_other': 0.0, 'no_mask_channel_overlap': False, 'mask_channel_min_space': 1, 'num_negatives': 100, 'negatives_from_everywhere': False, 'cross_sample_negatives': 0, 'codebook_negatives': 0, 'conv_pos': 128, 'conv_pos_groups': 16, 'pos_conv_depth': 1, 'latent_temp': [2.0, 0.5, 0.999995], 'max_positions': 100000, 'checkpoint_activations': False, 'required_seq_len_multiple': 1, 'crop_seq_to_multiple': 1, 'depthwise_conv_kernel_size': 31, 'attn_type': '', 'pos_enc_type': 'abs', 'fp16': False}, 'task': Namespace(_name='translation_from_pretrained_bart', all_gather_list_size=16384, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, arch='wav2vec2', azureml_logging=False, batch_size=4, batch_size_valid=4, beam=10, best_checkpoint_metric='loss', bf16=False, bpe=None, broadcast_buffers=False, bucket_cap_mb=25, checkpoint_shard_count=1, checkpoint_suffix='', combine_valid_subsets=None, constraints=None, continue_once=None, cpu=False, cpu_offload=False, criterion='cross_entropy', curriculum=0, data='/content/drive/MyDrive/nl2code/projects/plbart/PLBART/data/codeXglue/text-to-code/concode/data-bin', data_buffer_size=10, dataset_impl=None, ddp_backend='pytorch_ddp', ddp_comm_hook='none', decoding_format=None, device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method=None, distributed_no_spawn=False, distributed_num_procs=1, distributed_port=-1, distributed_rank=0, distributed_world_size=1, diverse_beam_groups=-1, diverse_beam_strength=0.5, diversity_rate=-1.0, empty_cache_freq=0, eos=2, eval_bleu=False, eval_bleu_args='{}', eval_bleu_detok='space', eval_bleu_detok_args='{}', eval_bleu_print_samples=False, eval_bleu_remove_bpe=None, eval_tokenized_bleu=False, fast_stat_sync=False, find_unused_parameters=False, finetune_from_model=None, fix_batches_to_gpus=False, fixed_validation_seed=None, force_anneal=None, fp16=False, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, fp32_reduce_scatter=False, gen_subset='test', gradient_as_bucket_view=False, grouped_shuffling=False, heartbeat_timeout=-1, ignore_unused_valid_subsets=False, iter_decode_eos_penalty=0.0, iter_decode_force_max_iter=False, iter_decode_max_iter=10, iter_decode_with_beam=1, iter_decode_with_external_reranker=False, keep_best_checkpoints=-1, keep_interval_updates=-1, keep_interval_updates_pattern=-1, keep_last_epochs=-1, langs='java,python,en_XX', left_pad_source=True, left_pad_target=False, lenpen=1.0, lm_path=None, lm_weight=0.0, load_alignments=False, load_checkpoint_on_all_dp_ranks=False, localsgd_frequency=3, log_file=None, log_format=None, log_interval=100, lr_scheduler='fixed', lr_shrink=0.1, match_source_len=False, max_len_a=0, max_len_b=200, max_source_positions=1024, max_target_positions=1024, max_tokens=None, max_tokens_valid=None, max_valid_steps=None, maximize_best_checkpoint_metric=False, memory_efficient_bf16=False, memory_efficient_fp16=False, min_len=1, min_loss_scale=0.0001, model_overrides='{}', model_parallel_size=1, nbest=1, no_beamable_mm=False, no_early_stop=False, no_epoch_checkpoints=False, no_last_checkpoints=False, no_progress_bar=False, no_repeat_ngram_size=0, no_reshard_after_forward=False, no_save=False, no_save_optimizer_state=False, no_seed_provided=False, not_fsdp_flatten_parameters=False, nprocs_per_node=1, num_batch_buckets=0, num_shards=1, num_workers=1, on_cpu_convert_precision=False, optimizer=None, optimizer_overrides='{}', pad=1, path='/content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode/checkpoint_best.pt', patience=-1, pipeline_balance=None, pipeline_checkpoint='never', pipeline_chunks=0, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_devices=None, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_model_parallel=False, plasma_path='/tmp/plasma', post_process='sentencepiece', prefix_size=0, prepend_bos=False, print_alignment=None, print_step=False, profile=False, quantization_config_path=None, quiet=False, replace_unk=None, required_batch_size_multiple=8, required_seq_len_multiple=1, reset_dataloader=False, reset_logging=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoint_last.pt', results_path=None, retain_dropout=False, retain_dropout_modules=None, retain_iter_history=False, sacrebleu=True, sampling=False, sampling_topk=-1, sampling_topp=-1.0, save_dir='checkpoints', save_interval=1, save_interval_updates=0, score_reference=False, scoring='bleu', seed=1, shard_id=0, skip_invalid_size_inputs_valid_test=False, slowmo_base_algorithm='localsgd', slowmo_momentum=None, source_lang='en_XX', suppress_crashes=False, target_lang='java', task='translation_from_pretrained_bart', temperature=1.0, tensorboard_logdir=None, threshold_loss_scale=None, tokenizer=None, tpu=False, train_subset='train', truncate_source=False, unk=3, unkpen=0, unnormalized=False, update_epoch_batch_itr=False, update_ordered_indices_seed=False, upsample_primary=-1, use_plasma_view=False, use_sharded_state=False, user_dir=None, valid_subset='valid', validate_after_updates=0, validate_interval=1, validate_interval_updates=0, wandb_project=None, warmup_updates=0, write_checkpoints_asynchronously=False, zero_sharding='none'), 'criterion': {'_name': 'cross_entropy', 'sentence_avg': True}, 'optimizer': None, 'lr_scheduler': {'_name': 'fixed', 'force_anneal': None, 'lr_shrink': 0.1, 'warmup_updates': 0, 'lr': [0.25]}, 'scoring': {'_name': 'bleu', 'pad': 1, 'eos': 2, 'unk': 3}, 'bpe': None, 'tokenizer': None, 'ema': {'_name': None, 'store_ema': False, 'ema_decay': 0.9999, 'ema_start_update': 0, 'ema_seed_model': None, 'ema_update_freq': 1, 'ema_fp32': False}}
2022-03-29 10:40:50 | INFO | fairseq.tasks.translation | [en_XX] dictionary: 50001 types
2022-03-29 10:40:50 | INFO | fairseq.tasks.translation | [java] dictionary: 50001 types
2022-03-29 10:40:50 | INFO | fairseq_cli.generate | loading model(s) from /content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode/checkpoint_best.pt
Traceback (most recent call last):
  File "/usr/local/bin/fairseq-generate", line 8, in <module>
    sys.exit(cli_main())
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 413, in cli_main
    main(args)
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 50, in main
    return _main(cfg, sys.stdout)
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 102, in _main
    num_shards=cfg.checkpoint.checkpoint_shard_count,
  File "/usr/local/lib/python3.7/dist-packages/fairseq/checkpoint_utils.py", line 370, in load_model_ensemble
    state,
  File "/usr/local/lib/python3.7/dist-packages/fairseq/checkpoint_utils.py", line 419, in load_model_ensemble_and_task
    raise IOError("Model file not found: {}".format(filename))
OSError: Model file not found: /content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode/checkpoint_best.pt
Traceback (most recent call last):
  File "evaluator.py", line 50, in <module>
    main()
  File "evaluator.py", line 26, in main
    assert len(preds) == len(gts), f"Samples of predictions and answers are not equal, {len(preds)}: {len(gts)}"
AssertionError: Samples of predictions and answers are not equal, 0: 2000
Traceback (most recent call last):
  File "/content/drive/MyDrive/nl2code/projects/plbart/PLBART/evaluation/CodeBLEU/calc_code_bleu.py", line 110, in <module>
    main()
  File "/content/drive/MyDrive/nl2code/projects/plbart/PLBART/evaluation/CodeBLEU/calc_code_bleu.py", line 85, in main
    assert len(hypothesis) == len(pre_references[i])
AssertionError
/content/drive/MyDrive/nl2code/projects/plbart/PLBART

I really don't know what to do this time.

My Environment

pytorch-1.10.0+cu111 google colab

wasiahmad commented 2 years ago

I am sorry, I do not have an idea about this error. The code is tested using the Conda environment (https://github.com/wasiahmad/PLBART/blob/main/install_env.sh) in GPU servers. Please make sure the API versions are consistent with what we have used to avoid any unexpected errors.