Closed wahyubram82 closed 4 years ago
I already have the answer...
The problem is in fairseq: for some reason it does not save the arguments in the model, that is, the arguments used to build the pre-trained model.
In my case, I used this command:
python3 /content/repo/fairseq/train.py '/content/drive/My Drive/wav_manifest/' \
--save-dir '/content/drive/My Drive/wav2vec_v2_pre_train_model' \
--fp16 --num-workers 128 --task audio_pretraining --criterion wav2vec --arch wav2vec2 \
--log-keys '["prob_perplexity","code_perplexity","temp"]' --quantize-targets \
--extractor-mode default --conv-feature-layers '[(512, 10, 5)] + [(512, 3, 2)] * 4 + [(512,2,2)] * 2' \
--final-dim 256 --latent-vars 320 --latent-groups 2 --latent-temp '(2,0.5,0.999995)' --infonce \
--optimizer adam --adam-betas '(0.9,0.98)' --adam-eps 1e-06 --lr-scheduler polynomial_decay \
--total-num-update 400000 --lr 0.0005 --warmup-updates 32000 --mask-length 10 \
--mask-prob 0.65 --mask-selection static --mask-other 0 --encoder-layerdrop 0.05 --dropout-input 0.1 \
--dropout-features 0.1 --feature-grad-mult 0.1 --loss-weights '[0.1, 10]' --conv-pos 128 --conv-pos-groups 16 \
--num-negatives 100 --cross-sample-negatives 0 --max-sample-size 1500000 --no-epoch-checkpoints \
--min-sample-size 2000 --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 --max-tokens 1400000 \
--max-update 400000 --skip-invalid-size-inputs-valid-test --ddp-backend no_c10d
The checkpoint should contain these args:
Namespace(activation_dropout=0.0, activation_fn='gelu', adam_betas='(0.9,0.98)', adam_eps=1e-06, all_gather_list_size=16384, arch='wav2vec2', attention_dropout=0.1, batch_size=None, batch_size_valid=None, best_checkpoint_metric='loss', bf16=False, bpe=None, broadcast_buffers=False, bucket_cap_mb=25, checkpoint_shard_count=1, checkpoint_suffix='', clip_norm=25.0, codebook_negatives=0, conv_bias=False, conv_feature_layers='[(512, 10, 5)] + [(512, 3, 2)] * 4 + [(512,2,2)] * 2', conv_pos=128, conv_pos_groups=16, cpu=False, criterion='wav2vec', cross_sample_negatives=0, curriculum=0, data='/content/drive/My Drive/wav_manifest/', data_buffer_size=10, dataset_impl=None, ddp_backend='no_c10d', device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method=None, distributed_no_spawn=False, distributed_port=-1, distributed_rank=0, distributed_world_size=1, distributed_wrapper='DDP', dropout=0.1, dropout_features=0.1, dropout_input=0.1, empty_cache_freq=0, enable_padding=False, encoder_attention_heads=12, encoder_embed_dim=768, encoder_ffn_embed_dim=3072, encoder_layerdrop=0.05, encoder_layers=12, end_learning_rate=0.0, extractor_mode='default', fast_stat_sync=False, feature_grad_mult=0.1, final_dim=256, find_unused_parameters=False, finetune_from_model=None, fix_batches_to_gpus=False, fixed_validation_seed=None, force_anneal=None, fp16=True, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, gen_subset='test', infonce=True, keep_best_checkpoints=-1, keep_interval_updates=-1, keep_last_epochs=-1, labels=None, latent_dim=0, latent_groups=2, latent_temp='(2,0.5,0.999995)', latent_vars=320, layer_norm_first=False, local_rank=0, localsgd_frequency=3, log_format=None, log_interval=100, log_keys='["prob_perplexity","code_perplexity","temp"]', logit_temp=0.1, loss_weights='[0.1, 10]', lr=[0.0005], lr_scheduler='polynomial_decay', mask_channel_length=10, mask_channel_min_space=1, mask_channel_other=0, 
mask_channel_prob=0, mask_channel_selection='static', mask_length=10, mask_min_space=1, mask_other=0.0, mask_prob=0.65, mask_selection='static', max_epoch=0, max_sample_size=1500000, max_tokens=1400000, max_tokens_valid=1400000, max_update=400000, maximize_best_checkpoint_metric=False, memory_efficient_bf16=False, memory_efficient_fp16=False, min_loss_scale=0.0001, min_lr=-1.0, min_sample_size=2000, model_parallel_size=1, negatives_from_everywhere=False, no_epoch_checkpoints=True, no_last_checkpoints=False, no_mask_channel_overlap=False, no_mask_overlap=False, no_progress_bar=False, no_save=False, no_save_optimizer_state=False, no_seed_provided=False, normalize=False, nprocs_per_node=1, num_negatives=100, num_shards=1, num_workers=128, optimizer='adam', optimizer_overrides='{}', patience=-1, pipeline_balance=None, pipeline_checkpoint='never', pipeline_chunks=0, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_devices=None, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_model_parallel=False, power=1.0, profile=False, quantization_config_path=None, quantize_input=False, quantize_targets=True, required_batch_size_multiple=8, required_seq_len_multiple=1, reset_dataloader=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoint_last.pt', same_quantizer=False, sample_rate=16000, save_dir='/content/drive/My Drive/wav2vec_v2_pre_train_model', save_interval=1, save_interval_updates=0, scoring='bleu', seed=1, sentence_avg=False, shard_id=0, skip_invalid_size_inputs_valid_test=True, slowmo_algorithm='LocalSGD', slowmo_momentum=None, stop_time_hours=0, target_glu=False, task='audio_pretraining', tensorboard_logdir=None, threshold_loss_scale=None, tokenizer=None, total_num_update=400000, tpu=False, train_subset='train', update_freq=[1], use_bmuf=False, use_old_adam=False, user_dir=None, valid_subset='valid', validate_after_updates=0, validate_interval=1, validate_interval_updates=0, 
warmup_updates=32000, weight_decay=0.01, zero_sharding='none')
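Before patching anything, it may help to confirm that the checkpoint really lacks its args. The following is a minimal sketch, not fairseq code: the helper names `has_saved_args` and `load_checkpoint_dict` are made up for illustration, and the two dictionaries are stand-ins for what `torch.load` would return, assuming the checkpoint is a plain dict with an `args` key as described above.

```python
from argparse import Namespace


def has_saved_args(state):
    """Return True if the loaded checkpoint dict carries a non-empty 'args' entry."""
    return isinstance(state, dict) and state.get("args") is not None


def load_checkpoint_dict(path):
    """Load a checkpoint onto the CPU (hypothetical helper, needs PyTorch installed)."""
    # torch is imported lazily so has_saved_args can be used without it.
    import torch

    return torch.load(path, map_location="cpu")


# Stand-in dictionaries for illustration only, not real checkpoints.
broken = {"model": {}, "args": None}
fixed = {"model": {}, "args": Namespace(arch="wav2vec2", task="audio_pretraining")}

print(has_saved_args(broken))
print(has_saved_args(fixed))
```

On a real file this would be used as `has_saved_args(load_checkpoint_dict('checkpoint_best.pt'))`.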
To add the args to the checkpoint, do the following.
First, you must save the command you used to build your own pre-trained model, like the one I mentioned above.
Then:
import torch, argparse, logging, os, sys
from fairseq import options
# I built this list by manually splitting the command used to build the pre-trained model
cek = [
'/content/repo/fairseq/train.py',
'/content/drive/My Drive/wav_manifest/',
'--save-dir',
'/content/drive/My Drive/wav2vec_v2_pre_train_model',
'--fp16',
'--num-workers',
'128',
'--task',
'audio_pretraining',
'--criterion',
'wav2vec',
'--arch',
'wav2vec2',
'--log-keys',
'["prob_perplexity","code_perplexity","temp"]',
'--quantize-targets',
'--extractor-mode',
'default',
'--conv-feature-layers',
'[(512, 10, 5)] + [(512, 3, 2)] * 4 + [(512,2,2)] * 2',
'--final-dim',
'256',
'--latent-vars',
'320',
'--latent-groups',
'2',
'--latent-temp',
'(2,0.5,0.999995)',
'--infonce',
'--optimizer',
'adam',
'--adam-betas',
'(0.9,0.98)',
'--adam-eps',
'1e-06',
'--lr-scheduler',
'polynomial_decay',
'--total-num-update',
'400000',
'--lr',
'0.0005',
'--warmup-updates',
'32000',
'--mask-length',
'10',
'--mask-prob',
'0.65',
'--mask-selection',
'static',
'--mask-other',
'0',
'--encoder-layerdrop',
'0.05',
'--dropout-input',
'0.1',
'--dropout-features',
'0.1',
'--feature-grad-mult',
'0.1',
'--loss-weights',
'[0.1, 10]',
'--conv-pos',
'128',
'--conv-pos-groups',
'16',
'--num-negatives',
'100',
'--cross-sample-negatives',
'0',
'--max-sample-size',
'1500000',
'--no-epoch-checkpoints',
'--min-sample-size',
'2000',
'--dropout',
'0.1',
'--attention-dropout',
'0.1',
'--weight-decay',
'0.01',
'--max-tokens',
'1400000',
'--max-update',
'400000',
'--skip-invalid-size-inputs-valid-test',
'--ddp-backend',
'no_c10d'
]
sys.argv = cek
logging.basicConfig(
format="%(asctime)s | %(levelname)s | %(name)s | %(message)s",
datefmt="%Y-%m-%d %H:%M:%S",
level=os.environ.get("LOGLEVEL", "INFO").upper(),
stream=sys.stdout,
)
logger = logging.getLogger("fairseq_cli.train")
parser = options.get_training_parser()
args = options.parse_args_and_arch(parser, modify_parser=None)
model_path = 'checkpoint_best.pt'
mymodel = torch.load(model_path, map_location=torch.device('cpu'))
mymodel['args'] = args
torch.save(mymodel, 'new_fixed_model.pt')
After that, you can load new_fixed_model.pt
for fine-tuning.
Problem solved.
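The manual splitting of the command into a list can probably be automated. Below is a minimal sketch using `shlex.split` from the Python standard library. The helper name `command_to_argv` is made up for illustration, and the command is a shortened stand-in for the full one shown earlier. One assumption worth stating: `shlex` does not drop backslash line continuations the way a shell does, so they are removed first.

```python
import shlex

# Shortened stand-in for the saved training command (raw string keeps the backslashes).
command = r"""python3 /content/repo/fairseq/train.py '/content/drive/My Drive/wav_manifest/' \
--save-dir '/content/drive/My Drive/wav2vec_v2_pre_train_model' \
--fp16 --task audio_pretraining --arch wav2vec2 \
--conv-feature-layers '[(512, 10, 5)] + [(512, 3, 2)] * 4 + [(512,2,2)] * 2' \
--latent-temp '(2,0.5,0.999995)'"""


def command_to_argv(cmd):
    """Turn a saved shell command into a list usable as sys.argv."""
    # Remove backslash-newline continuations before tokenizing.
    joined = cmd.replace("\\\n", " ")
    # shlex.split honours the single quotes, so paths with spaces stay in one token.
    tokens = shlex.split(joined)
    # Drop the interpreter so that the script path ends up in argv[0].
    if tokens and tokens[0].startswith("python"):
        tokens = tokens[1:]
    return tokens


argv = command_to_argv(command)
for token in argv:
    print(repr(token))
```

The resulting list could then be assigned to `sys.argv` in place of the hand-written `cek` list before calling the parser.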
❓ Questions and Help
I have a problem with fine-tuning. The command I use is based on the fine-tuning command in README.md.
The error report:
The folder
/home/bram/Documents/coding/speech/traindata/text_label
contains:
The folder
/home/bram/Documents/coding/speech/traindata/model_finetuning_wav2vec
is empty; it is meant to hold the fine-tuned model produced by the fine-tuning process.
The folder
/home/bram/Documents/coding/speech/traindata/w2v2_pre_traned_model/
contains:
This file is the result of pre-training on my own dataset.
The pre-training command:
Trained in Google Colab...
I already tried debugging by following the process step by step and found where the error happens, but I cannot solve the problem.
The error happens in the file
fairseq/fairseq/checkpoint_utils.py
, line 211, in the load_checkpoint_to_cpu function. I tried to reproduce the steps. Here is the report:
Line 201:
def load_checkpoint_to_cpu(path, arg_overrides=None):
This function is called from /fairseq/fairseq/models/wav2vec/wav2vec2_asr.py
, line 330. Just before that call there is code that builds the arg_overrides
variable. The arg_overrides
variable is now:
The path is
args.w2v_path
, which is '/home/bram/Documents/coding/speech/traindata/w2v2_pre_traned_model/checkpoint_best.pt'
. It comes from the --w2v-path option that we set.
So the function
def load_checkpoint_to_cpu(path, arg_overrides=None):
receives path = '/home/bram/Documents/coding/speech/traindata/w2v2_pre_traned_model/checkpoint_best.pt'
OK, then...
with open(PathManager.get_local_path(path), "rb") as f:
on line 203 of fairseq/fairseq/checkpoint_utils.py
opens the file checkpoint_best.pt
for reading and binds it to the variable f
. The error happens when executing:
state = torch.load(f, map_location=lambda s, l: default_restore_location(s, "cpu"))
on line 204 of fairseq/fairseq/checkpoint_utils.py
. The error report says:
'utf-8' codec can't decode byte 0x80 in position 64: invalid start byte
meaning that torch cannot load or read checkpoint_best.pt
.
Any suggestions? Can somebody help? If I can fix this, I will continue the tutorial.
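For readers unfamiliar with this message, here is a small sketch of what it means in general: some binary bytes were decoded as UTF-8 text. The data below is made up (64 ASCII bytes followed by a 0x80 byte). It does not reproduce the fairseq failure and does not show where inside `torch.load` the decoding takes place.

```python
# Made-up data: 64 valid ASCII bytes, then 0x80, which cannot start a UTF-8 sequence.
data = b"a" * 64 + b"\x80"

try:
    data.decode("utf-8")
    message = None
except UnicodeDecodeError as err:
    # The exception text has the same shape as the error in the report.
    message = str(err)

print(message)
```

The same wording appearing during checkpoint loading suggests that part of the file was handled as text rather than as binary data, though which part is not clear from the report alone.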