NVIDIA / tacotron2

Tacotron 2 - PyTorch implementation with faster-than-realtime inference
BSD 3-Clause "New" or "Revised" License

"Start training!" Issue with #560

Open BorisKalashnikov opened 2 years ago

BorisKalashnikov commented 2 years ago

I followed each step but changed the batch size to 5-6, since I'm only using 9 wav files. I did everything after this until "Start training!", where I got this error:

FP16 Run: False
Dynamic Loss Scaling: True
Distributed Run: False
cuDNN Enabled: True
cuDNN Benchmark: False
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1555  100  1555    0     0   101k      0 --:--:-- --:--:-- --:--:--  101k
Warm starting model from checkpoint 'pretrained_model'

UnpicklingError                           Traceback (most recent call last)
<ipython-input-...> in <module>()
      5 print('cuDNN Benchmark:', hparams.cudnn_benchmark)
      6 train(output_directory, log_directory, checkpoint_path,
----> 7       warm_start, n_gpus, rank, group_name, hparams, log_directory2)

3 frames
/usr/local/lib/python3.7/dist-packages/torch/serialization.py in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
    775                     "functionality.")
    776 
--> 777     magic_number = pickle_module.load(f, **pickle_load_args)
    778     if magic_number != MAGIC_NUMBER:
    779         raise RuntimeError("Invalid magic number; corrupt file?")

UnpicklingError: invalid load key, '<'.

I'm not much of a hacker, man, so if there is a fix, or if I messed with the numbers too much, please tell me how to fix this tomfoolery-ish issue. Also, I'm not using any pretrained models.
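For what it's worth, a pickle "invalid load key, '<'" at this point usually means the file torch.load is reading begins with '<', i.e. the curl step saved an HTML page (for example a Google Drive confirmation or quota page) instead of the binary checkpoint; the 1555-byte download in the log above is consistent with that. A minimal sketch to check what actually got downloaded, assuming the notebook saved the file as 'pretrained_model' in the working directory (that path is an assumption taken from the log message):

import torch

checkpoint_path = 'pretrained_model'  # assumed path, taken from "Warm starting model from checkpoint 'pretrained_model'"

# A real PyTorch checkpoint is a binary pickle/zip archive; an HTML error page starts with '<'.
with open(checkpoint_path, 'rb') as f:
    head = f.read(64)
print(head)

if head.lstrip().startswith(b'<'):
    print("This is HTML, not a checkpoint -- the download failed; re-download the pretrained model.")
else:
    state = torch.load(checkpoint_path, map_location='cpu')  # should unpickle cleanly if the file is valid
    print(type(state))

If the first bytes show something like b'<!DOCTYPE html>', the fix is to re-download the pretrained model (or skip warm starting), not to change the training hyperparameters.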

rakeshkhoodeeram commented 2 years ago

Same here.

FP16 Run: False
Dynamic Loss Scaling: True
Distributed Run: False
cuDNN Enabled: True
cuDNN Benchmark: False
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1555  100  1555    0     0   2796       0 --:--:-- --:--:-- --:--:--  2791
Warm starting model from checkpoint 'pretrained_model'

UnpicklingError                           Traceback (most recent call last)
<ipython-input-...> in <module>()
      5 print('cuDNN Benchmark:', hparams.cudnn_benchmark)
      6 train(output_directory, log_directory, checkpoint_path,
----> 7       warm_start, n_gpus, rank, group_name, hparams, log_directory2)

3 frames
/usr/local/lib/python3.7/dist-packages/torch/serialization.py in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
    918                     "functionality.")
    919 
--> 920     magic_number = pickle_module.load(f, **pickle_load_args)
    921     if magic_number != MAGIC_NUMBER:
    922         raise RuntimeError("Invalid magic number; corrupt file?")

UnpicklingError: invalid load key, '<'.