olvrhhn / audio_super_resolution

TU Darmstadt - Deep Learning: Architectures & Methods Project SS21

RuntimeError: Error(s) in loading state_dict for TFILMUNet #2

Closed bubblegg closed 2 years ago

bubblegg commented 2 years ago

```
python inference.py --save-example --wav-file-list list.txt --scale 4 --sr 22000 --dimension 8192 --stride 16384 --checkpoint 1.pth
Namespace(dataset_type='gtzan', dataset_root='/home/bubble/Documents/audio_super_resolution-master/datasets/gtzan/blues-val.4.22000.8192.16384.h5', full_root='/home/bubble/Documents/audio_super_resolution-master/datasets/gtzan/blues-val.4.22000.8192.16384.h5', dataset_split='val', save_examples=True, wav_file_list='list.txt', scale=4, dimension=8192, stride=16384, sr=22000, checkpoints_root='/home/bubble/Documents/audio_super_resolution-master/checkpoints/runs', checkpoint='1.pth', batch_size=1, num_workers=1, method='base', mode='inf')
Run Inference on example files:
Traceback (most recent call last):
  File "/home/bubble/Documents/audio_super_resolution-master/inference.py", line 286, in <module>
    run_examples(clargs)
  File "/home/bubble/Documents/audio_super_resolution-master/inference.py", line 158, in run_examples
    model.load_state_dict(torch.load(checkpoint, map_location=device), strict=True)
  File "/home/bubble/anaconda3/envs/audiosr/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1223, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for TFILMUNet:
	Missing key(s) in state_dict: "tfilm_d1.lstm.weight_ih_l0_reverse", "tfilm_d1.lstm.weight_hh_l0_reverse", "tfilm_d1.lstm.bias_ih_l0_reverse", "tfilm_d1.lstm.bias_hh_l0_reverse", "tfilm_d2.lstm.weight_ih_l0_reverse", "tfilm_d2.lstm.weight_hh_l0_reverse", "tfilm_d2.lstm.bias_ih_l0_reverse", "tfilm_d2.lstm.bias_hh_l0_reverse", "tfilm_d3.lstm.weight_ih_l0_reverse", "tfilm_d3.lstm.weight_hh_l0_reverse", "tfilm_d3.lstm.bias_ih_l0_reverse", "tfilm_d3.lstm.bias_hh_l0_reverse", "tfilm_d4.lstm.weight_ih_l0_reverse", "tfilm_d4.lstm.weight_hh_l0_reverse", "tfilm_d4.lstm.bias_ih_l0_reverse", "tfilm_d4.lstm.bias_hh_l0_reverse", "tfilm_b.lstm.weight_ih_l0_reverse", "tfilm_b.lstm.weight_hh_l0_reverse", "tfilm_b.lstm.bias_ih_l0_reverse", "tfilm_b.lstm.bias_hh_l0_reverse", "tfilm_u4.lstm.weight_ih_l0_reverse", "tfilm_u4.lstm.weight_hh_l0_reverse", "tfilm_u4.lstm.bias_ih_l0_reverse", "tfilm_u4.lstm.bias_hh_l0_reverse", "tfilm_u3.lstm.weight_ih_l0_reverse", "tfilm_u3.lstm.weight_hh_l0_reverse", "tfilm_u3.lstm.bias_ih_l0_reverse", "tfilm_u3.lstm.bias_hh_l0_reverse", "tfilm_u2.lstm.weight_ih_l0_reverse", "tfilm_u2.lstm.weight_hh_l0_reverse", "tfilm_u2.lstm.bias_ih_l0_reverse", "tfilm_u2.lstm.bias_hh_l0_reverse", "tfilm_u1.lstm.weight_ih_l0_reverse", "tfilm_u1.lstm.weight_hh_l0_reverse", "tfilm_u1.lstm.bias_ih_l0_reverse", "tfilm_u1.lstm.bias_hh_l0_reverse".
```

Flags I used:

```
python inference.py --save-example --wav-file-list list.txt --scale 4 --sr 22000 --dimension 8192 --stride 16384 --checkpoint 1.pth
```
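For context on this class of error: the `*_reverse` parameters belong to the backward pass of a bidirectional `nn.LSTM`. When every missing key ends in `_reverse`, the model was typically built with `bidirectional=True` while the checkpoint was saved from a unidirectional LSTM. A minimal sketch (not the repository's code) reproducing the mismatch:

```python
import torch.nn as nn

# A checkpoint saved from a unidirectional LSTM lacks the *_reverse
# parameters that a bidirectional LSTM registers, so strict loading fails.
uni = nn.LSTM(input_size=8, hidden_size=8, bidirectional=False)
bi = nn.LSTM(input_size=8, hidden_size=8, bidirectional=True)

print(sorted(uni.state_dict().keys()))
# ['bias_hh_l0', 'bias_ih_l0', 'weight_hh_l0', 'weight_ih_l0']
print(sorted(bi.state_dict().keys()))
# additionally contains 'weight_ih_l0_reverse', 'bias_hh_l0_reverse', ...

try:
    bi.load_state_dict(uni.state_dict(), strict=True)
except RuntimeError as e:
    # Missing key(s) in state_dict: "weight_ih_l0_reverse", ...
    print(e)
```

The fix is to make the model's LSTM configuration match the checkpoint (as the maintainer describes below); loading with `strict=False` would silence the error but leave the backward-direction weights randomly initialized.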

Checkpoint used: GTZAN

Dataset prepared for gtzan

Am I doing something wrong?

olvrhhn commented 2 years ago

Please take a look here. When I implemented the model I was experimenting with the LSTMs, but it had no real influence on the final result. To load the weights from some of the checkpoints, you may have to adjust these lines accordingly.

[Screenshot 2022-01-22 at 16 21 43]

bubblegg commented 2 years ago

I'm sorry, I am really new to all of this. Changing these lines in tfilmunet.py didn't make any difference, and I don't understand what I am supposed to do with the checkpoint file?

olvrhhn commented 2 years ago

Hey @bubblegg! Sorry, I had no time to check the code myself over the last few days. I just checked it again: without changing the model, it works with the VCTK speaker checkpoint. Accordingly, the other two checkpoints work after you uncomment the red lines in the image posted above and comment out the green ones. The speaker training takes quite a long time, which is why I did not retrain it. The modification to the LSTMs unfortunately had little effect on the result. I hope this helps, or that you have already solved the issue yourself!
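As a small convenience (a hypothetical helper, not part of this repository), one could inspect a checkpoint's keys before building the model and pick the matching LSTM configuration, instead of toggling commented lines in tfilmunet.py by hand. `checkpoint_is_bidirectional` is an invented name:

```python
def checkpoint_is_bidirectional(state_dict):
    """Heuristic: bidirectional nn.LSTM layers register parameters whose
    names end in '_reverse'; a checkpoint with no such key was saved
    from a unidirectional model."""
    return any(key.endswith("_reverse") for key in state_dict)

# Illustrative key lists modeled on the names in the traceback above:
uni_keys = ["tfilm_d1.lstm.weight_ih_l0", "tfilm_d1.lstm.weight_hh_l0"]
bi_keys = uni_keys + ["tfilm_d1.lstm.weight_ih_l0_reverse"]

print(checkpoint_is_bidirectional(uni_keys))  # False
print(checkpoint_is_bidirectional(bi_keys))   # True
```

In practice one would pass the dict returned by `torch.load(checkpoint, map_location="cpu")` (iterating a state dict yields its keys) and construct the model with the matching `bidirectional` setting.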