WeidiXie / VGG-Speaker-Recognition

Utterance-level Aggregation For Speaker Recognition In The Wild

ValueError: Layer #125 (named "gvlad_center_assignment"), weight <tf.Variable 'gvlad_center_assignment/kernel:0' shape=(7, 1, 512, 12) dtype=float32_ref> has shape (7, 1, 512, 12), but the saved weight has shape (10, 512, 7, 1). #46

Closed: IvanEvan closed this issue 4 years ago

IvanEvan commented 4 years ago

Thanks for sharing this great work! The pretrained model handles identity encoding effectively, but when I switch to my own data to fine-tune it, I get an error.

My command line is:

    python main.py --net resnet34s --batch_size 3 --gpu 0 --lr 0.001 --warmup_ratio 0.1 --optimizer adam --epochs 20 --multiprocess 4 --loss softmax --data_path ''

The error is:

    Traceback (most recent call last):
      File "main.py", line 212, in <module>
        main()
      File "main.py", line 84, in main
        network.load_weights(os.path.join(args.resume), by_name=True)
      File "/usr/local/anaconda3/envs/py3.6/lib/python3.6/site-packages/keras/engine/network.py", line 1163, in load_weights
        reshape=reshape)
      File "/usr/local/anaconda3/envs/py3.6/lib/python3.6/site-packages/keras/engine/saving.py", line 1149, in load_weights_from_hdf5_group_by_name
        str(weight_values[i].shape) + '.')
    ValueError: Layer #125 (named "gvlad_center_assignment"), weight <tf.Variable 'gvlad_center_assignment/kernel:0' shape=(7, 1, 512, 12) dtype=float32_ref> has shape (7, 1, 512, 12), but the saved weight has shape (10, 512, 7, 1).

I tried reshaping the weights, but that didn't work, and I'm out of ideas. How can I fix this?

WeidiXie commented 4 years ago

If you want to load the pretrained model, use --ghost_cluster 2 --vlad_cluster 8

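Otherwise, if you want to increase the number of VLAD clusters, load the weights with skip_mismatch=True so that layers whose shapes differ from the checkpoint are skipped: load_weights(os.path.join(args.resume), by_name=True, skip_mismatch=True)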
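
To see why the two flag values matter, here is a minimal sketch of the shape arithmetic. It assumes (inferred from the error message, not confirmed from the model code) that the last dimension of the gvlad_center_assignment kernel equals vlad_cluster + ghost_cluster. The helper names are hypothetical and are not part of the repository.

```python
# Hypothetical helpers for illustration only; not part of VGG-Speaker-Recognition.
# Assumption: the assignment kernel has shape
# (7, 1, 512, vlad_cluster + ghost_cluster), as the error message suggests.

def assignment_kernel_shape(vlad_cluster, ghost_cluster,
                            kernel_size=(7, 1), in_channels=512):
    """Return the kernel shape the model would build for these settings."""
    return tuple(kernel_size) + (in_channels, vlad_cluster + ghost_cluster)


def matches_checkpoint(vlad_cluster, ghost_cluster, saved_out_channels):
    """True if the built kernel has as many output channels as the saved one."""
    built = assignment_kernel_shape(vlad_cluster, ghost_cluster)
    return built[-1] == saved_out_channels


# The saved weight in the error has 10 output channels.
SAVED_OUT_CHANNELS = 10

# Settings recommended above: 8 VLAD clusters + 2 ghost clusters = 10.
print(assignment_kernel_shape(8, 2))
print(matches_checkpoint(8, 2, SAVED_OUT_CHANNELS))

# Any setting that sums to 12 (10 + 2 is used here purely as an example)
# builds the (7, 1, 512, 12) kernel from the error and cannot be loaded.
print(assignment_kernel_shape(10, 2))
print(matches_checkpoint(10, 2, SAVED_OUT_CHANNELS))
```

With 8 and 2 the cluster counts sum to 10 and line up with the checkpoint. With any other total, the layer has to be skipped via skip_mismatch=True, in which case it keeps its fresh initialization and is learned during fine-tuning.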