lijipu1 opened this issue 8 months ago (status: Open)
Hello lijipu1, thanks for reporting this evaluation issue.
Yes, the actual audio transformer model under TF 2 and kapre differs slightly from the model we released there. I used to run a hyperparameter search on the dropout rate over the prompt noise, which is another difference. Because of the version conflict issue in TF 2, the code is now harder to run under Keras.
If you are looking for a number to use in future research, please feel free to refer to this issue for the 93 to 96% accuracy on Ford A with the current codebase. A better V2S result could be attained by using an AST-based PyTorch backend [1]. Sorry for the confusion.
See also this issue on the 94% accuracy: https://github.com/huckiyang/Voice2Series-Reprogramming/issues/2
Does main.py use the V2Sa or V2Su network architecture? I noticed that the network structure in main.py differs from that in the article. Could this difference be the reason for the discrepancy in accuracy?