huckiyang / Voice2Series-Reprogramming

ICML 21 - Voice2Series: Adversarial Reprogramming Acoustic Models for Time Series Classification
Apache License 2.0

The article reported a prediction accuracy of 100% for the Ford-A dataset, while the V2S_main.py script yielded only 93%. #4

Open lijipu1 opened 8 months ago

lijipu1 commented 8 months ago

Does main.py use the V2Sa or V2Su network architecture? I noticed that the network structure in main.py differs from that in the article. Could this difference be the reason for the discrepancy in accuracy?

huckiyang commented 5 months ago

Hello lijipu1, thanks for reporting this evaluation issue.

Yes, the audio transformer model released here, built on TF 2 and kapre, differs slightly from the model used in the article. Another difference: for the paper, I ran a hyperparameter search on the dropout rate applied over the prompt noise. Due to version conflicts in TF 2, that setup is now harder to run under Keras.
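For readers trying to follow the dropout-over-prompt point, here is a minimal sketch of the idea, not the released V2S_main.py code: a trainable prompt (the "noise") is added to the padded input series before it reaches the frozen acoustic model, and dropout is applied over that prompt during training. The layer name, `seq_len`, and the dropout rate below are illustrative assumptions, with the rate being the hyperparameter that was swept.

```python
import tensorflow as tf

class ReprogramPrompt(tf.keras.layers.Layer):
    """Illustrative V2S-style reprogramming layer (assumed names/values)."""

    def __init__(self, seq_len, dropout_rate=0.4):
        super().__init__()
        # Trainable universal perturbation, same length as the padded series.
        self.delta = self.add_weight(
            name="prompt",
            shape=(seq_len,),
            initializer="zeros",
            trainable=True,
        )
        # Dropout applied over the prompt noise; the rate was tuned by search.
        self.drop = tf.keras.layers.Dropout(dropout_rate)

    def call(self, x, training=False):
        # x: (batch, seq_len) zero-padded time series.
        # Add the (dropout-masked) prompt; broadcasting handles the batch dim.
        return x + self.drop(self.delta, training=training)
```

In the full pipeline, the reprogrammed series is then fed to the frozen pretrained acoustic model and its source labels are mapped to the target classes; only the prompt is trained.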

If you need a number for future research, please feel free to cite this issue: the current codebase reaches 93% to 96% accuracy on Ford-A. A better V2S result could be attained with an AST-based PyTorch backend [1]. Sorry for the confusion.

  1. See a later version of the speech reprogramming / waveform prompting code here: https://github.com/biboamy/music-repro

See also issue #2, which reports 94% accuracy: https://github.com/huckiyang/Voice2Series-Reprogramming/issues/2