Closed: huacilang closed this issue 2 years ago
Hi there,
The Transformer is created by the timm package. https://github.com/YuanGongND/ast/blob/d338ce48b4861e419ee62c9ecad499cfd548e54b/src/models/ast_models.py#L67
Specifically, the Transformer is in ast_mdl.v.blocks. https://github.com/YuanGongND/ast/blob/d338ce48b4861e419ee62c9ecad499cfd548e54b/src/models/ast_models.py#L176
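For illustration, a minimal sketch of the same structure in plain PyTorch (nn.TransformerEncoderLayer is a stand-in used here only for illustration; the real AST model wraps timm's DeiT, whose blocks expose attention as `.attn`):

```python
import torch.nn as nn

# Stand-in for the stack that ASTModel stores as ast_mdl.v.blocks:
# a list of Transformer encoder blocks, each holding multi-head
# self-attention. (This is a sketch, not the timm implementation.)
blocks = nn.ModuleList(
    nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True)
    for _ in range(12)
)

# Locating the attention sub-modules inside the stack:
attn_layers = [m for m in blocks.modules()
               if isinstance(m, nn.MultiheadAttention)]
print(len(attn_layers))  # 12
```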
-Yuan
Thanks, I need to learn more about timm. Another question: the paper mentions a sigmoid activation, but I still can't find it in the code. Can you help me point it out?
We use torch.nn.BCEWithLogitsLoss, which contains the Sigmoid.
https://github.com/YuanGongND/ast/blob/d338ce48b4861e419ee62c9ecad499cfd548e54b/src/traintest.py#L65
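As a quick check (a minimal sketch, not taken from the AST code), BCEWithLogitsLoss gives the same value as applying torch.sigmoid followed by BCELoss:

```python
import torch

# BCEWithLogitsLoss fuses the Sigmoid with binary cross-entropy in one
# numerically stable op, so no explicit nn.Sigmoid appears in the model.
logits = torch.tensor([0.5, -1.2, 2.0])
targets = torch.tensor([1.0, 0.0, 1.0])

fused = torch.nn.BCEWithLogitsLoss()(logits, targets)
manual = torch.nn.BCELoss()(torch.sigmoid(logits), targets)
print(torch.allclose(fused, manual))  # True
```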
-Yuan
OK, thanks. I have finished training on Speech Commands and the result is higher than the paper said. Can you help me take a look and see whether there is any problem?

Parameters: model_size=base384, epoch=10, lr=2.5e-4, batch-size=128

Results:
---------------evaluate on the validation set---------------
Accuracy: 0.972748
AUC: 0.999579
---------------the evaluation dataloader---------------
now using following mask: 0 freq, 0 time
now using mix-up with rate 0.000000
now process speechcommands
use dataset mean -6.846 and std 5.565 to normalize the input
number of classes is 35
---------------evaluate on the test set---------------
Accuracy: 0.974739
AUC: 0.999688
So, which number should I pay attention to, AUC or accuracy?
You should look at the accuracy, so the number you get is lower than the paper (~0.981). There are many things that could impact the performance, but the most obvious thing I noticed is that your epoch (10) is smaller than what we used in the recipe (30). Could you try to use the exact same hyper-parameters and have another try? Thanks.
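To make the distinction concrete, a small sketch with hypothetical predictions (using scikit-learn, which is not part of the AST recipe): on a multi-class task, one-vs-rest AUC can saturate at 1.0 whenever the ranking is good, even while some samples are misclassified, so accuracy is the more discriminative headline metric.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

# Hypothetical per-class probabilities for 4 samples, 3 classes.
y_true = np.array([0, 1, 2, 1])
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.2, 0.6],
                  [0.4, 0.35, 0.25]])  # misclassified, yet ranked well

acc = accuracy_score(y_true, probs.argmax(axis=1))
auc = roc_auc_score(y_true, probs, multi_class='ovr')
print(acc, auc)  # 0.75 1.0
```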
We have included our training log, so you can easily compare the performance of each epoch with us (the log shows that at the 10th epoch our validation acc is also ~0.974, so it seems your setting is correct but just need more training).
Finally, it is normal to get slightly better/worse numbers than what is reported in the paper because 1) there's some noise in training, so the result differs with random seeds; we report mean/std in the paper, so you might get a number higher or lower than the mean value; 2) I made some minor modifications that could lead to a slightly better number.
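On the seed point, a minimal sketch (generic PyTorch seeding, not the AST recipe's exact setup) of fixing the RNGs so repeated runs are comparable:

```python
import random
import numpy as np
import torch

def set_seed(seed: int) -> None:
    # Fix the common RNGs; some run-to-run noise can remain from
    # nondeterministic CUDA kernels and data-loader worker scheduling.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)

set_seed(0)
a = torch.rand(3)
set_seed(0)
b = torch.rand(3)
print(torch.equal(a, b))  # True
```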
-Yuan
Btw, I would appreciate it if you could open new issues for new questions, so that other people can find them more easily. Thanks!
OK. Thank you a lot for spending so much time answering my questions; things are much clearer now. Thanks again!
You are welcome, please let me know if you can get the accuracy reported in the paper. Thanks!
Hello, I got Accuracy: 0.981281, haha... Thanks again!
Thanks so much for letting me know. Yes, it is exactly the same as what we got in our log...
Hello, I didn't find the Transformer or attention in ASTModel. Can you help me point it out?