YuanGongND / ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
BSD 3-Clause "New" or "Revised" License

some questions when reproducing your results #131

Open ben100118 opened 1 month ago

ben100118 commented 1 month ago

Hello Yuan, I'm delighted to have read your paper and to be reproducing your work, but I have run into some problems. When employing audioset_pretrain, why is the stride the same as the patch_size (overlap == 0)? In addition, when I train AST on SpeechCommands, the accuracy is lower than your result (value 0.1). Thank you for taking the time to read my question; I look forward to your response!
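To make the stride/patch_size relation in the question concrete, here is a minimal sketch of how the number of patches depends on the stride. It does not explain why the pretrained setting uses no overlap; it only shows what overlap == 0 means numerically. The function name `num_patches`, its parameters, and the 128 x 1024 input size (mel bins x frames) are assumptions for illustration and are not taken from the repository's code.

```python
def num_patches(n_mels, n_frames, patch=16, fstride=16, tstride=16):
    """Count patches produced by sliding a square window with no padding.

    Hypothetical helper: names and defaults are illustrative assumptions.
    overlap = patch - stride, so stride == patch means overlap == 0.
    """
    f = (n_mels - patch) // fstride + 1    # patches along the frequency axis
    t = (n_frames - patch) // tstride + 1  # patches along the time axis
    return f, t, f * t


# Assumed input: 128 mel bins x 1024 frames.
no_overlap = num_patches(128, 1024, patch=16, fstride=16, tstride=16)
overlap_6 = num_patches(128, 1024, patch=16, fstride=10, tstride=10)
print(no_overlap)  # stride == patch_size, overlap == 0
print(overlap_6)   # stride 10, overlap == 6
```

With these assumed sizes, a stride of 16 yields far fewer patches than a stride of 10, so the sequence the Transformer has to process is shorter.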

ben100118 commented 1 month ago

Furthermore, what's the value of mixup?

YuanGongND commented 5 days ago

A value of 0.1 must be due to some bug.

Please check the references of this paper: https://arxiv.org/abs/2102.01243 (i.e., search for "mixup" and find the related paper). Mixup is a data augmentation method.
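For readers unfamiliar with the method, a minimal sketch of mixup as a data augmentation step: two training examples and their labels are blended with a random weight drawn from a Beta distribution. The function name `mixup`, the `alpha` parameter, and the toy inputs are assumptions for illustration; this is not the repository's implementation, and the value used in the AST recipes is not stated in this thread.

```python
import random


def mixup(x1, y1, x2, y2, alpha=1.0, rng=random):
    """Blend two examples and their one-hot labels (illustrative sketch).

    alpha is a hypothetical hyperparameter controlling the Beta(alpha, alpha)
    distribution from which the mixing weight lam is drawn.
    """
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Toy usage: two 2-dimensional "features" with one-hot labels.
rng = random.Random(0)  # seeded for reproducibility
x, y, lam = mixup([1.0, 2.0], [1.0, 0.0], [3.0, 4.0], [0.0, 1.0], rng=rng)
print(lam, x, y)
```

The mixed label is a soft target whose entries still sum to 1, so it is typically trained with a loss that accepts soft labels.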