YuanGongND / ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
BSD 3-Clause "New" or "Revised" License

some questions when reproducing your results #131

Open ben100118 opened 6 months ago

ben100118 commented 6 months ago

Hello Yuan, I'm delighted to read your paper and reproduce your work, but I have encountered some problems. When employing audioset_pretrain, why is the stride the same as the patch_size (overlap == 0)? In addition, when I train AST on speechCommands, the accuracy is lower than your result (value 0.1). Thank you for taking the time to read my question; I look forward to your response!
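For readers following the stride question, here is a minimal sketch of how stride and patch size relate to patch overlap and patch count. It uses the standard unpadded convolution output-size formula. The function name `num_patches`, the 16x16 patch, and the 128x1024 spectrogram shape are assumptions chosen for illustration, not values taken from this thread.

```python
def num_patches(n_freq, n_time, patch=16, fstride=16, tstride=16):
    """Count patches cut from an (n_freq x n_time) spectrogram.

    Hypothetical helper: uses the usual output-size formula for a
    convolution without padding. Overlap along an axis is patch - stride,
    so stride == patch means the patches do not overlap.
    """
    f_dim = (n_freq - patch) // fstride + 1
    t_dim = (n_time - patch) // tstride + 1
    return f_dim, t_dim, f_dim * t_dim


# Assumed input: 128 frequency bins by 1024 time frames.
no_overlap = num_patches(128, 1024, patch=16, fstride=16, tstride=16)
overlap_6 = num_patches(128, 1024, patch=16, fstride=10, tstride=10)
print(no_overlap, overlap_6)
```

Under these assumptions, a smaller stride yields more patches, hence a longer input sequence for the Transformer.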

ben100118 commented 6 months ago

Furthermore, what is the value of mixup?

YuanGongND commented 5 months ago

A value of 0.1 must be due to some bug.

Please check the references of this paper: https://arxiv.org/abs/2102.01243 (i.e., search for "mixup" and find the related paper). Mixup is a data augmentation method.
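As a rough illustration of mixup in general (not necessarily how this repository implements it), the sketch below blends two examples and their label vectors with a weight drawn from a Beta distribution. The names `mixup_pair` and `sample_lambda` and the default `alpha=1.0` are assumptions made for the example.

```python
import random


def sample_lambda(alpha=1.0, rng=random):
    """Draw a mixing weight in [0, 1] from Beta(alpha, alpha) (assumed setting)."""
    return rng.betavariate(alpha, alpha)


def mixup_pair(x1, y1, x2, y2, lam):
    """Blend two feature vectors and their label vectors with weight lam."""
    mixed_x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    mixed_y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return mixed_x, mixed_y


# Toy usage: two 2-dimensional "inputs" with one-hot labels.
lam = sample_lambda(alpha=1.0)
x, y = mixup_pair([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], lam)
print(lam, x, y)
```

The mixed label is a soft target, so the loss has to accept non one-hot labels.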