RetroCirce / HTS-Audio-Transformer

The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"
https://arxiv.org/abs/2202.00874
MIT License
341 stars 62 forks

Has this framework's output been compared with other features? #35

Open MisakaMikoto96 opened 1 year ago

MisakaMikoto96 commented 1 year ago

Has this framework's output been compared with other features such as wav2vec or HuBERT?

RetroCirce commented 1 year ago

Hi,

Not really. HTS-AT is our proposed audio transformer, and in this paper we only use it for audio classification and SED tasks. We do use the HTS-AT architecture in other tasks, such as contrastive language-audio pretraining (CLAP). We compare this audio representation with other TF-domain SoTA models. I think wav2vec could be compared as well, although we have not run such experiments.