SiavashShams / ssamba

[SLT'24] The official implementation of SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
BSD 3-Clause "New" or "Revised" License

masked patch #2

Closed by leyuan-sun 5 months ago

leyuan-sun commented 5 months ago

What is the purpose of the masked patches?

SiavashShams commented 5 months ago

During pretraining, some patches of the input spectrogram are masked and the model is trained to predict them from the surrounding unmasked patches. This pushes the model to learn robust audio representations, similar to how BERT's masked language modeling works for text.
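
To make the idea concrete, here is a minimal NumPy sketch of masked-patch pretraining on a spectrogram. It is not the code from this repository: the function names (`patchify`, `mask_patches`, `masked_mse`), the 16x16 patch size, the 128x1024 input shape, the 50% mask ratio, and the use of zeros as the mask value are all assumptions chosen for illustration. The "prediction" is just the corrupted input, standing in for whatever a real encoder would output.

```python
import numpy as np

def patchify(spec, patch_h, patch_w):
    """Split a (freq, time) spectrogram into flattened, non-overlapping patches."""
    n_f, n_t = spec.shape[0] // patch_h, spec.shape[1] // patch_w
    spec = spec[: n_f * patch_h, : n_t * patch_w]  # drop any ragged border
    patches = spec.reshape(n_f, patch_h, n_t, patch_w).transpose(0, 2, 1, 3)
    return patches.reshape(n_f * n_t, patch_h * patch_w)

def mask_patches(patches, mask_ratio, rng):
    """Zero out a random subset of patches; return the corrupted copy and a boolean mask."""
    n = patches.shape[0]
    n_masked = int(round(n * mask_ratio))
    masked_idx = rng.choice(n, size=n_masked, replace=False)
    mask = np.zeros(n, dtype=bool)
    mask[masked_idx] = True
    corrupted = patches.copy()
    corrupted[mask] = 0.0  # assumption: zeros stand in for a mask token
    return corrupted, mask

def masked_mse(pred, target, mask):
    """Reconstruction error computed on the masked patches only."""
    return float(np.mean((pred[mask] - target[mask]) ** 2))

rng = np.random.default_rng(0)
spec = rng.standard_normal((128, 1024)).astype(np.float32)  # hypothetical spectrogram
patches = patchify(spec, 16, 16)
corrupted, mask = mask_patches(patches, 0.5, rng)

# Placeholder "model": it returns the corrupted input unchanged.
# A real encoder would be trained to drive this loss down.
loss = masked_mse(corrupted, patches, mask)
```

Because the loss only counts the masked positions, the model cannot lower it by copying its input. It has to infer the hidden content from the visible context, which is what drives the representation learning.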