Open HuangChiEn opened 1 year ago
I have the same question for audio. Although inverse_mask plays an important role in the paper, "model.modalities.audio.inverse_mask" in "example/data2vec/config/v2/base&large_audio_only_task.yaml" is set to false by default in the official code.
❓ Questions and Help
Before asking:
This issue should be addressed explicitly in the data2vec v2 paper, rather than explained only briefly in a few phrases. As it stands, the documentation (paper) does not give sufficient information.
What is your question?
Why does the inverse mask trick "enable the student model to build semantically rich representations over local regions of the sample"? The masking ratio (MR) and the preserving ratio (PR) are fixed (1 - MR = PR), so whichever way the mask is implemented, the result should be the same, shouldn't it? Then why does the inverse mask trick work?
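To make the question concrete, here is a minimal sketch (my own illustration, not the fairseq implementation) of two masks with exactly the same PR on a 14x14 patch grid. One keeps uniformly scattered patches; the other keeps a single contiguous block, which is how I understand "choosing which patches to preserve in a block-wise fashion". The function names, the grid size, the single square block, and the adjacency measure are all assumptions made for illustration only.

```python
import random


def random_keep_mask(grid, n_keep, rng):
    """Keep n_keep patches scattered uniformly over a grid x grid image (True = preserved)."""
    kept = set(rng.sample(range(grid * grid), n_keep))
    return [[(r * grid + c) in kept for c in range(grid)] for r in range(grid)]


def inverse_block_keep_mask(grid, block, rng):
    """Keep one contiguous block x block square; everything else is masked (True = preserved)."""
    top = rng.randint(0, grid - block)   # randint is inclusive on both ends
    left = rng.randint(0, grid - block)
    return [
        [top <= r < top + block and left <= c < left + block for c in range(grid)]
        for r in range(grid)
    ]


def kept_count(keep):
    """Number of preserved patches, i.e. PR * grid * grid."""
    return sum(sum(row) for row in keep)


def adjacent_kept_pairs(keep):
    """Count horizontally or vertically neighbouring pairs of preserved patches."""
    n = len(keep)
    pairs = 0
    for r in range(n):
        for c in range(n):
            if not keep[r][c]:
                continue
            if c + 1 < n and keep[r][c + 1]:
                pairs += 1
            if r + 1 < n and keep[r + 1][c]:
                pairs += 1
    return pairs


rng = random.Random(0)
grid, block = 14, 6  # hypothetical sizes: 196 patches, 36 preserved
scattered = random_keep_mask(grid, block * block, rng)
blockwise = inverse_block_keep_mask(grid, block, rng)

print("kept patches:", kept_count(scattered), kept_count(blockwise))
print("adjacent kept pairs:", adjacent_kept_pairs(scattered), adjacent_kept_pairs(blockwise))
```

Both masks preserve the same number of patches, so MR and PR are identical, but the spatial layout of what the student sees is different: the block-wise mask always gives a contiguous local region, while the scattered mask rarely does. If this is the intended difference, it would help to have it confirmed, since the real code may sample several blocks and adjust them to hit the target ratio.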
Code
Besides, only the vision config has the inverse mask option, although the other modalities could potentially support it (I guess). For example, the text modality just directly keeps the preserved part. So, we can have a quick review:
What have you tried?
I read the code and the paper.
What's your environment?
Not important.