facebookresearch / av_hubert

A self-supervised learning framework for audio-visual speech

Generating adversarial examples for av_hubert #56

Closed ashwath98 closed 2 years ago

ashwath98 commented 2 years ago

Hi, I am interested in generating adversarial examples for AV-HuBERT, as I believe it will be more robust than other methods because of how it was trained.

Because I am unfamiliar with the fairseq framework, it's not completely clear to me where the model gets its input during prediction (as in the Colab notebook). Could you point me to the relevant part of the code base that defines the task?

-Ashwath

chevalierNoir commented 2 years ago

Hi,

The data loading is defined here. Here are the model forward functions during pre-training and fine-tuning.

ashwath98 commented 2 years ago

Hey, thanks. I was able to generate adversarial examples. Surprisingly, an epsilon of only 0.01 led to a completely wrong sentence for the Colab test video (I have attached an image of a sample difference between frames). I am trying out adversarial training and generating some visualizations of what the network is looking at to investigate further. If you have any suggestions for tools I can use to visualize which pixels the network gives importance to, that would be immensely helpful.

thanks -Ashwath

[Attached images: original, adversarial_sample]
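
For readers following along, here is a minimal sketch of how a perturbation bounded by an epsilon of 0.01 could be generated. The thread does not say which attack was used, so this assumes a single-step fast gradient sign method (FGSM). The `fgsm_attack` helper, the toy linear model, the frame shape, and the [0, 1] pixel range are all hypothetical stand-ins, not the AV-HuBERT model or its preprocessing.

```python
import torch
import torch.nn as nn


def fgsm_attack(model, loss_fn, x, target, epsilon=0.01):
    """Return x perturbed by one FGSM step of size epsilon (L-infinity bound)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), target)
    # Gradient of the loss with respect to the input pixels only.
    (grad,) = torch.autograd.grad(loss, x_adv)
    # Step in the direction that increases the loss, then keep pixels in [0, 1]
    # (assumed pixel range; adjust if the inputs are normalized differently).
    return (x_adv + epsilon * grad.sign()).clamp(0.0, 1.0).detach()


# Hypothetical stand-in for the real model: a tiny classifier over 5 frames.
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(5 * 88 * 88, 10)).eval()
frames = torch.rand(1, 5, 88, 88)  # (batch, time, height, width), values in [0, 1]
label = torch.tensor([3])
loss_fn = nn.CrossEntropyLoss()

adv_frames = fgsm_attack(toy_model, loss_fn, frames, label, epsilon=0.01)

# The per-pixel change never exceeds epsilon, which is why the frames look alike.
max_change = (adv_frames - frames).abs().max().item()
clean_loss = loss_fn(toy_model(frames), label).item()
adv_loss = loss_fn(toy_model(adv_frames), label).item()
```

To apply the same idea to a real model, `toy_model` and `loss_fn` would be replaced by the model's own forward pass and training loss, and the clamp range would need to match however the input frames are normalized.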