eeyhsong / NICE-EEG

[ICLR 2024] M/EEG-based image decoding with contrastive learning. i. Propose a contrastive learning framework to align image and EEG representations. ii. Resolve brain activity for biological plausibility.
https://arxiv.org/abs/2308.13234
MIT License

EEG-Image feature alignment #13

Open wlc1256630 opened 2 weeks ago

wlc1256630 commented 2 weeks ago

The work you have done in this article is substantial and makes a great contribution to the further development of this field, but I have a question. I saw in the NICE article that the image encoder is the same pre-trained CLIP model. How can we ensure that the image features extracted by the image encoder are close to the image information the human eye actually attends to while the EEG signals are being collected? After all, the EEG signals were collected with the RSVP paradigm, in which the image sequence flashes by very quickly. Or does this issue have no effect on image-EEG feature alignment? I have some doubts in this regard.

eeyhsong commented 2 weeks ago

Hello, @wlc1256630, sorry for the late reply. 1) It's a fascinating question. It's more of a results-driven conclusion that CLIP can be used to obtain image features that are consistent with our visual systems. 2) An SOA of 200 ms is enough for visual processing up to object recognition. You may see https://www.youtube.com/watch?v=JhpvpHlfPlE&t=46s. Actually, I think that's a balance between stimulus duration and data scale. Another issue is that the pre- and post-stimulus periods would introduce interference.
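For readers following the discussion, the alignment itself is a CLIP-style symmetric contrastive objective: paired EEG and image embeddings on the diagonal of a batch similarity matrix are pulled together while mismatched pairs are pushed apart. Below is a minimal NumPy sketch of such a symmetric InfoNCE loss; the function and variable names are illustrative, not taken from the NICE codebase, and the actual implementation uses learned encoders and a trainable temperature.

```python
import numpy as np

def info_nce_loss(eeg_feats, img_feats, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired EEG/image embeddings.

    eeg_feats, img_feats: (batch, dim) arrays; row i of each is a matched pair.
    Names are hypothetical, for illustration only.
    """
    # L2-normalize so dot products become cosine similarities
    eeg = eeg_feats / np.linalg.norm(eeg_feats, axis=1, keepdims=True)
    img = img_feats / np.linalg.norm(img_feats, axis=1, keepdims=True)
    logits = eeg @ img.T / temperature  # (batch, batch); diagonal = matched pairs

    def log_softmax(x, axis):
        x = x - x.max(axis=axis, keepdims=True)
        return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

    # Cross-entropy in both directions: EEG -> image and image -> EEG
    loss_e2i = -np.mean(np.diag(log_softmax(logits, axis=1)))
    loss_i2e = -np.mean(np.diag(log_softmax(logits, axis=0)))
    return (loss_e2i + loss_i2e) / 2
```

Note that this objective only requires the two embeddings to be consistent pair-wise across the batch; it does not require the CLIP features to match human attention exactly, which is why the alignment can still work under rapid RSVP presentation.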