csabaiBio / elte_ml_journal_club

Machine learning journal club
https://csabaibio.github.io/elte_ml_journal_club/

2023.06.08. #131

Closed bbeatrix closed 9 months ago

bbeatrix commented 1 year ago

Dear All!

I completely forgot to write earlier, but as summer has arrived, we agreed to hold meetings only when there are volunteer presenters. Let us know under this issue if any of you is willing to present something at some point. If there are no volunteers, that week's journal club is automatically canceled; I won't send separate notifications for that.

Best wishes and a restful summer! Bea

ozkilim commented 11 months ago

Multimodal Neurons in Pretrained Text-Only Transformers

https://arxiv.org/pdf/2308.01544.pdf

"In 1688, William Molyneux posed a philosophical riddle to John Locke that has remained relevant to vision science for centuries: would a blind person, immediately upon gaining sight, visually recognize objects previously known only through another modality, such as touch [24, 30]? A positive answer to the Molyneux Problem would suggest the existence a priori of 'amodal' representations of objects, common across modalities. In 2011, vision neuroscientists first answered this question in human subjects—no, immediate visual recognition is not possible—but crossmodal recognition capabilities are learned rapidly, within days after sight-restoring surgery [15]. More recently, language-only artificial neural networks have shown impressive performance on crossmodal tasks when augmented with additional modalities such as vision, using techniques that leave pretrained transformer weights frozen [40, 7, 25, 28, 18]."

Key finding: image prompts cast into the transformer's embedding space do not encode interpretable semantics; translation between modalities instead happens inside the transformer itself.
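The setup the paper studies can be sketched roughly as follows. This is not the authors' code, and all dimensions and names here are hypothetical: a frozen image encoder produces a feature vector, a single learned linear map projects it into a few "soft tokens" in the frozen language model's embedding space, and the LM consumes those tokens prepended to the text prompt. The paper's point is that these projected tokens are not themselves interpretable word embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: image feature size, LM embedding size,
# and how many soft prompt tokens the image is mapped to.
d_img, d_lm, n_prompt = 512, 768, 4

# Output of a frozen image encoder for one image (mocked with noise).
image_feat = rng.standard_normal(d_img)

# The only trained component: a linear projection producing n_prompt
# vectors that live in the LM's embedding space.
W = rng.standard_normal((d_img, n_prompt * d_lm)) / np.sqrt(d_img)
soft_tokens = (image_feat @ W).reshape(n_prompt, d_lm)

# Embeddings of a 6-token text prompt from the frozen LM's
# embedding table (also mocked).
text_emb = rng.standard_normal((6, d_lm))

# The frozen transformer then runs on [image soft tokens; text tokens];
# any crossmodal "translation" must happen in its internal layers.
inputs = np.concatenate([soft_tokens, text_emb], axis=0)
print(inputs.shape)  # (10, 768)
```

Only `W` would be trained in such a pipeline; the image encoder and the transformer stay frozen, which is what makes the emergence of multimodal neurons inside the text-only transformer notable.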