ehuppert opened 3 years ago
Thank you for coming to our workshop to present such interesting work! James often talks about doing "psychology on machines", which I didn't fully understand until now, but I think this is a nice demonstration of doing psycholinguistics on machines. It was really insightful for me!
I often heard the phrase that "BERT is not so interesting until you actually fine-tune it for your purpose," or something similar. Do you think BERT can actually learn the cases where it fails (incorporating relatively indirect, complex context, especially in the first sentence, and understanding negation) if we train on these specific tasks? (I guess it will be hard to create a dataset to train on.) My intuition is that maybe the majority of our language is plain and simple, without complex context and negation (negation is not complex in itself, but it makes the sentence more indirect, and I think it is much less prevalent than direct sentences), so BERT was not trained to fully take that information into account on a masked training set that did not control for such cases. But maybe BERT has the capacity to exploit that context (I guess a human analogy would be a child who picks these up quickly after some learning) if enough training is given to it. Do you think this will be the case, or do you think BERT might not perform well even if we fine-tune it a bit more?
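For context on the "masked training set" point: BERT's pretraining objective only asks the model to recover randomly hidden tokens, so phenomena like negation may be thinly represented in the training signal. A minimal sketch of BERT-style token masking (the ~15% selection with the 80/10/10 rule described in the original BERT paper; the function name and toy vocabulary here are illustrative, not from any library):

```python
import random

def mask_tokens(tokens, vocab, mask_rate=0.15, seed=0):
    """BERT-style masking: select ~15% of positions; of the selected
    positions, 80% become [MASK], 10% become a random token, and 10%
    stay unchanged. Labels record which originals must be predicted."""
    rng = random.Random(seed)
    out, labels = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            labels[i] = tok  # model must predict the original token here
            r = rng.random()
            if r < 0.8:
                out[i] = "[MASK]"
            elif r < 0.9:
                out[i] = rng.choice(vocab)
            # else: token left unchanged, but still predicted
    return out, labels

corrupted, labels = mask_tokens(
    "the doctor treated the patient".split(), vocab=["cat", "dog", "tree"]
)
```

Because the loss is computed only at these randomly chosen positions, whether the model ever has to "use" a negation depends on the mask happening to fall on a token that negation disambiguates.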
Thanks for sharing. Looking forward to your presentation.
Thank you Dr. Ettinger for presenting your fascinating research!
I really enjoyed reading your work, especially the thematization of specific types of errors we can expect BERT-like models to make. In the particular case of the ROLE-88 test, the swapping of roles certainly poses an interesting challenge. As part of Prof. Gimpel's NLP course at TTIC, I built a sentence paraphrase detector using a BERT-based LSTM. It did pretty well even on cases of role reversal (a transformation such as 'Dr. Ettinger came to the workshop' vs 'the workshop came to Dr. Ettinger' was accurately detected as not being a paraphrase pair).
It seems to me that this type of task requires the model to infer context not just from the two roles, but also from additional anchor words that can be used for context. In some sense, the algorithm needs to be able to tell that the roles are on 'opposite sides' of some relationship indicator (excuse my unfamiliarity with linguistic terminology).
In that context, I was curious about the role of the self-attention mechanism. As attention uses a softmax on weighted word vectors, is it possible that at most one other word is typically given importance as a result of the attention weighting? If that were the case, it would seem that BERT might lack the ability to 'triangulate' relationships that require 'attention' on triplet-wise, instead of pairwise relationships.
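On the softmax point above: softmax produces a graded distribution rather than a hard argmax, so attention can in principle spread substantial weight over several words at once (though individual heads can concentrate sharply when scores are peaked). A toy numpy sketch of scaled dot-product attention weights (variable names are illustrative):

```python
import numpy as np

def attention_weights(q, K):
    """Scaled dot-product attention weights for one query vector q
    against a matrix of key vectors K (one key per row)."""
    scores = K @ q / np.sqrt(len(q))
    e = np.exp(scores - scores.max())  # numerically stable softmax
    return e / e.sum()

# Two keys equally similar to the query: both receive equal,
# substantial weight -- softmax does not force a single winner.
q = np.array([1.0, 0.0])
K = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
w = attention_weights(q, K)
```

So the pairwise structure of attention scores, rather than the softmax itself, seems the more likely obstacle to the 'triplet-wise' relationships the question raises.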
Hopefully this wasn't too contrived of a question. Looking forward to your talk!
I look forward to your presentation, Dr. Ettinger, thank you for your time!
Thank you Dr. Ettinger for presenting at our workshop. While I'm not a linguistics researcher myself, my ex-girlfriend worked in your lab and talked about your research often and in high esteem.
This is somewhat off topic but it has to do tangentially with BERT. The GLUE and SuperGLUE frameworks have been highly cited as measurements of language model effectiveness. At the same time, the media and the culture (via the internet, which I know isn't a perfect lens into the culture) have been obsessed with attention models, especially the GPT family, often holding them in the highest regard among language models. Yet the highest-rated language model on the SuperGLUE test is a BERT-derived model. (DeBERTa from Microsoft Research holds the highest validated score, and T5 from Google holds the highest score overall.) GPT-family models don't even meet the human benchmark. I'm curious about your thoughts on the so-called attention revolution (which, even though I work in vision, has leaked into my work as well; I've seen challengers to my visual memory models that use attention in what I believe to be epistemically problematic ways).
Thanks again for coming to our workshop!
Thanks in advance for sharing! My question is: do NLP models have the same excellent performance in other languages, like Spanish or Chinese?
Thank you for this interesting research! I found your conclusion about the difference between the goals and methods of language processing between humans and models very interesting. You characterize this distinction as one between optimizing prediction (models) and discovering meaning/truth (humans) - I would be curious to hear how you think this will impact artificial intelligence or other efforts to mimic human behavior (i.e. pass a Turing test)?
Thanks for sharing! When I do stuff like N-gram splitting on texts, it does not always make sense, so I'm very curious about how linguists look at NLP. Looking forward to this talk!
Dear Dr. Ettinger, thank you very much for sharing your work with us! It's really great to see your unique perspective as a linguistics expert in NLP! Sometimes I feel like some NLP papers take similar approaches to exploring the features of different corpora, including cleaning texts, counting word frequencies, word clouds, TF-IDF, (dynamic) topic modeling, part-of-speech tagging, conditional probability, and so on. And these methods can work on different data, such as Twitter, Reddit, Instagram, etc.
But I feel it's kind of hard to come up with a creative method to answer a research question beyond those exploratory methods, and the article could end up as just a listing of facts. For example, I personally use a corpus from the Personal Finance subreddit to explore people's top personal finance concerns with topic modeling. But how can we make the research more creative instead of just following tradition? Your article is definitely a creative one, so I would like to get some of your advice. Thank you so much!
Thank you for coming to our workshop to present such interesting work! I found your conclusion about the difference between the goals and methods of language processing between humans and models very interesting. It'll be very interesting to hear a linguistics scholar talking about NLP, and I wish to learn from you tips and cautions about using NLP. Looking forward to this talk!
Thank you very much for sharing your work with us! I am looking forward to your presentation about using NLP in linguistic study!
Thanks a lot for sharing. I am interested in the diagnostics of NLP language models from a psychological perspective. Looking forward to your talk!
Thanks for sharing. I am also curious whether what you found could be applied to other languages? Thanks!
Thanks for sharing your work! This is a very interesting perspective. Do you think this 'behavioural' approach to studying complex models will become more popular as the complexity of these models increases? I am curious whether this line of research will be a temporary substitute for analytical studies due to constraints in analytical tools, or will be a stream of fundamentally important science. Precise prediction of the behaviour of even the simplest organism can be difficult due to nonlinear transformations and measurement errors. I suspect this might also be the case for artificial computational models as hardware capacities increase, and such a 'behavioural' perspective will be necessary/inevitable. Looking forward to your thoughts!
Thanks for sharing this wonderful work! Looking forward to listening to your presentation tomorrow!
Thank you for your research! It is interesting to learn about language models and how we can generate predictions in context. I am excited to learn more about your work within the field of linguistics, and I look forward to your presentation!
Thank you for sharing this presentation! I am very interested in how NLP predicts and how linguistic patterns are analyzed. Could you talk more about how different languages could be predicted by the NLP tools?
Thanks for sharing. Looking forward to your talk tomorrow!
Thank you for sharing!
It is amazing to see how linguistics can be embedded in research about NLP. In your research, you apply diagnostics drawn from human language experiments to the popular BERT model, suggesting that linguistic experiments based on human behavior can add to computer science models. Thus, I have a broad question about how to use linguistics to help the development of NLP (such as contributing to the development of NLP-related computer science models).
Looking forward to your presentation!
Thanks for coming! I wonder how the model can be applied to other fields.
Thank you for coming! I wonder how the model can be applied to economics.
Thank you for sharing your work with us! It is fascinating that BERT is not yet equipped to capture the meaning of words under negation. Is this difficulty also observed when a negative word is used (e.g., 'impossible'), rather than a negating one ('not possible')? On the same note, how does the model handle double negatives ('not impossible')?
Thank you for sharing your work! I am very interested in how you tackle NLP problems where robots model human behaviors to process the natural language people use. I am looking forward to hearing your presentation! Thank you
Thank you for sharing! I look forward to your presentation!
Thank you for sharing. Looking forward to your presentation tomorrow!
Thanks very much for presenting this research, I'm looking forward to hearing more about it tomorrow! I'd be curious to hear if/how you think about connecting your research of whatever language/sensemaking skills these NLP models have with both natural development of language and meaning (say, in early childhood) and with language/sensemaking as a product of cultural evolution.
Thank you for your presentation. It is impressive to see the deep analysis and comparison between a machine's understanding and its predictive power. I am wondering what feasible ways you would suggest to improve the real understanding of a machine or an algorithm? Thank you.
Thank you for presenting. You mentioned there was an exception regarding grammatical continuation; how do you interpret this result, and why might this happen? Thanks!
Thanks for sharing your research!
Thanks very much for your presentation! I also have some experience using NLP algorithms and have found some drawbacks that really need to be improved. Do you think your research on improving machines' understanding can help solve these problems?
Thanks for sharing your work!
Thanks for sharing Professor Ettinger!
I'm excited for your presentation! Do you think such NLP processes might have some use for conlangs, or would the required data size not be there?
Thank you very much for sharing your work with us. The paper clearly identifies the strengths and limitations of BERT model, which may indicate that the application of BERT should undergo serious context examination. I was wondering if you have any general suggestions regarding the application of LMs and BERT in order to avoid their limitations? Thanks.
Looking forward to hearing your talk tomorrow! Learning about the pros and cons of BERT models / their applications will be very helpful for my current projects, which make use of this method
Thank you so much for presenting at our workshop, Professor Ettinger! Your work on BERT probing is amazing. I just have one thing I would like your opinion on: can BERT help with the recovery of historical texts? Especially those in much more ancient languages? I'm pretty sure some people have been attempting this, but I would love to hear your thoughts on the feasibility and potential obstacles of this research route.
Thanks for the research! I'm also curious about the use of NLP in other languages. Thanks!
Dear Professor Ettinger, it's really exciting to have you here with us! As a psychology student, I am also interested in how psycholinguistics can be applied in the NLP area. I was wondering how differences between languages might influence the NLP modeling process? Do you think it is necessary to conduct similar experiments in different languages to test the conclusion? I guess Chinese can be very different from English, since the grammar, structure, and use of words can differ greatly.
Thank you so much for sharing your work! I am wondering how you think your NLP models can be applied in other fields?
Looking forward to your presentation!
I look forward to your presentation, Dr. Ettinger, thank you for your time!
Thank you for presenting at our workshop, Dr. Ettinger! I think I'll perpetually regret not being able to take one of your classes in my time here.
I had one question in addition to @SoyBison's valuable one: What do you make of the recent MLP-based papers which seem to outperform attention-based models? Is attention yet another fad (from the standpoint of a naive bystander like me) that is soon going to be on its way out?
Hi Dr. Ettinger, thanks for sharing your interesting research! Although I am not familiar with the BERT model, I thought such NLP models pre-trained on plain text corpora alone would be more accurate and adaptive than models learning from pre-determined or well-classified dictionaries. From your study I learned that the BERT model can generally distinguish good from bad completions involving shared category or role reversal; although less sensitive than humans, it is robust on within-category distinctions and role reversals. Based on this, I was wondering if you have any ideas about how to incorporate more human-like decisions in the BERT model or other NLP models, so as to improve the understanding of deeper structure in texts, like negation. Thanks for your presentation!
Thank you very much for your research! I really look forward to your presentation!
Thank you so much for sharing your work Dr Ettinger! Looking forward to the presentation.
Thank you for sharing your excellent work with us! It really helps us understand the mechanism behind BERT. I look forward to your presentation.
Thank you for sharing this interesting research with us. Understanding what LMs know about language by introducing psycholinguistic tests is very inspiring! Looking forward to your presentation!
Thanks for the presentation, Professor Ettinger! My question is about pre-trained language models. I wonder from your perspective, how much the performance of these language models is associated with the size of training corpus?
Thank you for sharing such an interesting topic with us! I realized that NLP has become another trend in much social science research, and I was wondering how we can get started with NLP as novices. Thank you!
Thank you for sharing your work with us. Looking forward to your presentation tomorrow!
Thank you for your presentation! My question is what do you think of the future improvements of NLP models? What can we do to help them capture more compositional meanings? Thanks again for your amazing paper!
Comment below with questions or thoughts about the reading for this week's workshop.
Please make your comments by Wednesday 11:59 PM, and upvote at least five of your peers' comments on Thursday prior to the workshop. You need to use 'thumbs-up' for your reactions to count towards 'top comments,' but you can use other emojis on top of the thumbs up.