jamesallenevans opened 4 years ago
The authors strive for grammatical purity. They put much effort into “abstracting away from general statistical regularities” (p. 1792) to “combat [the] problem that [...] many existing evaluation data sets contain biases that allow for high performance based on superficial cues” (p. 1791). But are the authors not overcorrecting? The examples of generated sentences provided in table 1 (p. 1795) are grammatically correct, but it is almost impossible to imagine a situation in which regular speakers/authors would use them, e.g., “the doctors that helped the lawyers are being recommended by the student.” Most grammatically possible sentences will never be formed by human beings. The superficial cues may hence not be so superficial after all.
I really like the "lexicalized case frame" the authors employ in creating the event representations of sentences. They remind me of abstract syntax trees in programming, and I think that the authors are right when they say that these are good semantic representations.
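As a rough illustration of that analogy (my own notation, not the paper's actual data format), a lexicalized case frame for one of the generated sentences can be written as a small nested structure in which each predicate points to the nouns filling its semantic roles, much like nodes in an abstract syntax tree:

```python
# Hypothetical sketch of a lexicalized case frame for
# "the doctors that helped the lawyers are being recommended by the student".
# Role labels and structure are illustrative, not the paper's exact format.
case_frames = [
    {"predicate": "recommend", "agent": "student", "patient": "doctors"},
    {"predicate": "help",      "agent": "doctors", "patient": "lawyers"},
]

def fills_role(noun, role, frames):
    """Return True if the noun fills the given semantic role for any predicate."""
    return any(frame.get(role) == noun for frame in frames)

print(fills_role("student", "agent", case_frames))    # True
print(fills_role("doctors", "agent", case_frames))    # True (agent of "help")
print(fills_role("doctors", "patient", case_frames))  # True (patient of "recommend")
```

The paper's classification tasks can then be read off such frames, e.g., "is the student the agent of *recommend* in this sentence?"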
Maybe I'm misunderstanding the point of the article, but why did they prune away the grammatically incorrect sentences (leaving them with the grammatically correct, but as @ckoerner648 noted, weird sentences in Fig. 2)? Are they taking grammar to be the ground-truth algorithm of language?
It would be awesome to have Dr. Ettinger speak to the class (or the MACS workshop 🙂).
Like @ckoerner648 mentioned, there is the danger of overcorrecting in the generation model, but I would argue that the effects of bias are perhaps a greater danger that should be prioritized. Reducing bias in a dataset might justify some unnaturalness from overcorrection, or is it perhaps necessary to strike a balance between the two?
This was a great paper to read, and stands in stark contrast to the OpenAI GPT-2 paper for its focus on interpretability and the extraction of meaning. It's interesting to note that they adapted techniques from cognitive neuroscience and psycholinguistics to interrogate the accuracy and structure of neural network methods.
How do BERT, ELMo and GPT-2 perform on the set of tests proposed in this paper? For me, these tests could provide greater insight into how these models parse meaning, even with sentence structures that are rarely used (like those in Table 2).
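For anyone curious to try this, one plausible way to plug a newer model into the same task setup (my sketch, not something from the paper, which predates BERT) is to pull a fixed-size sentence vector out of a pretrained encoder via the HuggingFace transformers library and hand that vector to the paper's classification tasks:

```python
# Sketch: extract a sentence embedding from BERT to feed into the paper-style probes.
from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentence = "the doctors that helped the lawyers are being recommended by the student"
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Use the [CLS] token vector (or a mean over token vectors) as the sentence embedding.
embedding = outputs.last_hidden_state[:, 0, :]  # shape: (1, 768)
```

The resulting vectors could then be evaluated with the same controlled classification tasks the authors use for the earlier sentence encoders.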
I like how Ettinger et al. probe the possibility of teasing out the meaning of sentences, which seems to be overlooked in models like embeddings. My question is about the application of their approach to texts that social scientists are interested in. What would be an example application, say, in sociological research?
I like the idea of adding the semantic (or syntactic?) structure within a sentence into the model.
Ettinger et al.’s work seems to complement the other works we read this week in that it aims to increase the interpretability of neural network models and to provide explanations of model predictions. Thus, it would be very interesting to apply their tests to neural network models such as LSTMs and to see how these models learn the composition of sentences. They underscore the importance of composition in extracting the meaning of sentences, and they distinguish their work, which focuses on systematic compositional processes, from evaluations that suffer from biases based on general statistical regularities (p. 1791).
This article used a well-controlled experiment to test the ‘semantic’ capability of different sentence embedding models. The results, however, were not promising: the sentence embedding models largely failed to figure out whether a given noun is the ‘agent’ of the sentence or not. I like this study very much for its detailed consideration of confounding variables, but I am still shocked that even the most advanced sentence embeddings could not extract the simple ‘agent’/‘subject’ relationship in a sentence.
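To make concrete why that result is plausible, here is a toy probing sketch (my own example, not the paper's code or data): if a sentence embedding is effectively order-insensitive, active/passive pairs with the same content words become indistinguishable, so a probe for "is the doctor the agent?" is stuck at chance.

```python
# Toy probe: can a classifier recover the agent role from a deliberately naive
# bag-of-words embedding? The BOW encoder is a stand-in for whichever sentence
# encoder one wants to probe.
import numpy as np
from sklearn.linear_model import LogisticRegression

VOCAB = ["the", "doctor", "lawyer", "recommended", "was", "by"]

def bow_embed(sentence):
    # Order-insensitive embedding: counts of vocabulary words.
    tokens = sentence.lower().split()
    return np.array([tokens.count(w) for w in VOCAB], dtype=float)

# Active/passive pairs share the same content words but flip semantic roles.
data = [
    ("the doctor recommended the lawyer", 1),            # doctor = agent
    ("the lawyer recommended the doctor", 0),            # doctor = patient
    ("the lawyer was recommended by the doctor", 1),     # doctor = agent
    ("the doctor was recommended by the lawyer", 0),     # doctor = patient
]
X = np.array([bow_embed(s) for s, _ in data])
y = np.array([label for _, label in data])

probe = LogisticRegression().fit(X, y)
print(probe.score(X, y))  # at chance: the BOW embedding erases role information
```

The controlled experiments in the paper suggest that, once superficial cues are factored out, the learned encoders end up surprisingly close to this situation on the semantic role task.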
I think this paper essentially tries to make us understand the importance of compositional variation in NLP, which reveals a current drawback of most NLP methods: they are unable to capture relationships or build human-like logical connections between words beyond bag-of-words statistics. The authors propose a system of annotated sentences to reduce this inability to some extent, so I would like to know whether we can find other approaches that learn these connections by deduction rather than from annotated information.
Ettinger, A., Elgohary, A., Phillips, C., & Resnik, P. (2018). “Assessing Composition in Sentence Vector Representations.” Proceedings of the 27th International Conference on Computational Linguistics.