Living-with-machines / TargetedSenseDisambiguation

Repository for the work on Targeted Sense Disambiguation
MIT License

reading group 17/11 #39

Open BarbaraMcG opened 3 years ago

BarbaraMcG commented 3 years ago

Next time: Analysing Lexical Semantic Change with Contextualised Word Representations Mario Giulianelli | Marco Del Tredici | Raquel Fernández

BarbaraMcG commented 3 years ago

@kasparvonbeelen @mcollardanuy @GiorgiatolfoBL please add your suggestions!

kasparvonbeelen commented 3 years ago

I will discuss this paper: SenseBERT: Driving Some Sense into BERT, Yoav Levine, Barak Lenz, Or Dagan, Ori Ram, Dan Padnos, Or Sharir, Shai Shalev-Shwartz, Amnon Shashua, Yoav Shoham https://www.aclweb.org/anthology/2020.acl-main.423/

mcollardanuy commented 3 years ago

I will discuss this one: Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora. Hila Gonen, Ganesh Jawahar, Djamé Seddah, Yoav Goldberg https://www.aclweb.org/anthology/2020.acl-main.51.pdf

BarbaraMcG commented 3 years ago

> I will discuss this one: Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora. Hila Gonen, Ganesh Jawahar, Djamé Seddah, Yoav Goldberg https://www.aclweb.org/anthology/2020.acl-main.51.pdf

Nice! I presented this one at a Turing NLP reading group (Giorgia was there), but it's nice to revisit for everyone else

BarbaraMcG commented 3 years ago

@kasparvonbeelen and @mcollardanuy Could you please add a short summary of the papers, maybe following this template? https://github.com/Living-with-machines/HistoricalDictionaryExpansion/issues/35

kasparvonbeelen commented 3 years ago

SenseBERT: Driving Some Sense into BERT, Yoav Levine, Barak Lenz, Or Dagan, Ori Ram, Dan Padnos, Or Sharir, Shai Shalev-Shwartz, Amnon Shashua, Yoav Shoham https://www.aclweb.org/anthology/2020.acl-main.423/

  1. What is this paper about?

The paper demonstrates how self-supervision can infuse BERT with richer semantic representations. It is focussed on the pre-training phase. Besides the traditional masked language modelling, the authors train BERT to predict the WordNet supersense of a masked token. Even though a token may map to different supersenses, the context will fit some supersenses better than others: e.g. the masked token in the sentence "this is [MASK] delicious" will likely belong to the category noun.food. This proxy or soft-labelling technique allows for self-supervision.

The authors show that the SenseBERT representations capture sense information and improve performance on downstream tasks.
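As a rough illustration of the supersense soft-labelling described above (a minimal sketch using NLTK's WordNet interface, not the authors' code), the candidate labels for a word form are simply the supersenses of its synsets; during pre-training, probability mass on any of them counts towards the auxiliary objective:

```python
# Sketch of the weak-labelling idea: map a word form to the WordNet
# supersenses (lexicographer file names) of its synsets. These act as
# the "allowed" soft labels for the masked token.
from collections import Counter

from nltk.corpus import wordnet as wn  # requires a one-off nltk.download("wordnet")


def candidate_supersenses(word: str) -> Counter:
    """Collect the supersenses a word form can map to, with counts."""
    return Counter(synset.lexname() for synset in wn.synsets(word))


if __name__ == "__main__":
    # "bass" is ambiguous, e.g. noun.food, noun.animal, noun.artifact, ...
    print(candidate_supersenses("bass"))
```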

  2. Is it relevant to our project? If so, why and how?

Yes, we could extend/adapt the method to the OED thesaurus and attempt to predict the parents of a sense's semantic class.

We could assess whether this type of pre-training improves the WSD tasks or generates better vector representations.

Even more interestingly, we could improve pre-training by adapting the sentence-level prediction procedure and see whether BERT can predict aspects of a sentence's metadata (e.g. year and author); at the moment BERT only attempts to predict the next sentence.
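To make that last idea concrete, here is a purely hypothetical sketch (names like `DecadeHead` and `num_decades` are ours, not from the paper) of a sentence-level head that predicts the decade of publication from BERT's pooled [CLS] representation, analogous to the next-sentence-prediction head:

```python
# Hypothetical sketch: a "decade of publication" head on top of BERT's
# pooled [CLS] vector. In pre-training this would be an auxiliary loss
# alongside masked language modelling, not a stand-alone classifier.
from torch import nn
from transformers import BertModel, BertTokenizer


class DecadeHead(nn.Module):
    def __init__(self, model_name: str = "bert-base-uncased", num_decades: int = 10):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_decades)

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        # The pooled [CLS] vector stands in for the whole sentence.
        return self.classifier(outputs.pooler_output)


if __name__ == "__main__":
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    batch = tokenizer(["The new engine was exhibited in 1851."],
                      return_tensors="pt", padding=True)
    logits = DecadeHead()(batch["input_ids"], batch["attention_mask"])
    print(logits.shape)  # (1, num_decades)
```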

mcollardanuy commented 3 years ago

Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora. Hila Gonen, Ganesh Jawahar, Djamé Seddah, Yoav Goldberg https://www.aclweb.org/anthology/2020.acl-main.51.pdf

1. What is this paper about?

This paper provides a way to detect words that are used differently in different datasets: it is a simple approach that is more stable and easier to interpret than the traditional alignment-based method for embedding spaces (Hamilton et al. 2016). It uses differences in a word's nearest neighbours across the datasets as a proxy for detecting usage change of that word.
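For reference, a minimal sketch of the neighbourhood-overlap score (assuming two gensim word2vec models trained separately on the two corpora; the paper's frequency filtering and vocabulary restrictions are omitted here):

```python
# Sketch of the neighbourhood-overlap idea from Gonen et al. (2020):
# for each word shared by the two vocabularies, compare its top-k nearest
# neighbours in the two embedding spaces; small overlap suggests usage change.
from gensim.models import Word2Vec


def usage_change_ranking(model_a: Word2Vec, model_b: Word2Vec, k: int = 100):
    shared = set(model_a.wv.key_to_index) & set(model_b.wv.key_to_index)
    scores = {}
    for word in shared:
        nn_a = {w for w, _ in model_a.wv.most_similar(word, topn=k)}
        nn_b = {w for w, _ in model_b.wv.most_similar(word, topn=k)}
        # Lower overlap = stronger evidence of usage change across corpora.
        scores[word] = len(nn_a & nn_b)
    return sorted(scores.items(), key=lambda item: item[1])
```

Words at the top of the returned ranking (smallest neighbourhood overlap) are the candidates for usage change between the two corpora.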

2. Is it relevant to our project? If so, why and how?

Yes, it is relevant to the task of detecting semantic change (not only through time).

3. What could we use from this work in our project?

This can be one of the approaches we use to observe changes in the usage of the machine words, given two different datasets (e.g. a dataset split by time, by type of article, by place of publication, etc.): it has the advantage that it is very interpretable and more stable than the traditional alignment-based approach. It would be interesting if it could be used in combination with the WSD work, or as a next step.

Their GitHub repository seems well documented.

4. Add some text about it to Overleaf

I have added one sentence for now, but I'm not sure yet how this will fit with the rest of the paper.

5. Plan experiments (if relevant)

To discuss later.

BarbaraMcG commented 3 years ago

Thank you, @kasparvonbeelen and @mcollardanuy ! @fedenanni , feel free to have a look if you're curious

fedenanni commented 3 years ago

@BarbaraMcG hey! Thank you all for this, I'll check it now!

fedenanni commented 3 years ago

I really like these 5-point overviews; they are super useful also for those of us who cannot always join the readings! (I'll tag @kasra-hosseini as well in case he's not already in the thread)