FiRepRooFC closed this issue 10 months ago.
It will default to the first occurrence, indeed. But you can circumvent this by providing the character span of the word you want. For instance, if you want the last occurrence of 'one', you could do this:
import re

sentence = 'There are two books. The red one is mine, and the other one is yours.'

# take the span of the last regex match for 'one'
span = list(re.finditer(r'one', sentence))[-1].span()
print(span)
#> (56, 59)

# extract the representation for that specific span:
model.extract_representation([sentence, span], layer=12)
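One caveat (my addition, not something the library requires): a bare r'one' pattern also matches 'one' inside longer words such as 'money'. A small hypothetical helper with word boundaries avoids that:

import re

def occurrence_span(sentence, word, n=-1):
    # \b word boundaries keep the pattern from matching inside longer
    # words (e.g. 'one' inside 'money'); n indexes like a Python list,
    # so n=0 is the first occurrence and n=-1 the last.
    matches = list(re.finditer(rf'\b{re.escape(word)}\b', sentence))
    return matches[n].span()

sentence = 'There are two books. The red one is mine, and the other one is yours.'
print(occurrence_span(sentence, 'one', -1))
#> (56, 59)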
Does this help/make sense?
Thank you so much! That helps a lot. Here is another question: I am doing a psychological study on ambiguous words, and I am wondering which layer of BERT provides lexical representations, excluding higher-level information such as grammar or position. Do you have any suggestions? :)
I asked around a bit and found the following:
In this paper the authors found that the middle layers of BERT-base are best at predicting word similarity, that is, paradigmatic (WordNet-like) relations between words: https://aclanthology.org/2020.conll-1.17/
And this echoes the idea that the middle layers are better at semantics and the later layers at syntax, see e.g. https://arxiv.org/abs/1905.05950
Unsure if these help, but you could perhaps also determine this empirically on your own data. Try multiple layers if you can, e.g. with a layer sweep like the sketch below.
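A minimal sketch of such a sweep, assuming model is the loaded model from above and that extract_representation returns an array-like vector for the target word; the two 'bank' sentences and the cosine helper are my additions for illustration:

import numpy as np

def cosine(u, v):
    # cosine similarity between two 1-D vectors
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

sent_a = 'He deposited the money at the bank.'  # financial sense
sent_b = 'They sat down on the river bank.'     # riverside sense

# Check at which layer the two senses of 'bank' are most clearly
# separated (lower similarity = the layer distinguishes the senses more).
for layer in range(1, 13):  # BERT-base has 12 layers
    rep_a = np.asarray(model.extract_representation([sent_a, 'bank'], layer=layer))
    rep_b = np.asarray(model.extract_representation([sent_b, 'bank'], layer=layer))
    print(f'layer {layer:2d}: cosine similarity = {cosine(rep_a, rep_b):.3f}')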
Hi! I have a question about extracting word representations. If the sentence has two target words, for example, "There are two books. One is mine, the other one is yours.", when using model.extract_representation(['There are two books. The red one is mine, and the other one is yours.', 'one'], layer=12), which representation is extracted? Is it the representation of the first "one"? Many thanks!