How do you parse nouns from narrations?

epic-kitchens / epic-kitchens-55-annotations

🍴 Annotations for the EPIC KITCHENS-55 Dataset.

https://epic-kitchens.github.io/2020

How do you parse nouns from narrations? #26

Closed: avijit9 closed this issue 3 years ago

avijit9 commented 3 years ago

Hi,

I am curious to know the exact method you used to gather nouns from narrations. I have used the following code:


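import spacy

# The 'en' shortcut only works on spaCy 2.x; on spaCy 3+ load 'en_core_web_sm' instead.
nlp = spacy.load('en')
sentence = "place bottle"
doc = nlp(sentence)
# Collect the noun chunks that spaCy finds in the narration.
pos = [tag for tag in doc.noun_chunks]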

However, "place" gets incorrectly classified as a noun. How do you overcome this?

mwray commented 3 years ago

Hi,

In our code we normally check the pos_ attribute of each token in the doc instead of looping through the noun chunks, as we found this performed better with the spaCy models we used. Additionally, a simple way we found to increase the accuracy of part-of-speech tagging was to prepend "I" to every sentence, so "place bottle" becomes "I place bottle":

>>> import spacy
>>> nlp = spacy.load('en_core_web_sm')
>>> sentence = "place bottle"
>>> doc = nlp("I " + sentence)
>>> nouns = [tag for tag in doc if tag.pos_ == 'NOUN']
>>> nouns
[bottle]
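
The same idea can be wrapped in a small helper. This is only a minimal sketch: the names extract_nouns and load_default_pipeline are made up for illustration and are not taken from the EPIC-KITCHENS code. The helper accepts any callable that returns tokens with text and pos_ attributes, which a loaded spaCy pipeline does.

```python
from typing import Callable, Iterable, List


def extract_nouns(narration: str, nlp: Callable[[str], Iterable]) -> List[str]:
    """Return the nouns in a narration, tagging it as "I <narration>".

    `nlp` is any callable mapping a string to an iterable of tokens that
    expose `.text` and `.pos_` (a loaded spaCy pipeline satisfies this).
    """
    # Prepending a subject nudges the tagger to read the first word
    # (e.g. "place") as a verb rather than a noun.
    doc = nlp("I " + narration.strip())
    # Check each token's coarse part-of-speech tag instead of noun_chunks.
    return [token.text for token in doc if token.pos_ == "NOUN"]


def load_default_pipeline():
    """Hypothetical helper: load the small English model (assumes it is installed)."""
    import spacy  # imported lazily so extract_nouns has no hard dependency

    return spacy.load("en_core_web_sm")
```

Calling extract_nouns("place bottle", load_default_pipeline()) should mirror the session above, although the exact tags can differ between spaCy model versions.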

Hope this helps.

avijit9 commented 3 years ago

Ah! I get it now. You did the same thing for your CVPR'19 paper (related to POS-embedding), right?

mwray commented 3 years ago

Yes, exactly.

iranroman commented 2 years ago

Can you please clarify whether this is the same methodology you used with the EPIC-Kitchens-100 dataset?

Or can you elaborate if there are details missing? Thank you!

mwray commented 2 years ago

Yes, we largely used the same process for EPIC-Kitchens-100 as we did for EPIC-Kitchens-55. The only thing we changed was how we parsed compound nouns: we added manual rules to extract these better.
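
The manual rules themselves are not listed in this thread. Purely as an illustrative stand-in, and not the rules used for EPIC-Kitchens-100, one simple way to keep compound nouns together is to merge runs of adjacent NOUN tokens into one phrase. The function name merge_adjacent_nouns is hypothetical; it expects tokens with text and pos_ attributes, such as those in a spaCy doc.

```python
from typing import Iterable, List


def merge_adjacent_nouns(doc: Iterable) -> List[str]:
    """Join consecutive NOUN tokens into single phrases.

    Assumption: a run of adjacent nouns (e.g. "olive oil") is treated
    as one compound noun. Real annotation rules may well differ.
    """
    phrases: List[str] = []
    current: List[str] = []
    for token in doc:
        if token.pos_ == "NOUN":
            # Extend the current run of nouns.
            current.append(token.text)
        elif current:
            # A non-noun token ends the run; emit it as one phrase.
            phrases.append(" ".join(current))
            current = []
    if current:
        # Flush a run that reaches the end of the sentence.
        phrases.append(" ".join(current))
    return phrases
```

With a tagged doc for "I pour olive oil into pan", where "olive", "oil" and "pan" are tagged NOUN, this yields "olive oil" and "pan" rather than three separate nouns.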

Note that the annotations for EPIC-Kitchens-55 have been re-parsed in the EPIC-Kitchens-100 dataset, so there may be slight differences between the two.