Closed — avijit9 closed this issue 3 years ago
Hi,
In our code we normally check the `pos_` attribute of each token in the doc, rather than looping through the noun chunks, as we found this performed better with the spaCy models we used. Additionally, a simple way we found to increase the accuracy of part-of-speech tagging was to prepend "I" to every sentence, so that "place bottle" becomes "I place bottle":
```python
>>> import spacy
>>> nlp = spacy.load('en_core_web_sm')
>>> sentence = "place bottle"
>>> doc = nlp("I " + sentence)
>>> nouns = [tag for tag in doc if tag.pos_ == 'NOUN']
>>> nouns
[bottle]
```
Hope this helps
Ah! I get it now. You did the same thing for your CVPR'19 paper (related to POS-embedding), right?
Yes, exactly.
Can you please clarify whether this is the same methodology you used with the EPIC-Kitchens-100 dataset?
Or can you elaborate if there are details missing? Thank you!
Yes, we largely used the same process for EPIC-Kitchens-100 as we did for EPIC-Kitchens-55. The only thing we changed was how we parse compound nouns, for which we added manual rules to extract them more reliably.
Note that the annotations for EPIC-Kitchens-55 have been re-parsed in the EPIC-Kitchens-100 dataset so there may be slight differences between the two.
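The manual compound-noun rules aren't spelled out in this thread. As a rough illustration only (not the authors' actual rules), one such rule could merge spaCy `compound` dependents into their head noun, e.g. "olive" + "oil" → "olive oil". The sketch below operates on pre-tagged `(text, pos, dep, head_index)` tuples mimicking spaCy token attributes, so it runs without loading a model:

```python
# Toy sketch of a manual compound-noun rule (an assumption, not the
# authors' implementation). Each token is (text, pos, dep, head_index),
# mirroring spaCy's token.pos_, token.dep_, and token.head.i.

def merge_compound_nouns(tokens):
    merged = []
    for i, (text, pos, dep, head) in enumerate(tokens):
        # Keep only head nouns; their compound modifiers get folded in.
        if pos != "NOUN" or dep == "compound":
            continue
        # Collect compound modifiers whose head is this noun.
        parts = [t for t, p, d, h in tokens if d == "compound" and h == i]
        merged.append(" ".join(parts + [text]))
    return merged

# Parse of "I pour olive oil" (tags shown here are illustrative).
tokens = [
    ("I",     "PRON", "nsubj",    1),
    ("pour",  "VERB", "ROOT",     1),
    ("olive", "NOUN", "compound", 3),
    ("oil",   "NOUN", "dobj",     1),
]

print(merge_compound_nouns(tokens))  # → ['olive oil']
```

In a live pipeline the same idea could use `token.children` with `child.dep_ == "compound"` on each extracted noun, but the exact rule set used for the dataset may differ.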
Hi,
I am curious to know the exact method you used to gather nouns from narrations. I have used the following code:
However, `place` gets incorrectly classified as a `NOUN`. How do you overcome this?