UCREL / pymusas

Python Multilingual Ucrel Semantic Analysis System
https://ucrel.github.io/pymusas/
Apache License 2.0

Can I put word and POS together as input to PyMUSAS? #41

Closed: karmalet closed this issue 10 months ago

karmalet commented 10 months ago

Hello, I am not familiar with spaCy or PyMUSAS. When tagging text with PyMUSAS, I followed the code snippet from the documentation: https://ucrel.github.io/pymusas/usage/how_to/tag_text#English

>>> word = 'store'
>>> output_doc = nlp(word)
>>> for token in output_doc:
...     print(f'{token.text}\t{token.lemma_}\t{token.pos_}\t{token._.pymusas_tags}')
...
store	store	VERB	['A9+']

However, I would like to get USAS tagging results for the NOUN 'store', NOT the VERB 'store'. Could I give the input in the form of 'word_pos' like 'store_NOUN'? How can I implement this? Thank you so much.

perayson commented 10 months ago

Hi @karmalet, thanks for the question. As we do for the Indonesian and Welsh pipelines (e.g. see https://ucrel.github.io/pymusas/usage/how_to/tag_text#welsh), you can use an external POS tagger and feed its output into the pymusas pipeline. Another important point is that you're only providing a single word, out of context, to the tagger. It is designed to tag running text, so you could instead provide a sentence in which store is used as a noun and check the results that way.
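If the input really does arrive in the 'word_pos' form from the question (e.g. 'store_NOUN'), one possible first step is to split it into parallel lists of words and POS tags, which is the shape an externally tagged text usually takes before it is handed to a pipeline. The sketch below is only that splitting step, using the Python standard library; the function name `split_word_pos`, the `_` separator, and the example sentence are assumptions for illustration and are not part of PyMUSAS or spaCy.

```python
from typing import List, Tuple


def split_word_pos(text: str, sep: str = "_") -> Tuple[List[str], List[str]]:
    """Split whitespace-separated 'word_POS' items into two parallel lists.

    Hypothetical helper, not a PyMUSAS or spaCy function. The split is made
    on the LAST separator so that words containing the separator
    (e.g. 'New_York_PROPN') keep their full form.
    """
    words: List[str] = []
    pos_tags: List[str] = []
    for item in text.split():
        word, found, pos = item.rpartition(sep)
        # rpartition returns an empty separator when 'sep' is absent
        if not found or not word or not pos:
            raise ValueError(f"Expected 'word{sep}POS', got {item!r}")
        words.append(word)
        pos_tags.append(pos)
    return words, pos_tags


# Assumed example input: a short sentence in which 'store' is a noun
words, pos_tags = split_word_pos("the_DET store_NOUN opens_VERB")
print(words)
print(pos_tags)
```

The resulting lists keep each word aligned with its tag, so they could then be supplied to the tagger as pre-tagged tokens in the way the Welsh how-to linked above describes, rather than letting a spaCy model guess the POS of an isolated word.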