Closed pgrandinetti closed 6 years ago
Not directly, but you can use the upos/xpos tags and morphological features to extract words of interest, and use keywords_phrases
to find multi-word expressions that match any tag sequence you like.
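As a rough sketch of the keywords_phrases step: the `tokens` and `upos` vectors below are typed in by hand to stand in for the `token` and `upos` columns of a udpipe annotation, and the tag sequence is only one illustrative pattern, not a general definition of signal words.

```r
library(udpipe)

# Hand-made stand-ins for the token and upos columns of an annotated text
# (assumption: in practice these come from udpipe_annotate + as.data.frame).
tokens <- c("on", "the", "other", "hand", ",", "it", "rained", ".")
upos   <- c("ADP", "DET", "ADJ", "NOUN", "PUNCT", "PRON", "VERB", "PUNCT")

# Look for exactly this sequence of POS tags and report the matching tokens.
phrases <- keywords_phrases(upos, term = tokens,
                            pattern = c("ADP", "DET", "ADJ", "NOUN"))
print(phrases)
```

A POS pattern like this will also match phrases that are not signal words, so the result usually needs to be filtered against the list of expressions you actually care about.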
Once you have the multi-word expressions, you can use txt_recode_ngram
to add them to the data.frame.
See the example in the help for txt_recode_ngram: ?txt_recode_ngram
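A minimal sketch of the recoding step, assuming you already have a list of signal phrases: `signal_phrases` here is a hypothetical hand-picked list, and the token vector stands in for the `token` column of an annotated data.frame. The tokens are lowercased in this sketch so that they line up with the lowercase phrases.

```r
library(udpipe)

# Stand-in for the token column of an annotated data.frame (lowercased).
tokens <- c("on", "the", "other", "hand", ",", "it", "rained", ";",
            "as", "a", "result", ",", "we", "stayed", "home", ".")

# Hypothetical list of multi-token signal phrases and their lengths in tokens.
signal_phrases <- c("on the other hand", "as a result")
n_tokens <- lengths(strsplit(signal_phrases, " "))

# The first token of each match becomes the full phrase;
# the remaining tokens of the match are set to NA.
recoded <- txt_recode_ngram(tokens, compound = signal_phrases,
                            ngram = n_tokens, sep = " ")
data.frame(token = tokens, recoded = recoded)
```

The positions where `recoded` equals one of the phrases tell you where each signal expression starts in the text.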
FYI: see the links provided in #31 for documentation of all the upos/xpos tags, morphological features, and dependency relations, which you can use to construct whichever combination of features you like.
Is there a built-in way to locate signal words in udpipe? The problem is that they can consist of multiple tokens; see https://lincs.ed.gov/readingprofiles/PF_Signal_Words.htm