quanteda / spacyr

R wrapper to spaCy NLP
http://spacyr.quanteda.io

Matching a pattern for dependencies #204

Open mitramir55 opened 3 years ago

mitramir55 commented 3 years ago

Hi everyone. Recently, I built a model that detects passive voice in the records of a dataset using spaCy's dependency parsing, its rule-based Matcher, and nlp.pipe(). Here's the code for the project for more clarity.

Now, my question is: is there any way to use spaCy's Matcher for this task in R through spacyr? I couldn't find anything specific, but I think there may be a way to construct something with similar functionality myself.
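To illustrate what "something with similar functionality" could look like, here is a minimal sketch of the idea. It is plain Python rather than R so that it runs without spaCy installed, and it is not the Matcher API. It assumes token rows shaped like the data frame returned by spacy_parse(dependency = TRUE) (columns doc_id, sentence_id, token_id, token, head_token_id, dep_rel). The function name find_passive and the sample rows are hypothetical, and the dependency labels are hand-written in the style of spaCy's English models; other models may use different labels.

```python
from collections import defaultdict

COLUMNS = ("doc_id", "sentence_id", "token_id", "token", "head_token_id", "dep_rel")

# Hypothetical, hand-labelled rows for "The ball was kicked by John."
# and "John kicked the ball." (not real parser output).
RAW = [
    ("text1", 1, 1, "The", 2, "det"),
    ("text1", 1, 2, "ball", 4, "nsubjpass"),
    ("text1", 1, 3, "was", 4, "auxpass"),
    ("text1", 1, 4, "kicked", 4, "ROOT"),
    ("text1", 1, 5, "by", 4, "agent"),
    ("text1", 1, 6, "John", 5, "pobj"),
    ("text2", 1, 1, "John", 2, "nsubj"),
    ("text2", 1, 2, "kicked", 2, "ROOT"),
    ("text2", 1, 3, "the", 4, "det"),
    ("text2", 1, 4, "ball", 2, "dobj"),
]
ROWS = [dict(zip(COLUMNS, values)) for values in RAW]


def find_passive(rows, required=("nsubjpass", "auxpass")):
    """Return (doc_id, sentence_id, head_token) for every head token
    whose dependents carry all of the `required` dependency labels."""
    tokens = {}                          # (doc, sentence, token_id) -> token text
    labels_by_head = defaultdict(set)    # (doc, sentence, head_id) -> labels seen
    for row in rows:
        sentence = (row["doc_id"], row["sentence_id"])
        tokens[sentence + (row["token_id"],)] = row["token"]
        labels_by_head[sentence + (row["head_token_id"],)].add(row["dep_rel"])

    hits = []
    for key, labels in labels_by_head.items():
        if set(required) <= labels:
            # key[0] is the doc_id, key[1] the sentence_id
            hits.append((key[0], key[1], tokens.get(key)))
    return sorted(hits)


print(find_passive(ROWS))
```

The same grouping (collect dep_rel values per doc_id, sentence_id, and head_token_id, then test for the required labels) could be expressed in R directly on the spacy_parse() data frame.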

I could also make use of the n_process argument of nlp.pipe(), as the dataset I'm working with is quite large. So if anyone has any ideas about this too, I'd love to hear them!

Thanks in advance!

kbenoit commented 3 years ago

Thanks for sharing that. We could pass through arguments to Matcher(). @amatsuo what do you think?

Also, as a separate issue, using n_process sounds like the answer to a bottleneck we've noticed for some time too: large corpora tend to be hard to process all at once.

mitramir55 commented 2 years ago

Hi @kbenoit and @amatsuo, any updates on these functions? I'm willing to contribute to expanding and building the tools if you need any help.

kbenoit commented 2 years ago

Just picking this up - sorry it took so long - but @mitramir55 if you wanted to contribute any code for this, that would be great. spacyr needs some attention generally in light of updates to spaCy and new developments in the reticulate package.

kbenoit commented 2 years ago

We will keep this open and tag it as Wishlist.