pdrm83 / sent2vec

How to encode sentences in a high-dimensional vector space, a.k.a., sentence embedding.
MIT License

deprecation issue with add_pipe in spacy #15

Closed jonathan-rowley closed 2 years ago

jonathan-rowley commented 2 years ago

I just installed this library to play around and noticed a deprecation between spaCy v2 and v3: the add_pipe function has changed.

I found the answer here: https://stackoverflow.com/questions/67906945/valueerror-nlp-add-pipe-now-takes-the-string-name-of-the-registered-component-f

I've changed a file locally (splitter.py, lines 8 and 9) from:

import os
import re

import spacy

os.environ['LANGUAGE_MODEL_SPACY'] = "en_core_web_md"
nlp = spacy.load(os.environ['LANGUAGE_MODEL_SPACY'])
sentencizer = nlp.create_pipe("sentencizer")
nlp.add_pipe(sentencizer)

to:

import os
import re

import spacy

os.environ['LANGUAGE_MODEL_SPACY'] = "en_core_web_md"
nlp = spacy.load(os.environ['LANGUAGE_MODEL_SPACY'])
# sentencizer = nlp.create_pipe("sentencizer")  # no longer needed in v3
nlp.add_pipe("sentencizer")  # v3 takes the registered component name as a string

This worked for me since I have spaCy v3, but I'm not sure how to make it work for future users who may have a different version of spaCy installed. I'll leave the decision to you.
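One possible way to support both major versions is to branch on the installed spaCy version. The sketch below is only an illustration: `add_sentencizer` is a hypothetical helper, not something that exists in sent2vec, and it assumes spaCy v2.1 or later, where the component is registered under the name "sentencizer".

```python
def add_sentencizer(nlp, spacy_version):
    """Add a sentencizer to `nlp` under spaCy v2 or v3+.

    Hypothetical helper for illustration; not part of sent2vec.
    `spacy_version` is a version string such as spacy.__version__.
    """
    # Do nothing if the pipeline already has a sentencizer.
    if "sentencizer" in nlp.pipe_names:
        return nlp

    major = int(spacy_version.split(".")[0])
    if major >= 3:
        # v3+: add_pipe takes the registered component name as a string.
        nlp.add_pipe("sentencizer")
    else:
        # v2: add_pipe takes the component object built by create_pipe.
        nlp.add_pipe(nlp.create_pipe("sentencizer"))
    return nlp
```

In splitter.py this could be called as `add_sentencizer(nlp, spacy.__version__)` right after `spacy.load(...)`. Passing the version in as an argument, rather than importing spaCy inside the helper, keeps the branch easy to test with a stand-in object.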

pdrm83 commented 2 years ago

Thanks for your feedback. The issue had already been resolved on GitHub, but the fix had not been published to PyPI. I have just published it there. It would be great if you could confirm whether the issue is resolved.

jonathan-rowley commented 2 years ago

Thanks, I just updated and it works. I did notice, though, that in splitter.py line 14 the variable sentencizer is set but never used, since the deprecated line 16 is commented out. This issue is closed, though.

almarengo commented 2 years ago

Hi @jonathan-rowley,

Thank you for opening issue #15 and for testing and commenting on sent2vec. I am a contributor to this repository, and I was wondering if you have any interest in being a collaborator/tester of sent2vec as well. I'm currently working on improving its architecture and could use some help with the new version, especially with testing.

Thanks!

jonathan-rowley commented 2 years ago

@almarengo, thanks for the offer. I am currently evaluating whether sent2vec can solve some of my problems. I like it so far but still have more to test. If I come across any issues, I will open an issue here with any possible resolutions I can come up with. For now, I guess that puts me in a testing role, and if I find a good way to integrate with sent2vec, I wouldn't mind adding to or improving it down the road.

almarengo commented 2 years ago

For sure, @jonathan-rowley, thanks for getting back to me. Please keep testing it and feel free to reach out if you have any comments or suggestions on how to improve it.