Open royashcenazi opened 1 year ago
I think this issue can be solved using the spaCy library.
I'll take this issue.
I saw that two other pull requests (PRs) were opened for this issue, so I will not continue working on it. If one of the other contributors would like to use my code, I have left it here. It does not require any additional libraries, since it only uses spaCy, which we already import. This code worked on my local machine (I was about to open a PR, but there is no need).
# Assumes `nlp` is the spaCy pipeline already loaded in this module.
def _plural_to_singular(sig):
    output_words = []
    for word in sig.split():
        doc = nlp(word)
        for token in doc:
            if token.tag_ == 'NNS':  # NNS: noun, plural
                # Replace a plural noun with its singular lemma.
                output_words.append(token.lemma_)
            else:
                output_words.append(word)
    return ' '.join(output_words)
Currently, when the form is parsed from a sig sentence, a plural form is kept as-is, whereas it should be parsed into its singular form.
Example: "take 2 tablets of aderol every day" =>
StructuredSig(form = "tablets"...)
Possible solution: Calculate the Levenshtein distance between the parsed form and each of the possible outputs, and pick the closest one:
{capsule, tablet, drop, syringe, lotion ...}
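For anyone who wants to try the Levenshtein idea, here is a minimal sketch using only the standard library. The names `KNOWN_FORMS`, `levenshtein`, `closest_form`, and the `max_distance` cutoff are hypothetical and not part of the project; the list of forms is just the handful mentioned above, not the full set.

```python
# Hypothetical list: only the forms mentioned in this issue.
KNOWN_FORMS = ["capsule", "tablet", "drop", "syringe", "lotion"]


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the row we store as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def closest_form(parsed_form, known_forms=KNOWN_FORMS, max_distance=2):
    """Return the nearest known form, or the input unchanged if none is close.

    max_distance is an assumed cutoff so unrelated words are not forced
    onto a known form.
    """
    parsed = parsed_form.lower()
    best = min(known_forms, key=lambda form: levenshtein(parsed, form))
    return best if levenshtein(parsed, best) <= max_distance else parsed_form


print(closest_form("tablets"))
```

One design note: a distance cutoff like this also absorbs small typos, but it may map short, unrelated words onto a known form, so the threshold would probably need tuning against real sig data.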