openderock / extract-lemmatized-nonstop-words

Extracts a plain list of stemmed words from a text, with stop words filtered out
GNU Affero General Public License v3.0

Lemmatizer version #1

Closed (Berkmann18 closed 5 years ago)

Berkmann18 commented 5 years ago

Is there a reason as to why the extract-lemmatize-nonstop-words package was completely removed from NPM and GH? It was the best package I've found and it's gone now.

ziaenezhad commented 5 years ago

@Berkmann18 I just renamed that.

Berkmann18 commented 5 years ago

@sajjad-shirazy Why? Stemmers and lemmatisers are two different things. It's also misleading to call a lemmatiser a stemmer (unless you've changed it to a stemmer).

ziaenezhad commented 5 years ago

@Berkmann18 Right, that's why I renamed it. It was stemming, not lemmatizing. My goal was to collect the vocabulary of a text, and lemmatizing drops some words. E.g., with lemmatizing I would lose "better" because it gets mapped to "good", while I need "better" too.
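
As a rough illustration of the trade-off described above (not the package's actual code): the sketch below uses a hypothetical `toyStem` suffix stripper and a hypothetical `toyLemmatize` lookup table, both invented here for the example, to show how lemmatizing can collapse "better" into "good" while a stemmer keeps it as its own vocabulary entry.

```javascript
// Hypothetical lemma table, made up for this example (not from any real library).
const TOY_LEMMAS = new Map([
  ["better", "good"],
  ["best", "good"],
  ["cats", "cat"],
]);

// Toy lemmatizer: dictionary lookup, falling back to the lowercased word.
function toyLemmatize(word) {
  const w = word.toLowerCase();
  return TOY_LEMMAS.has(w) ? TOY_LEMMAS.get(w) : w;
}

// Toy stemmer: strips one common suffix if at least 3 characters remain.
// This is a deliberately naive stand-in, not the Porter algorithm.
function toyStem(word) {
  const w = word.toLowerCase();
  for (const suffix of ["ing", "ed", "s"]) {
    if (w.endsWith(suffix) && w.length - suffix.length >= 3) {
      return w.slice(0, w.length - suffix.length);
    }
  }
  return w;
}

// Collect the unique normalized forms of a word list, in first-seen order.
function vocabulary(words, normalize) {
  return [...new Set(words.map(normalize))];
}

const words = ["good", "better", "cats"];
console.log(vocabulary(words, toyStem));      // "better" survives as its own entry
console.log(vocabulary(words, toyLemmatize)); // "better" is folded into "good"
```

Under these assumptions the stemmed vocabulary has three entries and the lemmatized one has two, which is the loss of "better" mentioned in the comment.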

Berkmann18 commented 5 years ago

@sajjad-shirazy Ah, so it was never a lemmatizer, even when it had that name; it was a stemmer from the start.

ziaenezhad commented 5 years ago

@Berkmann18 check out: https://github.com/winkjs/wink-lemmatizer