NaturalNode / natural

general natural language facilities for node
MIT License

What do you think about integrating node-snowball? #286

Open arminrosu opened 8 years ago

arminrosu commented 8 years ago

Hey all,

I need a stemmer for Romanian and I found node-snowball. What do you think about integrating it into natural?

Pros

Cons

kkoch986 commented 8 years ago

Well, based on the number of issues we see here about browserify, it seems a large number of people use natural that way, so I'm not sure we'd be able to integrate something like that right now. I would totally love to hear what everyone else thinks about it, though.

namirsab commented 7 years ago

What about this one? https://github.com/mazko/jssnowball Is it browserifiable?

namirsab commented 7 years ago

Or allow setting a stemmer through a particular API, something like:

var natural = require('natural');

natural.setStemmer('en', nodeSnowball); // or whatever

If a stemmer is just a function that receives a word and returns a stem, that could be an option that lets everybody use Natural on both the server and the client side, while also letting users who only need it server side plug in external stemmers for different languages.
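To make the idea concrete, here is a minimal sketch of what such a pluggable registry could look like. It is not natural's actual API: `setStemmer`, `stem`, and `toyStemmer` are hypothetical names, and the toy stemmer only stands in for an external one such as node-snowball.

```javascript
// Hypothetical registry; none of these names exist in natural today.
const stemmers = new Map();

// Register a stemmer for a language code. A stemmer is assumed to be
// any function that takes a word (string) and returns its stem (string).
function setStemmer(lang, stemFn) {
  if (typeof stemFn !== 'function') {
    throw new TypeError('stemmer must be a function (word) => stem');
  }
  stemmers.set(lang, stemFn);
}

// Stem a word with whatever stemmer was registered for the language.
function stem(lang, word) {
  const stemFn = stemmers.get(lang);
  if (!stemFn) {
    throw new Error(`no stemmer registered for "${lang}"`);
  }
  return stemFn(word);
}

// Toy stand-in for an external stemmer: lowercases the word and
// strips a single trailing "s". A real stemmer would go here instead.
const toyStemmer = (word) => word.toLowerCase().replace(/s$/, '');

setStemmer('en', toyStemmer);
console.log(stem('en', 'Stemmers')); // prints "stemmer"
```

Because the registry only stores plain functions, the core would stay free of native dependencies and remain bundleable for the browser, while server-side users could register a native or wasm-backed stemmer themselves.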

forivall commented 4 years ago

I've been looking into things, and it would be possible to fork the snowball code generator (https://github.com/snowballstem/snowball/blob/743b5af/compiler/generator_js.c#L1272) to emit modern JS (or natural-specific) stemmers. (Personally, I also want to make an AssemblyScript version.)

Another option is to integrate https://github.com/MrRefactoring/multilingual-stemmer, which has the snowball stemmers compiled into wasm via Rust.