Open bartbutenaers opened 6 years ago
Hey Bart,
Yes, you are essentially correct about the steps required. The matcher uses the Jaro-Winkler distance to score how closely two words match. It only compares characters, so it is language agnostic. Ecolect also uses the Porter stemmer to reduce words to their root, which handles variations in tense and the like. The stemmer implementation Ecolect uses supports English, French and German. However, there is a different implementation that supports Dutch.
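To make the "character differences" point concrete, here is a minimal sketch of the textbook Jaro-Winkler similarity in Python. It is not Ecolect's code (Ecolect is JavaScript and takes this from a library); the function names and the 0.1 prefix scaling are the usual defaults and are assumptions here.

```python
def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in [0, 1]; 1.0 means identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters only count as matching if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0

    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    out_of_order = 0
    k = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (
        matches / len1
        + matches / len2
        + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(s1: str, s2: str, scaling: float = 0.1) -> float:
    """Jaro similarity with a bonus for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * scaling * (1.0 - base)
```

Nothing in it depends on the language of the words: `jaro_winkler("licht", "light")` is computed exactly the same way as a comparison of two English words.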
I've been talking with the Ecolect developer about how to improve Ecolect and he is quite receptive to making changes. The changes to my node would be trivial.
Dean
Hey Dean,
It seems I misunderstood the language mechanism.
I thought that, for every English file in Ecolect, I simply had to create a corresponding Dutch file (with Dutch keywords). But that is not correct, if I understand you correctly ...
So Ecolect can currently offer only the following languages? And the Ecolect developer would have to call another implementation (e.g. like this one) to be able to support other languages?
Didn't realize my question would require such a large change ;-(
Thanks for (at least) having a look at it!
Bart
Actually, the Ecolect developer has done a very good job of making it possible to support other languages. He has built a framework that ties together other language processing tools. The talisman library he uses for English also has tools for French and German, but it is simple to add other tool libraries, such as Natural or lunr, to support other languages, provided they have tools for those languages. Even the approach he uses for English can be replaced for a given language if it is not appropriate for that language.
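As a rough illustration of that idea (a sketch only, in Python rather than Ecolect's JavaScript, and not its actual API), a framework like this can keep one table of language tools and look up the right entry per language. Everything below is hypothetical: the `STEMMERS` table, `strip_suffixes` and `stem_for` are made-up names, and the suffix stripping is a toy stand-in for a real stemmer such as Porter.

```python
from typing import Callable, Dict, Tuple

Stemmer = Callable[[str], str]


def strip_suffixes(suffixes: Tuple[str, ...]) -> Stemmer:
    """Build a toy stemmer that removes the first matching suffix.

    Hypothetical stand-in for a real per-language stemmer.
    """
    def stem(word: str) -> str:
        w = word.lower()
        for suffix in suffixes:
            # Keep at least three characters of the word.
            if w.endswith(suffix) and len(w) - len(suffix) >= 3:
                return w[: -len(suffix)]
        return w

    return stem


# One entry per language; supporting a new language means adding an entry.
STEMMERS: Dict[str, Stemmer] = {
    "en": strip_suffixes(("ing", "ed", "s")),
    "nl": strip_suffixes(("en", "t")),
}


def stem_for(lang: str, word: str) -> str:
    """Stem with the language's own tool, or just lowercase if none exists."""
    stemmer = STEMMERS.get(lang, str.lower)
    return stemmer(word)
```

With a table like this, the matching code never has to know which library sits behind a language; it only calls `stem_for("nl", "dagen")` or `stem_for("en", "lights")`.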
Hey Dean,
me again :-)
What I really like about this node is that it can extract dates, numbers, times, etc. from a phrase. For example, some available possibilities for time:
However, I speak Dutch at home, so I would like my Ecolect node to understand my weird native language. For example, instead of 'activate kitchen light in two days and four hours', I would like it to interpret 'activeer keuken licht in twee dagen en vier uren'.
I see in your code that you already added some locale related stuff:
Am I right that the following steps need to be executed to accomplish my goal?
Thanks!
Bart