Open tuxology opened 10 years ago
I strongly agree with this one. I'd also add that it would be better to use external files (and, ideally, different semantics) to catch questions rather than store them as regexes declared in the code. That would make the system much easier to extend, and potentially everyone could contribute to it.
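To make the external-file idea concrete, here is a rough sketch in Python (not betty's actual code). The JSON layout, the `pattern`/`command` keys, and the function names are all assumptions made up for illustration; the JSON is inlined as a string so the snippet runs on its own, but in practice it would live in a separate file.

```python
import json
import re

# Hypothetical file contents; in practice this would sit in something like
# patterns.json so contributors can add entries without touching the code.
PATTERNS_JSON = """
[
  {"pattern": "^what time is it[?]?$", "command": "date +%T"},
  {"pattern": "^who am i[?]?$", "command": "whoami"}
]
"""

def load_rules(text):
    """Compile each pattern once and return (regex, command) pairs."""
    return [(re.compile(rule["pattern"], re.IGNORECASE), rule["command"])
            for rule in json.loads(text)]

def match_command(query, rules):
    """Return the command of the first rule that matches, or None."""
    for regex, command in rules:
        if regex.search(query.strip()):
            return command
    return None

RULES = load_rules(PATTERNS_JSON)
print(match_command("What time is it?", RULES))
```

Adding a new question then means adding one entry to the data file, with no change to the matching code.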
I agree regexes get messy way too quickly. What about using a tokenizer? See also: https://github.com/pickhardt/betty/issues/74 NLP seems like overkill at this point.
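As a rough idea of what the tokenizer approach could look like, here is a minimal Python sketch (not betty's actual code). The stopword list, the intent names, and the keyword sets are invented for illustration. The query is split into lowercase word tokens, filler words are dropped, and an intent is chosen when all of its keywords appear among the remaining tokens.

```python
import re

# Illustrative stopword list and intent table; both are made up for this sketch.
STOPWORDS = {"the", "a", "an", "is", "it", "me", "my", "please", "what", "s"}

INTENTS = {
    "show_time": {"time"},
    "show_user": {"who", "am", "i"},
}

def tokenize(query):
    """Lowercase the query, split it into word tokens, and drop stopwords."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return [w for w in words if w not in STOPWORDS]

def detect_intent(query):
    """Return the intent with the most keywords that are all present, or None."""
    tokens = set(tokenize(query))
    best, best_size = None, 0
    for intent, keywords in INTENTS.items():
        if keywords <= tokens and len(keywords) > best_size:
            best, best_size = intent, len(keywords)
    return best

print(detect_intent("What's the time, please?"))
```

Because matching works on a set of tokens rather than on the exact string, rewordings such as "tell me the time" and "what is the time" resolve to the same intent without a separate pattern for each.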
Any news about NLP in this project?
IMHO, if someone is serious about this, there will eventually be a need for toolkits like Stanford NLP or OpenNLP. Have a look at Treat as well: https://github.com/louismullie/treat
We will have to move out of the regex domain and into the real world of conversations pretty soon to handle the growing number of permutations and combinations in inputs. I think it would be worthwhile to have someone assigned to design basic tests for using NLP with betty.