Open JasonLo opened 7 months ago
It looks like they have a haystack v2 beta available that would presumably address most or all of these issues. Looking over the docs, it's not clear if it's a straight swap or if there would be more changes involved.
Are there other comparable pipelining toolsets that could be an appropriate alternative?
Off the top of my head: `nltk`, `gensim`, and `spacy`? Not sure which one is better. Let's take a look together and decide. Perhaps somewhat related to the encoding problem in Elastic too...
`farm-haystack` is required for data preprocessing and ingest, but this package is poorly maintained. For example: `pydantic` v1.

We may want to replace it with a better package somehow.
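To see which `pydantic` major version an environment actually resolves to, something like the rough sketch below might help. It uses only the standard library; `parse_major` and `installed_major` are hypothetical helper names made up for illustration and are not part of `farm-haystack` or `pydantic`.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def parse_major(version_string: str) -> int:
    """Extract the leading integer from a version string such as '1.10.13'."""
    match = re.match(r"\d+", version_string)
    if match is None:
        raise ValueError(f"no leading version number in {version_string!r}")
    return int(match.group())


def installed_major(package: str) -> Optional[int]:
    """Return the installed major version of `package`, or None if it is not installed."""
    try:
        return parse_major(version(package))
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the installed pydantic major version, or None if pydantic is absent.
    print(installed_major("pydantic"))
```

Running this before and after swapping out `farm-haystack` would show whether the environment is still held on the older `pydantic` line.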