neuml / txtai

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
https://neuml.github.io/txtai
Apache License 2.0

New to txtai, some general questions #700

Closed yiouyou closed 3 months ago

yiouyou commented 4 months ago

I'm familiar with LangChain's RAG process. When I first encountered txtai, however, I was a bit uneasy. As the author mentioned, not every step requires an LLM, and many models can achieve good results. I strongly agree with that idea. Moreover, txtai is the only tool I've found that supports graph RAG out of the box, which is why I'm a big fan of the project and would love to integrate it with my LangChain applications.

As a non-expert programmer, I often rely on examples. With my limited experience, I don't have a clear picture of how to integrate txtai and LangChain. I think txtai can replace the embedding and retriever functions. Does the author have any examples to share?

I've also discovered txtchat, and I think the two work seamlessly together. However, I feel that txtai lacks a graphical backend for managing embeddings. txtai uses Faiss to store vectors. Is there a recommended graphical tool for Faiss?

Finally, once a specific knowledge base is ingested into txtai, how can I migrate this database to other txtai projects?

In sum: 1) How do I connect txtai with LangChain? 2) Is there a GUI for a Faiss database? 3) Where are the persistent embeddings saved?

Thanks,

davidmezzetti commented 4 months ago

Hello, thank you for the interest. I've also noticed your txtchat issues and I'll take a look at those.

Regarding these questions:

  1. How to connect txtai with LangChain? I haven't considered this integration. I'm sure there are ways to do it, but I don't use LangChain, so I'm not that familiar with the framework.

  2. Is there a GUI for a Faiss database? Not that I'm aware of. It's a file system format. If you want to inspect the data in Faiss, the best way is with a Jupyter notebook. See the link below. https://github.com/neuml/txtai/blob/master/examples/29_Anatomy_of_a_txtai_index.ipynb

  3. Where are the persistent embeddings saved? This seems related to (2). It depends on the backend used. For Faiss, there are methods to reconstruct embeddings, but they aren't stored by default.
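
On question 1, here is a rough sketch of one possible bridge, not an official integration. It assumes the only thing needed from txtai is its `search` method, which returns `(id, score)` tuples by default and dicts with `id`, `text` and `score` when content storage is enabled. The helper names `to_documents` and `retrieve` are made up for illustration. The `page_content` / `metadata` keys mirror the fields of a LangChain `Document`, but plain dicts are used so nothing here depends on LangChain itself.

```python
from typing import Any, Dict, List


def to_documents(results: List[Any]) -> List[Dict[str, Any]]:
    """Normalize txtai search results into page_content/metadata dicts.

    Hypothetical helper, not part of txtai or LangChain.
    """
    documents = []
    for result in results:
        if isinstance(result, dict):
            # Content storage enabled: dicts with "id", "text" and "score"
            documents.append({
                "page_content": result.get("text", ""),
                "metadata": {"id": result.get("id"), "score": result.get("score")},
            })
        else:
            # No content storage: (id, score) tuples, so there is no text to return
            uid, score = result
            documents.append({
                "page_content": "",
                "metadata": {"id": uid, "score": score},
            })
    return documents


def retrieve(embeddings: Any, query: str, k: int = 3) -> List[Dict[str, Any]]:
    """Run a search on a txtai-style index and return retriever-style documents.

    `embeddings` is assumed to expose search(query, limit), as txtai's
    Embeddings class does.
    """
    return to_documents(embeddings.search(query, k))
```

With txtai installed, `embeddings` would be a `txtai.Embeddings` instance that has already been indexed, and the returned dicts could be wrapped in whatever document type the calling framework expects. On moving a knowledge base between projects, `Embeddings.save(path)` and `Embeddings.load(path)` write and read the index directory, so copying that directory should be enough, assuming both projects use compatible txtai versions.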

yiouyou commented 4 months ago

@davidmezzetti

Thanks. I'm learning txtai from all the notebook examples. Once I'm ready, maybe I can write a notebook about ollama + txtai + langchain.

I tried to find more tutorials on YouTube, but there aren't many, and the ones I found weren't very helpful. I haven't managed to run a txtchat workflow with Rocket.Chat successfully; the documentation doesn't seem sufficient for a newbie.

davidmezzetti commented 3 months ago

Closing due to inactivity. Please re-open or open a new issue if there are further questions.