yvann-ba / Robby-chatbot

AI chatbot 🤖 for chat with CSV, PDF, TXT files 📄 and YTB videos 🎥 | using Langchain🦜 | OpenAI | Streamlit ⚡
MIT License
766 stars, 287 forks

Main UI doesn't see the full CSV file #27

Closed chazhenry closed 1 year ago

chazhenry commented 1 year ago

What is the difference in logic between the main chat textbox and agent processing? I loaded a ten-line CSV. When I ask both how many rows the file has, the main box says four and the agent (correctly) says ten. The main UI text box gets every query wrong. Why does it not see the full file contents?

yvann-ba commented 1 year ago

Hey, just look at the read-me, good day

chazhenry commented 1 year ago

Thanks. Still studying the code and saw the readme - just can't see where that limitation is coded. I see that create_csv_agent has a max of 4 iterations, but I don't see where the main chat limits itself to four rows. Seems that given any real world csv file, the general chat has no value and that only the agent would provide any insights into the data.

yvann-ba commented 1 year ago

This is due to the nature of the chain used. I have improved the chain for the chatbot, so you can now see in the console what it does when it searches for an answer. When a vectorstore is used as a retriever, it searches the vectorstore for the chunks of the file that most closely match the user's question. That is limiting, because it can't pass everything in the vectorstore to the model without exceeding the token limit. We can set the number of chunks it retrieves as a parameter to as_retriever (max 9). Honestly, I don't yet know exactly how the CSV agent works, or how it parses the whole file without going through a vectorstore. What is certain is that it is limited to Python interactions: it won't give you developed answers with a custom prompt, for example, so it's not very modular.
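To make the retrieval limit concrete, here is a minimal, standard-library sketch of top-k retrieval over CSV rows. It is not the project's code: the sample data, the bag-of-words similarity, and the helper names (`load_rows`, `retrieve`) are all assumptions for illustration, standing in for real embeddings and a real vectorstore. It only shows why a chain that receives k retrieved chunks can "see" four rows of a ten-row file, while anything that works on the whole table can count all ten.

```python
import csv
import io
import math
from collections import Counter

# Hypothetical ten-row CSV standing in for the uploaded file.
CSV_TEXT = """name,city,score
Alice,Paris,81
Bob,Lyon,67
Chloe,Nice,92
David,Lille,55
Emma,Paris,73
Farid,Lyon,88
Gina,Nice,61
Hugo,Lille,79
Ines,Paris,95
Jules,Lyon,70
"""

def load_rows(text):
    """Turn each CSV row into one text chunk, e.g. 'name: Alice, city: Paris, score: 81'."""
    reader = csv.DictReader(io.StringIO(text))
    return [", ".join(f"{key}: {value}" for key, value in row.items()) for row in reader]

def bag_of_words(text):
    """Very rough stand-in for an embedding: lowercase word counts."""
    return Counter(text.lower().replace(",", " ").replace(":", " ").split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(chunks, question, k=4):
    """Return the k chunks most similar to the question (all chunks if there are fewer than k)."""
    query = bag_of_words(question)
    ranked = sorted(chunks, key=lambda chunk: cosine(bag_of_words(chunk), query), reverse=True)
    return ranked[:k]

rows = load_rows(CSV_TEXT)
context = retrieve(rows, "How many rows are in the file?", k=4)

print(len(context))  # number of rows the chat model is given as context
print(len(rows))     # number of rows available to code that reads the whole table
```

With k=4 the model is only ever shown four rows, so a question like "how many rows are there?" cannot be answered correctly from the context alone, whatever the question says.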

aiakubovich commented 1 year ago

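Here is how to retrieve 15 rows instead of 4:

            # Build the retriever from the existing vectorstore.
            retriever = self.vectors.as_retriever()

            # Ask for the 15 most similar chunks per query instead of the default.
            retriever.search_kwargs = {'k': 15}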
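Raising k trades coverage against the token limit mentioned earlier in the thread. As a rough way to reason about that, the sketch below estimates how many row-sized chunks fit in a given token budget. Everything in it is an assumption for illustration: the helper names, the sample rows, the budget, and especially the "about 4 characters per token" heuristic, which is only a coarse approximation; a real tokenizer would give exact counts.

```python
def estimate_tokens(text, chars_per_token=4):
    """Coarse token estimate; an assumed heuristic, not a real tokenizer."""
    return max(1, len(text) // chars_per_token)

def max_k_within_budget(chunks, budget_tokens):
    """Count how many chunks, taken in order, fit within the token budget."""
    used = 0
    k = 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget_tokens:
            break
        used += cost
        k += 1
    return k

# Hypothetical rows, all the same size, and a hypothetical budget reserved for context.
sample_rows = ["name: Alice, city: Paris, score: 81"] * 100
k = max_k_within_budget(sample_rows, budget_tokens=120)
print(k)
```

For wide rows or long text columns the affordable k drops quickly, which is why questions that need the whole file (counts, sums, averages) are a better fit for the agent than for retrieval.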