There are a couple of things I want the reviewers to be aware of, and a few ideas for improvements worth discussing or researching.
I have uncommented some code here. The current limit is too small for most content; should we remove the limit entirely or just raise it?
The current chat history handling is a simple solution: it just takes the last 4 messages and sends them along with the prompt and query. I am fairly sure we could use LangChain's own memory system for this instead. LangChain has ConversationSummaryBufferMemory, which could help keep the token count down in long conversations by summarizing older messages once a token budget is exceeded.
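For reviewers unfamiliar with the current approach, here is a rough sketch of the sliding-window idea (the names and prompt format here are illustrative, not the actual backend code):

```python
# Illustrative sketch of the current chat-history handling (function and
# variable names are hypothetical, not the actual backend code): keep only
# the last N messages and prepend them to the prompt sent to the model.

HISTORY_WINDOW = 4  # current fixed window of messages


def build_prompt(system_prompt: str, history: list[dict], query: str) -> str:
    """Combine the system prompt, the last few messages, and the new query."""
    recent = history[-HISTORY_WINDOW:]
    history_text = "\n".join(f"{m['role']}: {m['content']}" for m in recent)
    return f"{system_prompt}\n\n{history_text}\n\nuser: {query}"
```

Switching to ConversationSummaryBufferMemory would replace this fixed window with a token-limited buffer (configured via its `llm` and `max_token_limit` arguments) that summarizes older turns instead of dropping them.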
Added streaming and LangChain on the backend to fix issue #123.
This is still a draft until I have done the following:
It has been suggested to summarize content in the flow (#143), which would be a good improvement for the chatbot.