Adding long term vectordb memory to alwrity

AJaySi commented 2 months ago

New Alwrity Feature discussion:

1). Alwrity needs to remember the content it has produced, along with its metadata, so that it does not rewrite the same articles. Use case: each web research result can be embedded and stored, so Alwrity can train itself on it and find semantically relevant answers. This should help cut costs, as we can check local knowledge before calling APIs.

2). This will make Alwrity aware of each user's preferences and let it RAG-tune on their domain, making Alwrity a domain expert for every content type a user works on. Use case: if Mr. X uses Alwrity to write about marijuana, then Alwrity will smoke/store the web research for each keyword and train itself on it. With each content generation, Alwrity becomes more of a domain expert on it.

3). Privacy & Security: Keep your data, Alwrity ain't interested. Your web searches and content reside on your local disk. The database will also reside on your laptop.

4). It will be interesting to see if Alwrity can provide internal linking during content generation.

Exploring : https://github.com/lancedb/lancedb
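
To make points 1) and 4) concrete, here is a rough Python sketch of remembering past articles and looking up similar ones, both to avoid rewriting the same piece and to surface internal-link candidates. Everything in it is an assumption for illustration: the word-count "embedding" stands in for a real embedding model, the in-memory list stands in for a LanceDB table, and the names ArticleMemory, add, and similar are hypothetical, not existing Alwrity code.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (assumption; a real setup would use an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


@dataclass
class ArticleMemory:
    """Hypothetical local store; a list stands in for a vector database table."""
    records: list = field(default_factory=list)

    def add(self, title: str, url: str, body: str) -> None:
        # Store metadata next to the vector so a hit can be reported or linked.
        self.records.append({"title": title, "url": url, "vector": embed(title + " " + body)})

    def similar(self, text: str, min_score: float = 0.3) -> list:
        """Return (score, record) pairs at or above min_score, best match first."""
        query = embed(text)
        scored = [(cosine(query, r["vector"]), r) for r in self.records]
        return sorted((p for p in scored if p[0] >= min_score), key=lambda p: p[0], reverse=True)


memory = ArticleMemory()
memory.add("Growing tomatoes at home", "/blog/tomatoes", "How to grow tomatoes in pots on a balcony")
memory.add("Python packaging basics", "/blog/packaging", "Build wheels and publish python packages")

# Before writing a new article, check what is already stored on disk.
hits = memory.similar("growing tomatoes in pots")
for score, record in hits:
    print(f"{score:.2f} {record['title']} -> {record['url']}")
```

A high score would mean "this was probably written already", while a moderate score would be a candidate for an internal link. The 0.3 cutoff is arbitrary and would need tuning against real embeddings.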

AJaySi commented 2 months ago

LanceDB is awesome. For a fully privacy-centric solution, we will need LanceDB, although a lot of baking in would be required to achieve the following:

CrewAI memory has all the features Alwrity needs. However, it is expensive, as agents are chatty and call APIs too often. Giving Alwrity memory will mitigate this, as it will rely on local knowledge before calling APIs. The idea is: as a user keeps creating articles, content, and web searches, Alwrity will learn from previous iterations. This needs to be verified.
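
The "local knowledge before calling APIs" idea could look roughly like the Python sketch below. It is only an illustration under stated assumptions: ResearchMemory and its fetch callable are hypothetical names, word-set overlap (Jaccard) stands in for embedding similarity, and the 0.6 threshold is arbitrary.

```python
import re


def tokens(text: str) -> set:
    """Lowercased word set of a query."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a: str, b: str) -> float:
    """Jaccard overlap of word sets; a toy stand-in for embedding similarity."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


class ResearchMemory:
    """Hypothetical wrapper: answer from local memory first, call the API only on a miss."""

    def __init__(self, fetch, threshold: float = 0.6):
        self.fetch = fetch          # assumed callable that performs the paid web/API call
        self.threshold = threshold  # minimum similarity to reuse a stored result
        self.entries = []           # (query, result) pairs remembered so far
        self.api_calls = 0

    def research(self, query: str) -> str:
        # Find the most similar remembered query, if any.
        best = max(self.entries, key=lambda e: similarity(query, e[0]), default=None)
        if best is not None and similarity(query, best[0]) >= self.threshold:
            return best[1]          # local hit: no API call made
        self.api_calls += 1         # miss: pay for the call, then remember the result
        result = self.fetch(query)
        self.entries.append((query, result))
        return result


memory = ResearchMemory(fetch=lambda q: f"research notes for: {q}")
first = memory.research("benefits of indoor herb gardens")
second = memory.research("Benefits of indoor herb gardens?")  # served from memory
third = memory.research("python packaging tutorial")          # new topic, calls the API
print(memory.api_calls)
```

Whether this actually reduces cost in practice depends on how often users revisit related topics, which is the part that still needs to be verified.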

https://docs.crewai.com/core-concepts/Memory/

The crewAI framework introduces a sophisticated memory system designed to significantly enhance the capabilities of AI agents. This system comprises short-term memory, long-term memory, entity memory, and newly identified contextual memory, each serving a unique purpose in aiding agents to remember, reason, and learn from past interactions.

AJaySi commented 2 months ago

1). CrewAI is expensive, because it is chatty.

2). It fails a lot. Embedding is expensive with cloud providers. I hit rate limits with OpenAI, and Gemini gave errors:

openai.RateLimitError: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

langchain_google_genai.chat_models.ChatGoogleGenerativeAIError: Invalid argument provided to Gemini: 400 Developer instruction is not enabled for models/gemini-pro

People don't appreciate it, but Alwrity hardly ever fails. I am not so sure about using CrewAI if it makes Alwrity error out.

AJaySi commented 1 month ago

@umesh070: Check the comments above. I am closing this and using memory from RAG & GPT-Researcher - llamaindex instead.

AJaySi / AI-Writer

Adding long term vectordb memory to alwrity #50