valentinkm / MinervasMemo

An LLM-based summarization tool for VTT meeting transcripts, using OpenAI's ChatGPT 3.5 and LangChain.
MIT License

Chat with past Summarized Meeting Notes #16

Open valentinkm opened 9 months ago

valentinkm commented 9 months ago

Option 1: Vectorization

Objective: Build and deploy a vector database that stores summarized meeting notes, making them easily searchable and retrievable for chatbot integration.

Project Phases

Phase 1: Research and Planning
- Technology Selection: Research FAISS, Annoy, and Elasticsearch with a vector plugin.
- Requirements Analysis: Define database features, focusing on fast nearest-neighbor search, scalability, and high-dimensional vector support.
- Data Schema: Decide on the schema for storing vectors and metadata such as meeting date and participants.
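As a starting point for the data schema item, here is a minimal sketch of what one stored record might look like. The class name `MeetingNoteRecord`, the `note_id` field, and the sample values are assumptions made for illustration; only the vector, meeting date, and participants come from the plan above.

```python
from dataclasses import asdict, dataclass, field
from datetime import date


@dataclass
class MeetingNoteRecord:
    """One stored summary: the embedding plus the metadata named in the plan."""

    note_id: str                  # hypothetical unique key, e.g. derived from the transcript filename
    summary: str                  # the summarized meeting notes
    vector: list[float]           # embedding of the summary
    meeting_date: date
    participants: list[str] = field(default_factory=list)

    def metadata(self) -> dict:
        """Return everything except the vector, in a JSON-friendly form."""
        data = asdict(self)
        data.pop("vector")
        data["meeting_date"] = self.meeting_date.isoformat()
        return data


# Made-up sample values, only to show the shape of a record.
record = MeetingNoteRecord(
    note_id="example-meeting",
    summary="Discussed vector database options.",
    vector=[0.1, 0.3, 0.5],
    meeting_date=date(2023, 9, 12),
    participants=["Alice", "Bob"],
)
print(record.metadata())
```

Keeping the vector separate from the metadata should make it easier to swap the storage backend later, since FAISS and Annoy store vectors only and would need the metadata kept alongside by key.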

Phase 2: Local Development
- Environment Setup: Set up a local development environment for the chosen technology.
- Data Ingestion: Convert summarized meeting notes into vectors.
- Database CRUD Operations: Implement CRUD operations.
- Testing: Write unit tests for the CRUD operations.
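To make the ingestion and CRUD items concrete before a backend is chosen, here is a rough in-memory sketch. Everything in it is an assumption: `toy_embed` is a hashed bag-of-words stand-in for a real embedding model, `InMemoryVectorStore` is a hypothetical class, and the brute-force cosine search would be replaced by whatever index the chosen technology provides.

```python
import math
import zlib


def toy_embed(text: str, dim: int = 256) -> list[float]:
    """Hashed bag-of-words vector, L2-normalized. Stand-in for a real embedding model."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


class InMemoryVectorStore:
    """Hypothetical store: note_id -> (vector, summary), with brute-force search."""

    def __init__(self) -> None:
        self._rows: dict[str, tuple[list[float], str]] = {}

    def create(self, note_id: str, summary: str) -> None:
        if note_id in self._rows:
            raise KeyError(f"{note_id} already exists")
        self._rows[note_id] = (toy_embed(summary), summary)

    def read(self, note_id: str) -> str:
        return self._rows[note_id][1]

    def update(self, note_id: str, summary: str) -> None:
        if note_id not in self._rows:
            raise KeyError(f"{note_id} does not exist")
        # Re-embed so the vector never goes stale relative to the text.
        self._rows[note_id] = (toy_embed(summary), summary)

    def delete(self, note_id: str) -> None:
        del self._rows[note_id]

    def search(self, query: str, k: int = 3) -> list[tuple[str, float]]:
        """Return up to k (note_id, cosine similarity) pairs, best first."""
        q = toy_embed(query)
        scored = [
            (note_id, sum(a * b for a, b in zip(q, vec)))
            for note_id, (vec, _) in self._rows.items()
        ]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

The unit tests from the Testing item could then exercise `create`, `read`, `update`, `delete`, and `search` against this class first, and be pointed at the real backend once it exists.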

Phase 3: Integration
- GitHub Actions Workflow: Extend the existing workflow to populate the vector database.
- API Integration: Integrate the vector database API with the existing Python script.
- Data Validation: Implement checks for data insertion.
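The data validation item could start as a single function that runs before every insert. This is only a sketch: the record layout (a dict with `summary`, `vector`, and `meeting_date` keys) and the function name `validate_record` are assumptions, not something the existing script defines.

```python
import math
from datetime import date


def validate_record(record: dict, expected_dim: int) -> list[str]:
    """Return a list of problems; an empty list means the record can be inserted."""
    problems = []

    summary = record.get("summary")
    if not isinstance(summary, str) or not summary.strip():
        problems.append("summary is missing or empty")

    vector = record.get("vector")
    if not isinstance(vector, list) or len(vector) != expected_dim:
        problems.append(f"vector must be a list of {expected_dim} numbers")
    elif not all(isinstance(x, (int, float)) and math.isfinite(x) for x in vector):
        problems.append("vector contains non-numeric or non-finite values")

    try:
        date.fromisoformat(str(record.get("meeting_date")))
    except ValueError:
        problems.append("meeting_date is not a valid ISO date")

    return problems
```

Returning all problems at once, rather than raising on the first, would let the GitHub Actions run log every issue with a record in one pass.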

Phase 4: Deployment
- Dockerization: Containerize the vector database.
- Cloud Deployment: Deploy to AWS, GCP, or Azure.
- Monitoring and Logging: Implement monitoring and logging.

Phase 5: Documentation and Review
- Documentation: Write a comprehensive setup and usage guide.
- Code Review: Ensure best practices are followed.
- Performance Evaluation: Run benchmarks.

Phase 6: Future Enhancements
- Advanced Features: REST API, frontend dashboard.
- Machine Learning: Implement ML algorithms for clustering or classification.
- User Feedback: Collect feedback for improvement.

valentinkm commented 9 months ago

Alternative: Add functionality to ask questions about a specific meeting, with that meeting's transcript compressed to fit within the model's context window.
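One way this compression could work is to summarize the transcript chunk by chunk, repeating until the result fits a token budget. The sketch below rests on several assumptions: `estimate_tokens` uses a crude four-characters-per-token guess rather than a real tokenizer, and `summarize` is a hypothetical callable that would wrap the existing summarization step.

```python
from typing import Callable


def estimate_tokens(text: str) -> int:
    """Crude estimate of about four characters per token (an assumption, not a tokenizer)."""
    return max(1, len(text) // 4)


def split_into_chunks(lines: list[str], max_tokens: int) -> list[str]:
    """Greedily pack whole lines into chunks of roughly max_tokens each."""
    chunks, current, used = [], [], 0
    for line in lines:
        cost = estimate_tokens(line)
        if current and used + cost > max_tokens:
            chunks.append("\n".join(current))
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        chunks.append("\n".join(current))
    return chunks


def compress_transcript(
    transcript: str,
    budget_tokens: int,
    summarize: Callable[[str], str],
    max_rounds: int = 5,
) -> str:
    """Summarize chunk by chunk until the text fits the budget, or give up."""
    text = transcript
    for _ in range(max_rounds):
        if estimate_tokens(text) <= budget_tokens:
            return text
        chunks = split_into_chunks(text.splitlines(), budget_tokens)
        text = "\n".join(summarize(chunk) for chunk in chunks)
    if estimate_tokens(text) <= budget_tokens:
        return text
    raise ValueError("transcript did not fit the budget after max_rounds")
```

The compressed text would then be placed in the prompt alongside the user's question. Compared with Option 1 this needs no database, at the cost of re-compressing (or caching) each transcript that gets asked about.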