datagero / pico-scholar

AI-Human collaboration platform that accelerates systematic reviews by organizing academic literature, building AI tools, and expanding into new domains to enhance the global knowledge catalog.

Develop Chunking Strategies Module #18

Open datagero opened 3 days ago

datagero commented 3 days ago

Objective: Create a module for managing document chunking strategies that offers multiple chunking options. (Note: we may implement this with LlamaIndex.)

Details:
• Implement at least two chunking strategies (e.g., sentence-based and paragraph-based).
• The module should allow switching between strategies dynamically.
• Ensure compatibility with the embedding system.
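To make the details above concrete, here is a rough sketch of what a strategy-switching module might look like. It uses only the Python standard library and deliberately naive regex splitting. The names `Chunker`, `chunk_by_sentence`, `chunk_by_paragraph`, `register`, and `set_strategy` are hypothetical and not taken from the repository or from LlamaIndex; a LlamaIndex-backed splitter could presumably be registered the same way.

```python
import re
from typing import Callable, Dict, List

# A strategy is any callable that turns a document string into a list of chunks.
ChunkingStrategy = Callable[[str], List[str]]


def chunk_by_sentence(text: str) -> List[str]:
    """Naive sentence split: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def chunk_by_paragraph(text: str) -> List[str]:
    """Split on blank lines (one or more empty or whitespace-only lines)."""
    parts = re.split(r"\n\s*\n", text.strip())
    return [p.strip() for p in parts if p.strip()]


class Chunker:
    """Hypothetical registry that lets callers switch strategies at runtime."""

    def __init__(self, strategy: str = "sentence") -> None:
        self._strategies: Dict[str, ChunkingStrategy] = {
            "sentence": chunk_by_sentence,
            "paragraph": chunk_by_paragraph,
        }
        self.set_strategy(strategy)

    def register(self, name: str, fn: ChunkingStrategy) -> None:
        """Add or replace a strategy, e.g. a wrapper around a library splitter."""
        self._strategies[name] = fn

    def set_strategy(self, name: str) -> None:
        """Select the active strategy; unknown names raise ValueError."""
        if name not in self._strategies:
            raise ValueError(f"Unknown chunking strategy: {name!r}")
        self.strategy_name = name

    def chunk(self, text: str) -> List[str]:
        """Chunk the text with whichever strategy is currently active."""
        return self._strategies[self.strategy_name](text)


if __name__ == "__main__":
    sample = "First sentence. Second one!\n\nNew paragraph here."
    chunker = Chunker("sentence")
    print(chunker.chunk(sample))
    chunker.set_strategy("paragraph")
    print(chunker.chunk(sample))
```

Keeping strategies as plain callables in a dictionary means switching is a single `set_strategy` call and new strategies do not require subclassing. The regex sentence splitter will mis-split abbreviations such as "e.g.", so it is only a placeholder for a proper tokenizer.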

Dependencies: Integration with embedding storage and retrieval.

Acceptance Criteria:
• A reusable chunking module with multiple strategy options.
• Embedding workflow compatible with selected chunking strategies.
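One possible way to check the second criterion is to have the embedding step accept any chunking callable and record which strategy produced each chunk. The sketch below is an assumption about how that could be wired, not the project's actual embedding code: `EmbeddedChunk`, `embed_document`, and `toy_embed` are hypothetical names, and `toy_embed` is a stand-in that returns fake two-dimensional vectors instead of calling a real embedding model.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class EmbeddedChunk:
    """Hypothetical record stored alongside each vector."""
    doc_id: str
    strategy: str   # name of the chunking strategy that produced this chunk
    index: int      # position of the chunk within the document
    text: str
    vector: List[float]


def embed_document(
    doc_id: str,
    text: str,
    strategy_name: str,
    chunk_fn: Callable[[str], List[str]],
    embed_fn: Callable[[Sequence[str]], Sequence[Sequence[float]]],
) -> List[EmbeddedChunk]:
    """Chunk a document with the given strategy, then embed every chunk."""
    chunks = chunk_fn(text)
    vectors = embed_fn(chunks)
    if len(vectors) != len(chunks):
        raise ValueError("embed_fn must return exactly one vector per chunk")
    return [
        EmbeddedChunk(doc_id, strategy_name, i, chunk, list(vec))
        for i, (chunk, vec) in enumerate(zip(chunks, vectors))
    ]


def toy_embed(chunks: Sequence[str]) -> List[List[float]]:
    """Placeholder embedding: [character count, word count]. Not a real model."""
    return [[float(len(c)), float(len(c.split()))] for c in chunks]


if __name__ == "__main__":
    records = embed_document(
        doc_id="doc-1",
        text="A b.\n\nC d e.",
        strategy_name="paragraph",
        chunk_fn=lambda t: t.split("\n\n"),  # stand-in for a paragraph strategy
        embed_fn=toy_embed,
    )
    for r in records:
        print(r.strategy, r.index, r.text, r.vector)
```

Storing the strategy name with each vector would let retrieval filter or re-embed by strategy later, which may matter if documents are chunked differently over time.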

Priority: Medium
Estimated Effort: