Closed by RobinQu 5 months ago
More complex RAG pipelines may involve agent frameworks #18
OpenAI's official parameters for RAG: https://platform.openai.com/docs/assistants/tools/file-search/how-it-works
By default, the file_search tool uses the following settings:
- Chunk size: 800 tokens
- Chunk overlap: 400 tokens
- Embedding model: text-embedding-3-large at 256 dimensions
- Maximum number of chunks added to context: 20 (could be fewer)
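To make the chunk size and overlap defaults concrete, here is a minimal sketch of a sliding-window chunker. It is not OpenAI's implementation: the function name `chunk_tokens` is hypothetical, and the input is assumed to be an already-tokenized list (a real pipeline would use the tokenizer that matches the embedding model).

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def chunk_tokens(
    tokens: Sequence[T],
    chunk_size: int = 800,  # default taken from the file_search settings above
    overlap: int = 400,     # default taken from the file_search settings above
) -> List[List[T]]:
    """Split a token sequence into overlapping windows (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # each new window starts this far after the previous one
    chunks: List[List[T]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        # Stop once a window has reached the end of the input,
        # so no trailing window is fully contained in the previous one.
        if start + chunk_size >= len(tokens):
            break
    return chunks


if __name__ == "__main__":
    demo = list(range(2000))  # stand-in for 2000 token ids
    windows = chunk_tokens(demo)
    print(len(windows), [(w[0], w[-1]) for w in windows])
```

With a 400-token overlap on 800-token chunks, each window starts 400 tokens after the previous one, so most tokens appear in two chunks. That roughly doubles the number of embeddings to store compared with non-overlapping chunks.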
Supported file formats: https://platform.openai.com/docs/assistants/tools/file-search/supported-files
Goal: better evaluation results on the HF QA dataset.