A TypeScript sample app for the Retrieval Augmented Generation pattern running on Azure, using Azure AI Search for retrieval and Azure OpenAI and LangChain large language models (LLMs) to power ChatGPT-style and Q&A experiences.
I'm reporting this issue because I've seen the same bug in the .NET version of this sample, and the logic here is the same. You can reproduce it with a simple test that calls DocumentProcessor.createDocumentFromFile with a file containing fewer characters than MAX_SECTION_LENGTH:
https://github.com/Azure-Samples/azure-search-openai-javascript/blob/main/packages/indexer/src/lib/document-processor.ts#L16-L21
The split pages logic is missing a conditional return statement that yields a single page when the text length is below MAX_SECTION_LENGTH: https://github.com/Azure-Samples/azure-search-openai-javascript/blob/main/packages/indexer/src/lib/document-processor.ts#L72-L78
See this change for the Python patch: https://github.com/Azure-Samples/azure-search-openai-demo/commit/e835da37aead8add52d210a7593663ce3c928229
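
To illustrate the kind of guard I mean, here is a minimal sketch. It is not the sample's actual code: splitText, Section, and the constant values below are hypothetical stand-ins, and the splitter is simplified to fixed-size windows with overlap. The only point is the early return for short input.

```typescript
// Hypothetical values for illustration; the real constants live in the indexer package.
const MAX_SECTION_LENGTH = 1000;
const SECTION_OVERLAP = 100;

// Hypothetical shape for a split section.
interface Section {
  content: string;
  start: number;
}

// Simplified stand-in for the splitter: fixed-size windows with overlap.
function* splitText(text: string): Generator<Section> {
  // Guard: text that fits in one section is yielded once, then we stop.
  if (text.length <= MAX_SECTION_LENGTH) {
    yield { content: text, start: 0 };
    return;
  }

  let start = 0;
  while (start + SECTION_OVERLAP < text.length) {
    const end = Math.min(start + MAX_SECTION_LENGTH, text.length);
    yield { content: text.slice(start, end), start };
    // Step back by the overlap so consecutive sections share some context.
    start = end - SECTION_OVERLAP;
  }
}
```

With this guard in place, a file shorter than MAX_SECTION_LENGTH produces exactly one section containing the whole text, which is the behavior a regression test for this issue could assert.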