superagent-ai / super-rag

Super performant RAG pipelines for AI apps. Summarization, Retrieve/Rerank and Code Interpreters in one simple API.
https://docs.superagent.sh
MIT License

Empty chunks when using semantic splitter #84

Closed: elisalimli closed this issue 8 months ago

elisalimli commented 8 months ago

I am getting empty chunks when I use the semantic splitter. I suspect this happens because we are not handling the text types.

https://github.com/superagent-ai/super-rag/blob/main/service/splitter.py#L127

Example PDF File: https://www.buds.com.ua/images/Lorem_ipsum.pdf
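
To illustrate the suspected cause, here is a minimal, self-contained sketch. It is not the code in `service/splitter.py`: the `Element` class, the type labels (`Title`, `Text`, `NarrativeText`), and the functions `build_chunks` and `drop_empty` are all hypothetical names made up for this example. It assumes chunks are grouped under titles and that only some element types are treated as content, so a section whose body arrives under an unhandled type ends up with empty content.

```python
from dataclasses import dataclass


@dataclass
class Element:
    # Hypothetical stand-in for a parsed document element.
    type: str  # e.g. "Title", "Text", "NarrativeText" (assumed labels)
    text: str


def build_chunks(elements, content_types=None):
    """Group elements into chunks, starting a new chunk at each Title.

    If content_types is given, only those element types contribute text
    (this mimics the suspected bug). If it is None, every non-title
    element with text contributes.
    """
    chunks = []
    current = None
    for el in elements:
        if el.type == "Title":
            if current is not None:
                chunks.append(current)
            current = {"title": el.text.strip(), "content": ""}
            continue
        if current is None:
            # Text that appears before any title gets an untitled chunk.
            current = {"title": "", "content": ""}
        if content_types is None or el.type in content_types:
            text = el.text.strip()
            if text:
                current["content"] = (current["content"] + " " + text).strip()
    if current is not None:
        chunks.append(current)
    return chunks


def drop_empty(chunks):
    """Defensive filter: remove chunks whose content is empty."""
    return [c for c in chunks if c["content"]]


elements = [
    Element("Title", "Lorem"),
    Element("Text", "Lorem ipsum dolor."),  # type not handled in strict mode
    Element("Title", "Next"),
    Element("NarrativeText", "Sit amet."),
]

# Strict mode: only "NarrativeText" counts, so the first chunk is empty.
strict = build_chunks(elements, content_types={"NarrativeText"})

# Lenient mode: any element carrying text counts, so no chunk is empty.
lenient = build_chunks(elements)
```

In this sketch, accepting every element that carries text avoids the empty chunk, and `drop_empty` is a fallback for sections that truly have no body. Whether the real splitter should accept all types or only add the missing ones is a design decision for the maintainers.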

elisalimli commented 8 months ago

@simjak do you mind looking at this?

simjak commented 8 months ago

Thanks, @elisalimli. I will try to reproduce the error and patch it.