Closed AriaShishegaran closed 2 months ago
Unstructured supports EPUB. I'll add that to R2R today too. AZW3 seems like an older Amazon Kindle extension (MOBI is the current one), so I think supporting it will take slightly longer. Perhaps you can try an online converter for AZW3 until then.
Fixed in this PR: https://github.com/SciPhi-AI/R2R/pull/1157 (will merge into main today).
Is your feature request related to a problem? Please describe.
I'd like to request the ability for this system to ingest more book formats for the RAG solution, such as EPUB and AZW3. This would let the system work with a wider set of commonly used book formats, and would help users who don't have a PDF version of a book or for whom no PDF version exists.
Describe the solution you'd like
Added support for more e-book formats.
Describe alternatives you've considered
LlamaParse/LlamaIndex supports this, and EPUBs are naturally easier to parse and understand since they are just repackaged HTML. I assume this should be a fairly straightforward problem to solve.
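
To illustrate the "repackaged HTML" point: an EPUB is a ZIP archive whose `META-INF/container.xml` points to a package (OPF) file, and that file lists the XHTML chapters in reading order. Below is a minimal standard-library sketch of pulling plain text out of one. The function name `epub_to_text` is hypothetical, and this is not how R2R or Unstructured implement it. It assumes a DRM-free EPUB with UTF-8 chapters and plain (not URL-encoded) hrefs.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile
from html.parser import HTMLParser

# XML namespaces used by the EPUB container and package (OPF) files.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def epub_to_text(path_or_file):
    """Return one plain-text string per chapter, in spine (reading) order.

    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(path_or_file) as zf:
        # 1. container.xml tells us where the package (OPF) file lives.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", NS).attrib["full-path"]

        # 2. The manifest maps item ids to file paths relative to the OPF.
        opf = ET.fromstring(zf.read(opf_path))
        base = posixpath.dirname(opf_path)
        manifest = {
            item.attrib["id"]: item.attrib["href"]
            for item in opf.findall(".//opf:manifest/opf:item", NS)
        }

        # 3. The spine lists item ids in reading order.
        chapters = []
        for itemref in opf.findall(".//opf:spine/opf:itemref", NS):
            href = manifest[itemref.attrib["idref"]]
            parser = _TextExtractor()
            parser.feed(zf.read(posixpath.join(base, href)).decode("utf-8"))
            chapters.append(" ".join(parser.parts))
        return chapters
```

The returned list could then be fed to whatever chunking step the ingestion pipeline already uses for other text formats. AZW3 is a different, Kindle-specific container, so this approach would not carry over to it directly.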