boan-anbo / ai-reader

MIT License

sometimes app bugs out when embedding large pdfs #4

Open azurebamboo opened 1 year ago

azurebamboo commented 1 year ago

When I embed some large documents (e.g. PDFs over 100 pages), errors will occur. The app itself does not crash but the embedding/indexing will fail.

boan-anbo commented 1 year ago

Thanks! I think I might know the cause of the issue and will release a fix soon. Sorry about that.

boan-anbo commented 1 year ago

Hi there. I thought I knew but I couldn't reproduce it as I just tested embedding 300+ pages from a book without issue.

To help me investigate, please share some more info about the files that fail embedding, e.g. the number of pages and the type of PDF (scanned and OCRed, or native).

Meanwhile, to mitigate the issue while we investigate, I applied a hotfix in v0.1.1. It saves the pages you've already indexed even when the document embedding fails, and lets you continue with the remaining pages the next time you index.

Please download and reinstall v0.1.1 (https://github.com/boan-anbo/ai-reader/releases/tag/v0.1.1) to use the feature.
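
To illustrate the save-and-resume behavior described above, here is a minimal sketch. It is not the actual ai-reader code: `index_document`, `embed_page`, and the JSON checkpoint file are hypothetical names chosen for the example, and it assumes pages are embedded one at a time.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List


def index_document(
    pages: List[str],
    embed_page: Callable[[str], List[float]],  # hypothetical embedding function
    checkpoint: Path,                          # hypothetical per-document save file
) -> Dict[str, List[float]]:
    """Embed pages one by one, saving progress after every successful page."""
    # Load whatever an earlier (possibly failed) run already saved.
    done: Dict[str, List[float]] = {}
    if checkpoint.exists():
        done = json.loads(checkpoint.read_text(encoding="utf-8"))

    for number, text in enumerate(pages):
        key = str(number)
        if key in done:
            continue  # page was indexed in a previous run, skip it
        # If this call raises, the pages saved so far stay on disk.
        done[key] = embed_page(text)
        checkpoint.write_text(json.dumps(done), encoding="utf-8")
    return done
```

Because progress is written after each page, a failure partway through still propagates to the caller, but calling `index_document` again with the same checkpoint path only embeds the pages that are still missing.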

azurebamboo commented 1 year ago

Thank you so much! I really appreciate it. The original document is about 150 pages and contains Japanese and Korean text. Another quick question: if I reinstall using the installer, will I lose the indexes I already have? Thanks!
