Open abaranovskis-redsamurai opened 10 months ago
Hi Andrej, did you get an answer to your question above? By the way, your YouTube video on Donut is really good.
Thanks, Sanjay
Hey Sanjay. Nope, there was no answer. What I did: I converted the two-page PDF into a single image. This way it worked.
Thanks for your feedback about the Donut-related video :)
Andrej
Thanks for the quick reply. Do you mean creating one long image?
Yes, correct.
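For anyone who wants to try the "one long image" approach, here is a minimal sketch of stacking page images vertically. It is not code from this thread: the function names `stack_layout` and `merge_pages` are hypothetical, Pillow is assumed to be installed, and the choice to pad narrower pages with white on the right is an assumption.

```python
from typing import List, Sequence, Tuple


def stack_layout(sizes: Sequence[Tuple[int, int]]) -> Tuple[int, int, List[int]]:
    """Return (canvas_width, canvas_height, y_offsets) for pages stacked top to bottom.

    Each entry in sizes is a (width, height) pair in pixels.
    """
    if not sizes:
        raise ValueError("need at least one page")
    # The canvas is as wide as the widest page.
    canvas_width = max(w for w, _ in sizes)
    y_offsets = []
    y = 0
    for _, h in sizes:
        y_offsets.append(y)  # top edge of this page on the canvas
        y += h
    return canvas_width, y, y_offsets


def merge_pages(pages):
    """Paste PIL images one below another on a white canvas (hypothetical helper)."""
    from PIL import Image  # third-party dependency: pip install Pillow

    width, height, offsets = stack_layout([p.size for p in pages])
    canvas = Image.new("RGB", (width, height), "white")
    for page, y in zip(pages, offsets):
        canvas.paste(page.convert("RGB"), (0, y))
    return canvas
```

The page images themselves have to come from somewhere. One option, assuming the separate pdf2image package and its poppler dependency are available, is `pdf2image.convert_from_path("invoice.pdf")`, which returns a list of PIL images that can be passed to `merge_pages`.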
@abaranovskis-redsamurai Suppose I have a 10-page PDF and need to parse data continuously across pages, preserving the hierarchy of headings and their points/sub-points that carry over to the following pages. What changes would the config file parameters need, such as max_length, input_size, etc.?
Thanks in advance!!
I have trained successfully on documents of up to 10 pages with the default value of max_length. You can estimate max_length from the number of keys you have. The input size depends on your document size, so change it accordingly.
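The reply above does not give a formula, so the following is only one possible reading of "calculate max_length using number of keys": a rough upper-bound estimate of the target sequence length for one annotated document. It assumes a Donut-style target in which every key is wrapped in an opening and a closing token and list items are joined by a single separator token. The function name `estimate_target_tokens` and the `tokens_per_word` factor are hypothetical, and the real count depends on the tokenizer.

```python
def estimate_target_tokens(node, tokens_per_word: int = 2) -> int:
    """Roughly estimate the number of target tokens for a nested ground-truth object.

    Assumptions (not taken from the thread): each dict key costs two tokens
    (open and close), each gap between list items costs one separator token,
    and each whitespace-separated word in a value costs tokens_per_word tokens.
    """
    if isinstance(node, dict):
        return sum(2 + estimate_target_tokens(v, tokens_per_word) for v in node.values())
    if isinstance(node, list):
        separators = max(len(node) - 1, 0)
        return separators + sum(estimate_target_tokens(v, tokens_per_word) for v in node)
    # Leaf value: count words and scale by the assumed tokens-per-word factor.
    return len(str(node).split()) * tokens_per_word


sample = {
    "invoice_no": "INV 42",
    "items": [
        {"name": "blue pen", "qty": "3"},
        {"name": "ink", "qty": "1"},
    ],
}
print(estimate_target_tokens(sample))
```

Running this over every training annotation and taking the largest value gives a number to compare against max_length. If the estimate is well below the configured value, the default is probably enough, which would be consistent with the experience reported above.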
Do you mean 10 pages merged into a single image? How did you annotate it? I tried annotating it with Label Studio, and it throws an unresponsive error because the image is too long; my ML backend does not work properly because of it.
Hello,
The examples for Donut are based on single-page docs (invoices, receipts, etc.). How well would it work with multipage docs? For instance, if the number of invoice items is large and the rest of the invoice goes to the second page, would extracting data from the second page work out of the box?
Thanks.