Open pranshuchaurasia opened 5 months ago
Just wanted to also note that I faced issues with not all files converting. I had a folder with 2000+ files, and the output was missing roughly 10. I extracted those 10 missing files and ran the batch on just them, but got the same outcome: they didn't convert. So there must be some issue with how those files themselves are handled. I can send the files if helpful.
I get incomplete conversions. When I use single-file conversion, it captures all the text and tables. However, when I try multi-file conversion on similar PDFs, I get partial text in the markdown and no tables. I get something like this in the terminal.
Issue Summary: When batch converting multiple PDF files with the marker command, not all files in the specified directory are processed. Specifically, when the directory contains 20 PDF files, only 15 are converted, despite using the appropriate flags for handling multiple files. I tried with different numbers of PDFs, and the result was the same.
Command Used: marker /path/to/input/folder /path/to/output/folder --workers 10
Expected Behavior: All PDF files in the input folder should be processed and converted to markdown, since no --max option was specified.
Actual Behavior: Only 15 out of 20 PDF files are processed and converted. The remaining 5 files are only successfully converted when processed individually rather than as part of the batch.
Additional Information: (1) No error messages are output by marker when the issue occurs. (2) Individual processing of each of the 5 unconverted files succeeds with no issues. (3) This behavior is consistent across multiple attempts with different sets of PDF files.
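For anyone trying to pin down exactly which files were skipped, here is a minimal Python sketch that compares the input and output folders. It is not part of marker: the helper name `find_unconverted` is made up for illustration, and it assumes that each converted PDF produces a `.md` file with the same base name somewhere under the output folder. If your marker version lays out its output differently, adjust the matching accordingly.

```python
from pathlib import Path


def find_unconverted(input_dir, output_dir):
    """Return the names of PDFs in input_dir that have no matching .md output.

    Assumption (not confirmed against marker itself): a converted file
    "report.pdf" yields a "report.md" somewhere under output_dir.
    """
    # Collect the base names of every markdown file under the output folder.
    converted = {p.stem for p in Path(output_dir).rglob("*.md")}

    # Keep only top-level PDFs whose base name has no markdown counterpart.
    return sorted(
        p.name
        for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf" and p.stem not in converted
    )


# Example call, using the same placeholder paths as the command above:
# missing = find_unconverted("/path/to/input/folder", "/path/to/output/folder")
# print(f"{len(missing)} file(s) not converted:", *missing, sep="\n")
```

Running this after a batch gives a concrete list of the skipped files, which can then be retried one at a time or attached to the issue.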