py-pdf / pypdf_table_extraction

A Python library to extract tabular data from PDFs
https://pypdf-table-extraction.readthedocs.io
MIT License

Add support for parsing PDF pages in parallel (multiprocessing) #17

Closed phoewass closed 6 months ago

phoewass commented 6 months ago

Closes https://github.com/py-pdf/pypdf_table_extraction/discussions/8

Parse pages in parallel using the `multiprocessing` library, leveraging all available CPUs.
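To illustrate the general idea, here is a minimal sketch of fanning page parsing out over a process pool. It is not the code from this PR: `parse_page` and `parse_pages` are hypothetical names, and the worker body is a placeholder for whatever per-page table extraction the library actually performs. The only real API used is the standard library's `multiprocessing.Pool` and `os.cpu_count`.

```python
import multiprocessing as mp
import os


def parse_page(page_number):
    """Hypothetical stand-in for per-page table extraction."""
    # A real worker would open the PDF and extract the tables on this page.
    return {"page": page_number, "tables": []}


def parse_pages(page_numbers, processes=None):
    """Parse pages, using a process pool when more than one worker is useful.

    `processes=None` is assumed here to mean "use every available CPU".
    """
    pages = list(page_numbers)
    workers = processes or os.cpu_count() or 1
    # Never start more workers than there are pages to parse.
    workers = min(workers, len(pages))
    if workers <= 1:
        # Serial fallback avoids pool start-up cost for tiny jobs.
        return [parse_page(p) for p in pages]
    with mp.Pool(processes=workers) as pool:
        # Pool.map returns results in the same order as the input pages.
        return pool.map(parse_page, pages)


if __name__ == "__main__":
    # The guard is required on platforms that spawn worker processes.
    results = parse_pages(range(1, 9))
    print([r["page"] for r in results])
```

The worker has to be a module-level function so it can be pickled and sent to the child processes, and each worker should return plain, picklable data rather than open file handles.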

bosd commented 6 months ago

@foarsitter Should we go ahead and merge this?