CS-ISE-Project / back-end

Server repo

[Research] PDF Extraction #4

Open BrouthenKamel opened 7 months ago

BrouthenKamel commented 7 months ago

Description

Testing out different Python PDF extraction libraries

Outcome

Select PDF extraction service

BrouthenKamel commented 7 months ago

PDF extraction library: Unstructured

BrouthenKamel commented 7 months ago

Unstructured is capable of partitioning a PDF and detecting the following classes:
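A minimal sketch of how such a partition could be inspected, assuming the `unstructured` package is installed with its PDF extras. `partition_pdf` and the `category` attribute on each element come from the library; the helper names `count_categories` and `partition_and_count` and the file path are hypothetical, chosen for illustration only.

```python
from collections import Counter


def count_categories(elements):
    """Count how many elements carry each category label."""
    return Counter(el.category for el in elements)


def partition_and_count(pdf_path):
    """Partition a PDF with Unstructured and tally the detected classes."""
    # Imported lazily so count_categories works without the library installed.
    from unstructured.partition.pdf import partition_pdf

    elements = partition_pdf(filename=pdf_path)
    return count_categories(elements)


if __name__ == "__main__":
    # "sample.pdf" is a placeholder path, not a file from this repo.
    for category, count in partition_and_count("sample.pdf").most_common():
        print(f"{category}: {count}")
```

Tallying by category gives a quick view of which classes the library actually finds in a given document before committing to it.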

BrouthenKamel commented 7 months ago

Time comparison

[Image: time comparison]
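One way a time comparison like this could be reproduced is a small, library-agnostic timing harness. This is a sketch under assumptions, not the method used for the results above: `time_extractors`, the `extractors` mapping and the `repeats` parameter are all hypothetical names, and each extractor is assumed to be a callable that takes a PDF path.

```python
import time


def time_extractors(extractors, pdf_path, repeats=3):
    """Return the best wall-clock time in seconds for each extractor.

    `extractors` maps a label to a callable that accepts a PDF path.
    """
    results = {}
    for name, extract in extractors.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            extract(pdf_path)
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results


if __name__ == "__main__":
    # Stand-in extractor; swap in real library calls to compare them.
    demo = {"noop": lambda path: None}
    for name, seconds in time_extractors(demo, "sample.pdf").items():
        print(f"{name}: {seconds:.4f}s")
```

Taking the best of several runs reduces noise from disk caching and one-off model loading, though first-run cost may matter too if the service processes each PDF only once.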