AiDAPT-A / VisArchPy

pipelines for the extraction and processing of visuals from PDFs
https://visarchpy.readthedocs.io
MIT License

Attempt to extract SVGs #26

Open manuGil opened 1 year ago

manuGil commented 1 year ago

PDFMiner has no native type for handling SVG elements. Instead, vector elements in a document are collected using types like LTCurve, LTLine, LTBox, etc. Extracting these types of elements and converting them into SVG files might not be possible.
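To make the idea concrete, here is a rough sketch of what such a conversion could look like. It assumes pdfminer.six, whose `LTCurve` objects (and subclasses such as `LTLine` and `LTRect`) carry a `pts` list of `(x, y)` points. The function names `curve_to_svg_path`, `page_to_svg`, `collect_curves` and `pdf_to_svgs` are hypothetical and not part of VisArchPy or pdfminer. Every shape is drawn as a straight-line polyline with a fixed black stroke, so Bézier curvature, fills and colours are lost, which is part of why a faithful SVG export may not be achievable this way.

```python
def curve_to_svg_path(pts, page_height):
    """Turn a list of (x, y) PDF points into an SVG path 'd' string."""
    # PDF puts the origin at the bottom-left, SVG at the top-left: flip y.
    flipped = [(x, page_height - y) for x, y in pts]
    head, *tail = flipped
    parts = [f"M {head[0]:.2f} {head[1]:.2f}"]
    parts += [f"L {x:.2f} {y:.2f}" for x, y in tail]
    return " ".join(parts)


def page_to_svg(curves, width, height):
    """Wrap every curve that has points in a single SVG document string."""
    paths = [
        f'<path d="{curve_to_svg_path(c.pts, height)}" fill="none" stroke="black"/>'
        for c in curves
        if c.pts
    ]
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        + "".join(paths)
        + "</svg>"
    )


def collect_curves(container):
    """Recursively yield LTCurve items (this includes LTLine and LTRect)."""
    # Imported here so the helpers above work without pdfminer installed.
    from pdfminer.layout import LTContainer, LTCurve

    for item in container:
        if isinstance(item, LTCurve):
            yield item
        elif isinstance(item, LTContainer):
            yield from collect_curves(item)


def pdf_to_svgs(pdf_path):
    """Yield one SVG string per page of the PDF (assumes pdfminer.six)."""
    from pdfminer.high_level import extract_pages

    for page in extract_pages(pdf_path):
        yield page_to_svg(list(collect_curves(page)), page.width, page.height)
```

This produces one SVG per page rather than one per drawing. Grouping curves into individual figures (for example by bounding-box proximity) would still be needed and is not attempted here.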

manuGil commented 1 year ago

This is an alternative using Inkscape and Python, but it probably cannot distinguish between vectors and images: https://gist.github.com/vitchyr/4894861de3fa8d4ffcba
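For reference, a minimal sketch of driving Inkscape from Python. It assumes Inkscape 1.x is installed and on the PATH and uses its `--export-type` and `--export-filename` options; older 0.9x releases use different flags. The helper names `build_inkscape_cmd` and `pdf_to_svg_with_inkscape` are hypothetical and not taken from the gist. This converts a PDF into a single SVG, so embedded raster images and vector drawings end up in the same file.

```python
import subprocess


def build_inkscape_cmd(pdf_path, svg_path, inkscape_bin="inkscape"):
    """Build the argv list for an Inkscape 1.x PDF-to-SVG export (assumed flags)."""
    return [
        inkscape_bin,
        "--export-type=svg",
        f"--export-filename={svg_path}",
        str(pdf_path),
    ]


def pdf_to_svg_with_inkscape(pdf_path, svg_path):
    """Run Inkscape and raise CalledProcessError if the export fails."""
    subprocess.run(build_inkscape_cmd(pdf_path, svg_path), check=True)
```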