Stirling-Tools / Stirling-PDF

#1 Locally hosted web application that allows you to perform various operations on PDF files
MIT License
41.77k stars 3.32k forks

[Feature Request]: Split PDF by chapter #1592

Open pepijnolivier opened 1 month ago

pepijnolivier commented 1 month ago

Feature Description

Why is this feature valuable?

This could be useful for many purposes:

Suggested Implementation

Additional Information

This should be tested on very large documents and on official documents.


sbplat commented 2 weeks ago

I think this would work better as two separate steps: first extract the page numbers from the table of contents (TOC), then split the PDF at those pages using "Split PDF". For the extraction step, maybe we could have a feature that runs a regex on the text of the selected page(s) and outputs the matches. It could also include some common expressions to make this easier.
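As a rough illustration of the extraction step, here is a minimal Python sketch that runs a regex over TOC text and collects the page numbers. The pattern, the function name `extract_toc_pages`, and the sample text are all assumptions made for this example; they are not part of Stirling-PDF, and real TOC layouts will likely need different expressions.

```python
import re

# Hypothetical pattern: a title, then at least two spaces or dot leaders,
# then a page number at the end of the line.
TOC_LINE = re.compile(r"^(?P<title>.+?)[\s.]{2,}(?P<page>\d+)\s*$")


def extract_toc_pages(toc_text):
    """Return (title, page) pairs for every line that looks like a TOC entry."""
    entries = []
    for line in toc_text.splitlines():
        match = TOC_LINE.match(line.strip())
        if match:
            title = match.group("title").strip()
            entries.append((title, int(match.group("page"))))
    return entries


# Made-up sample text, standing in for the extracted text of a TOC page.
sample = """Contents
1 Introduction ........ 3
2 Methods ............. 12
3 Results    27
"""

entries = extract_toc_pages(sample)
print(entries)
```

The resulting page numbers could then be passed to a split step. Note that printed page numbers often differ from physical page indices (for example when front matter is numbered separately), so an offset may be needed.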

Rudra-241 commented 1 week ago

For PDFs with predefined outlines, check this draft: https://github.com/Stirling-Tools/Stirling-PDF/pull/1786
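To illustrate the outline-based idea in general terms (this is not taken from the linked draft), here is a small Python sketch that turns chapter start pages into inclusive page ranges. It assumes the top-level outline entries have already been read from the PDF as `(title, start_page)` pairs with 1-based page numbers; `chapter_ranges` is a hypothetical helper name.

```python
def chapter_ranges(outline, total_pages):
    """Convert (title, start_page) entries into (title, first, last) ranges.

    Assumes 1-based page numbers; each chapter ends on the page before
    the next chapter starts, and the last one runs to the end of the file.
    """
    ordered = sorted(outline, key=lambda entry: entry[1])
    ranges = []
    for i, (title, start) in enumerate(ordered):
        if i + 1 < len(ordered):
            end = ordered[i + 1][1] - 1
        else:
            end = total_pages
        # Guard against two entries that start on the same page.
        ranges.append((title, start, max(start, end)))
    return ranges


# Made-up outline for a 20-page document.
outline = [("Intro", 1), ("Methods", 5), ("Results", 12)]
ranges = chapter_ranges(outline, total_pages=20)
print(ranges)
```

Each range could then be written out as its own file by whatever PDF library performs the split.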