Open djplaner opened 3 years ago
The question is whether we can add additional bookmarks based on headings. To do this we need to be able to read the content of a page and determine whether a bookmark should be added for that page.
That last one has a function that parses a PDF document and generates a JSON data structure breaking the PDF up and identifying headings, including the ones I'm after.
PyMuPDF also has functionality that generates ToC.
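A rough sketch of how extracted headings could be turned into bookmark rows. The helper name `headings_to_toc` and the `(level, title, page)` input tuples are assumptions for illustration, not something from the existing script. The rows use the `[level, title, page]` layout (1-based pages) that PyMuPDF's ToC functions work with, and levels are clamped so the outline never starts below level 1 or skips a level.

```python
def headings_to_toc(headings):
    """Turn (level, title, page) heading tuples into ToC rows.

    Each row is [level, title, page] with a 1-based page number.
    Levels are clamped so the first row is level 1 and no row is
    more than one level deeper than the row before it.
    """
    toc = []
    prev_level = 0
    for level, title, page in headings:
        level = max(1, min(level, prev_level + 1))
        toc.append([level, title.strip(), page])
        prev_level = level
    return toc


# Hypothetical headings: the second one skips from level 1 to level 3.
headings = [(1, "Topic 1", 1), (3, " Overview ", 2), (1, "Topic 3", 5)]
toc = headings_to_toc(headings)
print(toc)
```

With PyMuPDF, a list in this shape can be passed to `doc.set_toc(toc)`, and `doc.get_toc()` returns any existing bookmarks in the same shape, so new rows could be merged with what is already there.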
Different pages printing with different CSS.
The problem may be that Chrome has two separate ways to produce PDFs: Adobe and internal. NOPE
Topic 1 - no bookmarks
Topic 3 - bookmarks
LHS34 study guide topic 5 has a long heading. The extract headings function is returning multiple headings for it, which throws out the ToC.
Probably because long headings are spread over several lines/spans rather than just one. Currently each line/span is creating a heading.
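One possible fix, sketched under assumptions: treat consecutive heading-sized spans on the same page as a single heading. The span dicts here (`page`, `size`, `text`) are a simplified stand-in for whatever the extraction step produces, and `merge_heading_spans` is a hypothetical helper, not the existing extract headings function. It also assumes font sizes can be matched exactly.

```python
def merge_heading_spans(spans, heading_sizes):
    """Join consecutive heading spans into one heading each.

    spans: dicts with 'page', 'size' and 'text', in reading order.
    heading_sizes: font sizes that count as headings.
    """
    headings = []
    current = None  # heading currently being built, if any
    for span in spans:
        text = span["text"].strip()
        if not text:
            continue  # ignore empty spans without ending the run
        if span["size"] not in heading_sizes:
            current = None  # body text ends the current heading
            continue
        same_run = (
            current is not None
            and current["page"] == span["page"]
            and current["size"] == span["size"]
        )
        if same_run:
            current["text"] += " " + text
        else:
            current = {"page": span["page"], "size": span["size"], "text": text}
            headings.append(current)
    return headings


# Hypothetical spans: a long heading wrapped over two lines.
spans = [
    {"page": 1, "size": 23.04, "text": "Topic 5: A very long"},
    {"page": 1, "size": 23.04, "text": "heading that wraps"},
    {"page": 1, "size": 11.04, "text": "Body text."},
    {"page": 2, "size": 23.04, "text": "Topic 6"},
]
merged = merge_heading_spans(spans, {23.04})
print(merged)
```

Two genuinely separate headings of the same size with nothing between them would also be merged, so this would need checking against the real study guides.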
LHS34 assessment PDF is generated with different font sizes, which means that extract headings is having issues.
Either
Different PDF files have different font sizes. But extractHeadings works on a completed document. Luckily the font sizes are uniquely strange - lots of decimal points.
This needs to be called on each individual file to make the heading font sizes easier to distinguish. Could even include the content for the chapter.
headingFontSizes = extractChapterHeadingFontSizes( doc )
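A minimal sketch of one way to guess heading font sizes for a single file, not the actual `extractChapterHeadingFontSizes` implementation. It assumes the body font is the size covering the most characters and that headings are the few sizes larger than it. The function name, the `max_levels` parameter and the span dicts are all hypothetical.

```python
from collections import Counter


def guess_heading_font_sizes(spans, max_levels=2):
    """Return up to max_levels font sizes larger than the body size.

    spans: dicts with 'size' and 'text' from one PDF file.
    The body size is taken to be the size covering the most characters.
    """
    weights = Counter()
    for span in spans:
        weights[span["size"]] += len(span["text"])
    if not weights:
        return []
    body_size = weights.most_common(1)[0][0]
    larger = sorted((size for size in weights if size > body_size), reverse=True)
    return larger[:max_levels]


# Hypothetical spans from one chapter file.
chapter_spans = [
    {"size": 23.04, "text": "Topic 1"},
    {"size": 16.08, "text": "Overview"},
    {"size": 11.04, "text": "Body text that makes up most of the page."},
    {"size": 9.96, "text": "Footer"},
]
sizes = guess_heading_font_sizes(chapter_spans)
print(sizes)
```

Running this per file keeps each file's odd decimal sizes separate, so one chapter's body size is less likely to be mistaken for another chapter's heading size.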
PDFs are generated with edit mode on. This adds some standard messages re: hidden, review status etc. Remove these.
Produce a method to generate a single PDF of a given set of Content Interface pages
Early design
operation
In a single O365 shared folder
The Python script (or eventually other)
Process
To do