Open greatislander opened 6 years ago
I would 100% like to see this feature added, as I quite frequently encounter openly-licensed content in PDF that I would like to ingest into Pressbooks.
Agree with Jim! Ned, you might also want to know about Tabula, a similar tool used to 'liberate' data tables from PDF files (funded by the Knight and Shuttleworth foundations). It is also on GitHub at https://github.com/tabulapdf/tabula and there is more info about the project at https://tabula.technology/
Recently heard from Michelle Reed and colleagues using Pressbooks at UT-Arlington. This would be a highly desired feature for them as well.
Feature Scope
Feature Description
This PHP library, by @spatie, extracts text from PDFs: https://github.com/spatie/pdf-to-text
It could be used to build a PDF import module.
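To make the idea concrete, here is a rough sketch of what the extraction step of such a module might involve. It is not Pressbooks code: the spatie library wraps the `pdftotext` command-line tool, so this sketch calls that tool directly, and it uses Python only for illustration (a real import module would be PHP). The function names `extract_text`, `split_pages`, and `page_to_html` are hypothetical, and treating blank-line-separated blocks as paragraphs is an assumption that will not hold for every PDF.

```python
import html
import subprocess


def extract_text(pdf_path: str) -> str:
    """Run the pdftotext CLI (must be installed) and return its stdout."""
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],  # "-" writes to stdout
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8")


def split_pages(text: str) -> list[str]:
    """pdftotext separates pages with form feeds; drop empty pages."""
    return [page.strip() for page in text.split("\f") if page.strip()]


def page_to_html(page: str) -> str:
    """Assumption: blank lines separate paragraphs; rejoin wrapped lines."""
    paragraphs = [
        " ".join(block.split())
        for block in page.split("\n\n")
        if block.strip()
    ]
    return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
```

Something like `[page_to_html(p) for p in split_pages(extract_text("book.pdf"))]` would then yield one HTML fragment per page. Mapping those fragments onto chapters, and recovering headings, images, or tables, is the hard part that plain text extraction does not solve.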
Feature Use Case
There might be circumstances where a PDF is the most readily available source format for importing openly-licensed resources into a published work. This may not be useful to many users, but it seems worth discussing.
Other Notes
The PHP library is MIT-licensed.