pressbooks / ideas

Ideas for Pressbooks.
GNU General Public License v3.0

PDF import module #1

Open greatislander opened 6 years ago

greatislander commented 6 years ago

Feature Scope

Feature Description

This PHP library, by @spatie, extracts text from PDFs: https://github.com/spatie/pdf-to-text

It could be used to build a PDF import module.
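As a rough illustration of what might happen after extraction, here is a minimal sketch, not a proposed implementation. It assumes the text has already been extracted with a `pdftotext`-style tool, which separates pages with form-feed characters, and that chapters open on a page whose first line looks like "Chapter N". The function names, the "Front Matter" label, and the heading pattern are all hypothetical, and the sketch is written in Python only to keep it self-contained; a real module would live in Pressbooks' PHP codebase.

```python
import re


def split_pages(extracted_text: str) -> list[str]:
    """Split pdftotext-style output into non-empty pages on form feeds."""
    pages = [page.strip() for page in extracted_text.split("\f")]
    return [page for page in pages if page]


def pages_to_chapters(pages: list[str],
                      heading_pattern: str = r"^Chapter\s+\d+") -> list[dict]:
    """Group pages into chapters.

    A new chapter starts whenever the first line of a page matches
    heading_pattern (an assumed convention). Pages before the first
    heading are collected under a hypothetical "Front Matter" title.
    """
    heading = re.compile(heading_pattern, re.IGNORECASE)
    chapters: list[dict] = []
    for page in pages:
        first_line = page.splitlines()[0]
        if heading.match(first_line):
            chapters.append({"title": first_line, "body": page})
        elif not chapters:
            chapters.append({"title": "Front Matter", "body": page})
        else:
            # Continuation page: append to the chapter currently being built.
            chapters[-1]["body"] += "\n\n" + page
    return chapters


if __name__ == "__main__":
    sample = "Preface text\fChapter 1 Intro\nHello\fmore of one\fChapter 2 Next\nBye\f"
    for chapter in pages_to_chapters(split_pages(sample)):
        print(chapter["title"])
```

Real PDFs rarely mark chapters this cleanly, so an import module would probably need to let the user confirm or adjust the detected chapter breaks.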

Feature Use Case

There might be circumstances where a PDF is the most readily available source format for importing openly-licensed resources into a published work. This may not be useful to many users, but it seems worth discussing.

Other Notes

The PHP library is MIT-licensed.

paradisojr commented 6 years ago

I would 100% like to see this feature added, as I quite frequently encounter openly-licensed content in PDF that I would like to ingest into Pressbooks.

SteelWagstaff commented 6 years ago

Agree with Jim! Ned, you might also want to know about Tabula, a similar tool used to 'liberate' data tables from PDF files (funded by the Knight and Shuttleworth foundations). It is also on GitHub: https://github.com/tabulapdf/tabula with more info about the project at https://tabula.technology/

SteelWagstaff commented 5 years ago

Recently heard from Michelle Reed and colleagues using Pressbooks at UT-Arlington. This would be a highly desired feature for them as well.