holtchesley closed this 2 years ago
On PDF: one option I looked at for CAPNow is to have Word on a Windows box do the docx -> pdf conversion. This can be done with either a self-hosted machine or a paid API, each of which has benefits and drawbacks. It gets you a pristine PDF that perfectly matches the display in Word ... but on the other hand, it might not get you much that you can't already get from the Word version and a two-minute manual export? Not sure.
Looking at Lulu, they request a PDF for book printing, so the automation part might be helpful there.
On PDF: it would be good if our solution produces tagged PDFs, since we are starting with reasonably well-structured HTML. So far as I know, LaTeX -> PDF still can't do this; not sure about Word for Windows -> PDF.
Oh cool, I didn't know about that. Sounds like Word would work? https://support.microsoft.com/en-us/topic/create-accessible-pdfs-064625e0-56ea-4e16-ad71-3aa33bb4b7ed
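For what it's worth, here is a minimal sketch of the self-hosted "have Word do the conversion" idea, with structure tags requested. It assumes a Windows machine with Word and the pywin32 package installed; the function names `pdf_path_for` and `docx_to_tagged_pdf` are made up for illustration, and I have not checked the resulting PDFs against an accessibility checker.

```python
from pathlib import Path

# Value of Word's WdExportFormat.wdExportFormatPDF constant.
WD_EXPORT_FORMAT_PDF = 17


def pdf_path_for(docx_path):
    """Return the absolute output path: same folder and name, .pdf suffix."""
    return Path(docx_path).resolve().with_suffix(".pdf")


def docx_to_tagged_pdf(docx_path):
    """Ask a locally installed Word to export docx_path as a tagged PDF.

    Hypothetical helper: requires Windows, Word, and pywin32.
    """
    # Imported here because pywin32 is only available on Windows.
    import win32com.client

    source = Path(docx_path).resolve()
    target = pdf_path_for(source)
    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        doc = word.Documents.Open(str(source), ReadOnly=True)
        try:
            doc.ExportAsFixedFormat(
                OutputFileName=str(target),
                ExportFormat=WD_EXPORT_FORMAT_PDF,
                DocStructureTags=True,  # request structure tags in the PDF
            )
        finally:
            doc.Close(False)  # close without saving changes
    finally:
        word.Quit()
    return target
```

Calling `docx_to_tagged_pdf("casebook.docx")` on such a machine should leave a `casebook.pdf` next to the source file; the file name here is only an example.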
would be neat to consider a "standalone website" export as well
That sounds fun, both as a single HTML file (which we basically have already) with a <style> tag, maybe with any images inlined, and as a WARC/WACZ, with any referenced media also captured and bundled in (if that's allowed). Or, if that's too fancy, an assets dir like you get with Save Page As -> Website, complete.
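A rough sketch of the single-file variant, assuming we already have the exported HTML string, the CSS text, and a local directory of images. `make_standalone` and `to_data_uri` are hypothetical names, and the regex only handles double-quoted `src` attributes on `<img>` tags, so treat it as an illustration rather than a robust HTML rewriter.

```python
import base64
import mimetypes
import re
from pathlib import Path

# Matches <img ... src="..."> with a double-quoted src attribute.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)


def to_data_uri(path):
    """Encode a local file as a base64 data: URI."""
    mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def make_standalone(html, css, asset_dir):
    """Inline local images and a stylesheet into one HTML string."""

    def inline(match):
        src = match.group(2)
        # Leave remote and already-inlined images untouched.
        if src.startswith(("http:", "https:", "data:")):
            return match.group(0)
        local = Path(asset_dir) / src
        if not local.is_file():
            return match.group(0)
        return match.group(1) + to_data_uri(local) + match.group(3)

    html = IMG_SRC.sub(inline, html)
    style = f"<style>\n{css}\n</style>"
    # Put the stylesheet in <head> when there is one, else prepend it.
    if "</head>" in html:
        return html.replace("</head>", style + "\n</head>", 1)
    return style + "\n" + html
```

Remote images are deliberately left as links here; capturing those is where the WARC/WACZ route would come in.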
See also JATS, which is not for books but is useful for services like https://www.typefi.com/automated-publishing-solutions/industry-solutions/education/
This has moved into https://github.com/orgs/harvard-lil/projects/4
We've talked about possibly supporting:
PDF
as additional export formats, and it would be neat to consider a "standalone website" export as well.
How hard would it be to support these formats? Would either of these formats get us closer to supporting print-on-demand, or some other easily consumable format?