djplaner / Content-Interface-Tweak

Improves both the task of creating content for Blackboard Learn, and reading that content.
https://djplaner.github.io/Content-Interface-Tweak/
GNU General Public License v3.0

Add ability to generate PDF (print) versions of Content Interface content #24

Closed djplaner closed 3 years ago

djplaner commented 3 years ago

Explore whether Javascript libraries might provide a way to produce offline versions of Content Interface content.

Javascript - it appears the only options require either

  1. producing image-based PDFs rather than text, or
  2. having something installed on the server that runs in response to user actions.

Alternatively (or additionally), Python libraries might be combined with screen scraping.

Rationale

The Word documents can contain styling and embedded documents that don't print well. That styling is applied on the web. If we're able to use the web version to generate the PDF/DOC, the styling will be applied.

Current status

Weasyprint with Python script can produce a PDF that's close, but

djplaner commented 3 years ago

WeasyPrint

  1. Install locally - Success

    pip install Weasyprint

  2. Try to print edu8702 week1

    weasyprint URL PDFFile

    Failure on Windows with default install

  3. Try it on Linux - Success
  4. Try it with some content from ContentInterface

    Partial success. Still need to run the Javascript on it.

  5. Try it with Javascript enabled

    Failure. The Javascript is not handled by Weasyprint, at least by default.

Handling javascript

Suggestion is to use a Javascript pre-processor first.

Apparently PhantomJS is an option, but it isn't being maintained. The alternative is headless Chrome or similar, i.e. something that works on the completed DOM after the Javascript has run, e.g. Python + Selenium.
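
A minimal sketch of that idea, assuming Selenium with a Chrome driver and Weasyprint are both installed. The function names, the fixed wait and the example URL are illustrative assumptions, not the actual script, and logging in to Blackboard is not handled here.

```python
import time


def fetch_rendered_html(url, wait_seconds=5.0):
    """Return the page's HTML after its Javascript has run in headless Chrome."""
    # Imported lazily so this module loads even where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Crude fixed wait (an assumption) to give the page's scripts time to finish.
        time.sleep(wait_seconds)
        return driver.page_source
    finally:
        driver.quit()


def html_to_pdf(html, pdf_path, base_url=None):
    """Hand the already-rendered HTML string to Weasyprint."""
    from weasyprint import HTML

    # base_url lets Weasyprint resolve relative links to CSS and images.
    HTML(string=html, base_url=base_url).write_pdf(pdf_path)


def url_to_pdf(url, pdf_path, fetch=fetch_rendered_html, convert=html_to_pdf):
    """Run the browser step, then the PDF step. Both steps can be swapped out."""
    html = fetch(url)
    convert(html, pdf_path, base_url=url)
    return pdf_path


if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    url_to_pdf("https://example.com/week1.html", "week1.pdf")
```

Keeping the two steps as separate functions means the browser part could be replaced by another headless option without touching the Weasyprint call.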

This is basically working. However, by default accordions can create some issues. Remove them and all is OK.

What about expandAll? Could get Selenium to click the expandAll button.
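
One way this might look, as a sketch only. It assumes an already-open Selenium WebDriver and a CSS selector for the button. The selector `#expandAll` below is a guess and would need to be checked against the real page.

```python
# Javascript run inside the page: click every element matching the selector
# passed as the first argument and report how many were clicked.
CLICK_ALL_JS = """
const els = document.querySelectorAll(arguments[0]);
els.forEach(el => el.click());
return els.length;
"""


def click_all(driver, selector="#expandAll"):
    """Click every element matching `selector` and return how many were clicked.

    `driver` is any object with Selenium's execute_script(script, *args) method.
    The default selector is hypothetical.
    """
    return driver.execute_script(CLICK_ALL_JS, selector)
```

This would be called after the page has loaded and before reading `driver.page_source`, so that the expanded accordion content is in the HTML passed on to Weasyprint.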

How to show iframes/YouTube videos

These don't seem to be working too well.
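
One possible workaround, not something tried here, is to swap each iframe for a plain link before generating the PDF, since a video can't play on paper anyway. The sketch below is a rough regex-based version using only the standard library. The function name, the `print-embed` class and the wording of the replacement are assumptions.

```python
import re
from html import escape, unescape

# Matches a whole <iframe ...>...</iframe> element and captures its attributes.
IFRAME_RE = re.compile(r"<iframe\b([^>]*)>.*?</iframe\s*>", re.IGNORECASE | re.DOTALL)
# Matches src="..." or src='...' but not data-src or similar attributes.
SRC_RE = re.compile(r"""(?<![\w-])src\s*=\s*(["'])(.*?)\1""", re.IGNORECASE | re.DOTALL)


def replace_iframes_with_links(html):
    """Replace each iframe with a paragraph linking to the iframe's src URL.

    Iframes without a src attribute are dropped. Regexes are a crude way to
    edit HTML; a real parser would be more robust.
    """
    def _replace(match):
        src_match = SRC_RE.search(match.group(1))
        if not src_match:
            return ""
        url = escape(unescape(src_match.group(2)), quote=True)
        return f'<p class="print-embed">Embedded content: <a href="{url}">{url}</a></p>'

    return IFRAME_RE.sub(_replace, html)


if __name__ == "__main__":
    # The video URL is made up for illustration.
    sample = '<p>Watch:</p><iframe src="https://www.youtube.com/embed/abc123"></iframe>'
    print(replace_iframes_with_links(sample))
```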

djplaner commented 3 years ago

Comparison of XHTML2PDF, WeasyPrint and UnoConv

Possible options

WeasyPrint

Python - source code, documentation

It seems it can be given an HTML string.
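
A short sketch of what that could look like, assuming Weasyprint is installed. `HTML(string=...)` and `write_pdf()` are Weasyprint calls; the wrapper function, the helper names and the print stylesheet are assumptions added for illustration.

```python
from html import escape

# Assumed print styling: A4 pages with a margin. Adjust to taste.
PRINT_CSS = "@page { size: A4; margin: 2cm } body { font-family: sans-serif }"


def wrap_fragment(fragment, title="Content Interface"):
    """Wrap an HTML fragment in a minimal complete document."""
    return (
        "<!DOCTYPE html><html><head><meta charset='utf-8'>"
        f"<title>{escape(title)}</title></head>"
        f"<body>{fragment}</body></html>"
    )


def html_string_to_pdf_bytes(html, base_url=None):
    """Render an HTML string to PDF and return the PDF as bytes."""
    # Imported lazily so the helper above works without Weasyprint installed.
    from weasyprint import CSS, HTML

    # With no target given, write_pdf() returns the PDF content as bytes.
    return HTML(string=html, base_url=base_url).write_pdf(
        stylesheets=[CSS(string=PRINT_CSS)]
    )


if __name__ == "__main__":
    pdf = html_string_to_pdf_bytes(wrap_fragment("<h1>Week 1</h1>"))
    with open("week1.pdf", "wb") as handle:
        handle.write(pdf)
```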

XHTML2PDF

Perhaps not as complete.

UnoConv

Is based on using OpenOffice. Could be interesting, but heavyweight.

djplaner commented 3 years ago

To do list - immediate print versions

djplaner commented 3 years ago

Next step

Currently have an external Python script that can generate PDFs using Weasyprint.

But the question is how to link this to course sites and keep it updated?

I could implement something, but it would require ongoing work, i.e. a script running daily. But could try something with Lauren's course and see how it goes.

Plan

  1. Configure the Python script to run daily on Windows to update the Word docs in the COM12 SharePoint folder
  2. Configure an option in Content Interface to download the PDF version

Process