djplaner / Content-Interface-Tweak

Improves both the task of creating content for Blackboard Learn, and reading that content.
https://djplaner.github.io/Content-Interface-Tweak/
GNU General Public License v3.0

Add the ability to download the content (print documents) #19

Closed djplaner closed 3 years ago

djplaner commented 3 years ago

Explore whether JavaScript libraries might provide a way to produce offline versions of Content Interface.

And/or Python libraries that might be combined with screen scraping.

Rationale

The Word documents can contain styling and embedded documents that don't print well. The styles are only applied on the web. If we're able to use the web version to generate the PDF/DOC, the styling will be applied.

Current status

Weasyprint with Python script can produce a PDF that's close, but

djplaner commented 3 years ago

Comparison of XHTML2PDF, WeasyPrint and UnoConv

Possible options

WeasyPrint

Python - source code, documentation

Seems to accept an HTML string as input.
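A minimal sketch of what passing an HTML string might look like, assuming WeasyPrint is installed. `HTML(string=...).write_pdf(...)` is WeasyPrint's Python API; the helper names `build_html` and `html_string_to_pdf` are made up for illustration.

```python
from html import escape


def build_html(title, body_html):
    """Wrap an HTML fragment in a minimal standalone document (hypothetical helper)."""
    return (
        "<!DOCTYPE html><html><head><meta charset='utf-8'>"
        f"<title>{escape(title)}</title></head>"
        f"<body>{body_html}</body></html>"
    )


def html_string_to_pdf(html, pdf_path):
    """Render an HTML string to a PDF file with WeasyPrint."""
    # Imported here so build_html can be used without WeasyPrint installed.
    from weasyprint import HTML

    HTML(string=html).write_pdf(pdf_path)


if __name__ == "__main__":
    page = build_html("Week 1", "<h1>Week 1</h1><p>Some content.</p>")
    html_string_to_pdf(page, "week1.pdf")
```

Note that relative links and images in the string will only resolve if a `base_url` is also passed to `HTML(...)`.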

XHTML2PDF

Perhaps not as complete.

UnoConv

Is based on OpenOffice. Could be interesting, but heavyweight.

djplaner commented 3 years ago

WeasyPrint

  1. Install locally - Success

    pip install Weasyprint

  2. Try to print edu8702 week1

    weasyprint URL PDFFile

    Failure on Windows with default install

  3. Try it on Linux - Success
  4. Try it with some content from ContentInterface

    Partial success. The JavaScript still needs to be run on the content.

  5. Try it with Javascript enabled

    Failure. WeasyPrint does not handle the JavaScript, at least by default.

Handling javascript

The suggestion is to run a JavaScript pre-processor first.

Apparently PhantomJS is an option, but it isn't being maintained. The alternative is headless Chrome or similar, i.e. something that works on the completed DOM after the JavaScript has run, e.g. Python + Selenium.
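A rough sketch of the Python + Selenium idea, assuming Selenium, a local Chrome install and WeasyPrint are all available. The function names are hypothetical and this is not necessarily the script used here. `page_source` returns the DOM as it stands when called, so scripts that finish late may need an explicit wait first.

```python
def make_headless_chrome():
    """Start headless Chrome via Selenium (needs selenium and a Chrome install)."""
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def fetch_rendered_html(driver, url):
    """Load url and return the browser's current DOM as an HTML string."""
    driver.get(url)
    # Reflects changes made by scripts that have already run at this point.
    return driver.page_source


def rendered_page_to_pdf(url, pdf_path):
    """Render url in headless Chrome, then hand the resulting HTML to WeasyPrint."""
    from weasyprint import HTML

    driver = make_headless_chrome()
    try:
        html = fetch_rendered_html(driver, url)
    finally:
        driver.quit()
    # base_url lets WeasyPrint resolve relative stylesheets and images.
    HTML(string=html, base_url=url).write_pdf(pdf_path)
```

Splitting out `fetch_rendered_html` keeps the browser step separate from the PDF step, so either can be swapped for another tool.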

Basically working. However, by default the accordions can create some issues. With the accordions removed, everything is OK.

What about expandAll? Could get Selenium to click the expandAll button before the page is captured.
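A sketch of how Selenium might click the button, assuming a `driver` that already has the page loaded. The `#expandAll` selector is a guess for illustration only; the real id or class of the button would need to be checked in the page.

```python
# Same value as selenium.webdriver.common.by.By.CSS_SELECTOR, written out
# so this snippet does not need selenium imported to be read or tested.
CSS_SELECTOR = "css selector"


def click_expand_all(driver, selector="#expandAll"):
    """Click every element matching selector and return how many were clicked.

    The default selector is hypothetical, not taken from Content Interface.
    """
    buttons = driver.find_elements(CSS_SELECTOR, selector)
    for button in buttons:
        button.click()
    return len(buttons)
```

This would be called after `driver.get(url)` and before reading `driver.page_source`, so that the expanded accordions are part of the captured HTML.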

How to show iframes/YouTube videos

These don't seem to be working too well.
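One possible workaround, not something tried in this issue: replace each iframe with a plain link to its source before generating the PDF, so the printed version at least points to the video. The sketch below uses only the standard library. The regex is deliberately simple and assumes a quoted `src` attribute.

```python
import re

# Matches <iframe ... src="URL" ...>...</iframe> and captures the URL.
IFRAME_RE = re.compile(
    r'<iframe\b[^>]*?\bsrc=["\']([^"\']+)["\'][^>]*>.*?</iframe>',
    re.IGNORECASE | re.DOTALL,
)


def iframes_to_links(html):
    """Replace each iframe in html with a paragraph linking to its src."""

    def to_link(match):
        src = match.group(1)  # already attribute-safe, taken from the HTML
        return f'<p>Embedded content: <a href="{src}">{src}</a></p>'

    return IFRAME_RE.sub(to_link, html)
```

Iframes without a `src` attribute are left untouched by this pattern.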