Closed lizadaly closed 1 year ago
Merging #1920 (ee17b31) into develop (aa052b5) will increase coverage by
0.20%
. The diff coverage is90.05%
.
@@ Coverage Diff @@
## develop #1920 +/- ##
===========================================
+ Coverage 76.82% 77.02% +0.20%
===========================================
Files 56 58 +2
Lines 6731 6773 +42
===========================================
+ Hits 5171 5217 +46
+ Misses 1560 1556 -4
Impacted Files | Coverage Δ | |
---|---|---|
web/main/celery_tasks.py | 47.36% <ø> (ø) |
|
web/tasks.py | 23.54% <64.28%> (ø) |
|
web/main/export.py | 92.56% <92.56%> (ø) |
|
web/config/celery.py | 100.00% <100.00%> (ø) |
|
web/main/models.py | 74.01% <100.00%> (-0.96%) |
:arrow_down: |
web/main/templatetags/export_node_html.py | 100.00% <100.00%> (ø) |
|
web/main/test/functional/test_platform.py | 57.81% <100.00%> (ø) |
|
web/main/test/test_export.py | 93.25% <100.00%> (+3.42%) |
:arrow_up: |
web/main/views.py | 71.49% <100.00%> (ø) |
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
Background
For PDF export of very long books, handling annotations in the browser is prohibitively time-expensive. In earlier work I optimized this for screen display by deferring expensive transformations, but in the PDF export, all transformations need to be done up-front before the document is produced.
Luckily, we already have a server-side mechanism for introducing annotations that's known to work—the transformations that are called by the docx export pipeline that produces HTML for consumption by pandoc.
My current plan is to use the browser-friendly and accessible markup only for reading mode, and to reuse the markup generated for docx export for PDF export. I'll use PagedJS + Playwright for this because that's known to work, even if it isn't extremely fast. (I think the absolute best option for headless PDF generation in 2023 is probably https://weasyprint.org/, which unlike pandoc's PDF export is self-contained, and unlike most other open source PDF tools is still in active development.)
This actual PR
In order to support footnotes in PagedJS, the export flow will have to take an option to insert notes and links adjacent to their source rather than at the bottom of the document. But as written this export code is a little hard to work with—it's a method on a model containing an inner class containing long methods, with some complicated docstrings for testing.
Before I make any changes to it I wanted to have it in a state that was a little easier to work with, so this moves the code out to its own module and and refactors the assertions to individual pytests with smaller scopes more appropriate for unit tests. I also added type annotations to much of the code, and wrote a template tag to call the new function since it's no longer a callable method. Actual test coverage and functionality should be identical, and I verified it with a few large casebooks. I'll also test on staging once merged.