openzim / mindtouch

libretexts.org to ZIM scraper
GNU General Public License v3.0

Properly render Index pages of libretexts.org #67

Closed benoit74 closed 1 week ago

benoit74 commented 2 weeks ago

Fix #55

In-ZIM screenshot of page 164482 of query.libretexts.org (https://query.libretexts.org/Kiswahili/Anatomia_ya_Binadamu_(OERI)/zz%3A_Nyuma_jambo/10%3A_Index)

image

This time (unlike glossary pages), the data is not readily available: we need to make an extra call to a special template, `=Template%253AMindTouch%252FIDF3%252FViews%252FTag_directory`, to get the index content, and this call must be made with the ID of the book root page, not the ID of the current index page. I assumed the book root is the first `topic-category` page found when walking up the page tree.
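A minimal sketch of the two ideas above, not the scraper's actual code: walking up the page tree to the first `topic-category` ancestor, and producing the double URL-encoded template reference. The `Page` class, its fields, and both function names are hypothetical; only the template title and the `topic-category` rule come from this description.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote


@dataclass
class Page:
    """Hypothetical stand-in for a node in the library's page tree."""

    id: str
    article_type: str  # e.g. "topic-category", "topic-guide", "topic"
    parent: Optional["Page"] = None


def find_book_root(page: Page) -> Optional[Page]:
    """Return the first 'topic-category' ancestor of `page`, or None."""
    current = page.parent
    while current is not None:
        if current.article_type == "topic-category":
            return current
        current = current.parent
    return None


def double_encoded_page_ref(title: str) -> str:
    """Percent-encode a page title twice and prefix it with '='."""
    once = quote(title, safe="")
    return "=" + quote(once, safe="")


# Made-up tree: book -> chapter -> index page
book = Page(id="1", article_type="topic-category")
chapter = Page(id="2", article_type="topic-guide", parent=book)
index_page = Page(id="3", article_type="topic", parent=chapter)

root = find_book_root(index_page)
template_ref = double_encoded_page_ref("Template:MindTouch/IDF3/Views/Tag_directory")
```

Here `root.id` is the ID that would accompany the template call, and `template_ref` reproduces the encoded string quoted above. The request URL and its query parameters are deliberately left out, since they are not spelled out in this description.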

The template seems to be the same across all libretexts.org libraries (including non-English ones), so for now I do not think it deserves logic to extract it from the custom JS used on the page.

And finally, after having removed Jinja only a few days ago, I decided it was best to add it back and use it to render the glossary/index HTML. For the glossary it was feasible to live without it, but for index pages the HTML to render is just too complex, and the code is unreadable without a proper engine like Jinja.
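To illustrate why a template engine helps with nested index markup, here is a rough sketch using Jinja2. The data shape (letter groups, each holding entries with a `term` and a list of `pages`) and the template itself are assumptions made for illustration; they are not the structure returned by libretexts.org nor the template added in this PR.

```python
from jinja2 import Environment

# Hypothetical template: one heading per letter, one list item per term,
# and one link per page where the term appears.
INDEX_TEMPLATE = """\
{% for letter, entries in groups %}
<h2>{{ letter }}</h2>
<ul>
{% for entry in entries %}
  <li>{{ entry.term }}: {% for p in entry.pages %}<a href="{{ p.href }}">{{ p.label }}</a>{% if not loop.last %}, {% endif %}{% endfor %}</li>
{% endfor %}
</ul>
{% endfor %}
"""

# autoescape=True so that terms containing "&" or "<" are escaped in the HTML
env = Environment(autoescape=True, trim_blocks=True, lstrip_blocks=True)
template = env.from_string(INDEX_TEMPLATE)


def render_index(groups):
    """Render a list of (letter, entries) pairs to an HTML fragment."""
    return template.render(groups=groups)


html = render_index(
    [
        (
            "A",
            [
                {
                    "term": "Atom & ion",
                    "pages": [
                        {"href": "p1.html", "label": "Page 1"},
                        {"href": "p2.html", "label": "Page 2"},
                    ],
                }
            ],
        )
    ]
)
```

Three levels of nesting plus separator handling stay readable in the template, whereas the same output built with string concatenation would interleave loops, escaping, and markup in Python code.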

codecov[bot] commented 2 weeks ago

Codecov Report

Attention: Patch coverage is 36.00000% with 48 lines in your changes missing coverage. Please review.

Project coverage is 44.01%. Comparing base (89e0b16) to head (9ee3121). Report is 4 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| scraper/src/mindtouch2zim/client.py | 11.11% | 24 Missing :warning: |
| scraper/src/mindtouch2zim/libretexts/index.py | 53.33% | 14 Missing :warning: |
| scraper/src/mindtouch2zim/processor.py | 16.66% | 10 Missing :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main      #67      +/-   ##
==========================================
- Coverage   44.43%   44.01%   -0.42%
==========================================
  Files          14       15       +1
  Lines         871      936      +65
  Branches      116      128      +12
==========================================
+ Hits          387      412      +25
- Misses        471      511      +40
  Partials       13       13
```
