Closed benoit74 closed 1 week ago
Attention: Patch coverage is 68.51852% with 34 lines in your changes missing coverage. Please review.
Project coverage is 51.48%. Comparing base (733c35a) to head (4749161). Report is 2 commits behind head on main.
How much of the CSS Processor is based on warc2zim? Once this has matured enough, I think it should move to scraperlib.
That is indeed my plan. So far it is largely copy-and-paste, with modifications for the specificities of libretexts. The same will happen with HTML rewriting, which will be needed for images, videos, links, ... Neither feature is specific to warc2zim, and the primitives should definitely be shared in python-scraperlib; I'm sure we will need them again in other scrapers.
This is part of #8
This first step takes care of CSS stylesheets that live in external files (two of them: one for screen and one for print).
It handles fetching all assets (images, fonts) referenced in the CSS.
It handles rewriting of the CSS to fix URLs.
It does not handle inline CSS, which is also needed and will be handled in a second step.
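
To make the fetch-and-rewrite idea concrete, here is a minimal sketch of what rewriting `url(...)` references in an external stylesheet could look like. It is not the code from this PR or from warc2zim: `rewrite_css`, `local_path_for`, and the `assets/` layout are hypothetical names chosen for illustration, and the regular expression is a simplification (a real CSS parser would be more robust).

```python
import re
from urllib.parse import urljoin, urlsplit

# Matches url(...) tokens with optional single or double quotes.
CSS_URL_RE = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""", re.IGNORECASE)


def local_path_for(absolute_url: str) -> str:
    """Hypothetical mapping from a remote asset URL to a path in the archive."""
    parts = urlsplit(absolute_url)
    return f"assets/{parts.netloc}{parts.path}"


def rewrite_css(css: str, stylesheet_url: str) -> tuple[str, set[str]]:
    """Return the rewritten CSS and the set of absolute asset URLs to fetch."""
    to_fetch: set[str] = set()

    def replace(match: re.Match) -> str:
        raw = match.group(2).strip()
        # Leave inline data URIs and empty references untouched.
        if not raw or raw.startswith("data:"):
            return match.group(0)
        # Relative references are resolved against the stylesheet's own URL.
        absolute = urljoin(stylesheet_url, raw)
        to_fetch.add(absolute)
        return f'url("{local_path_for(absolute)}")'

    return CSS_URL_RE.sub(replace, css), to_fetch
```

Calling `rewrite_css(css_text, "https://example.org/css/screen.css")` returns the stylesheet with its references pointing at local paths, plus the set of images and fonts that still need to be downloaded. Inline CSS would need the same treatment applied to `style` attributes and `<style>` elements, which this sketch does not cover.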