openzim / libzim

Reference implementation of the ZIM specification
https://download.openzim.org/release/libzim/
GNU General Public License v2.0

[WIP] Introduce fuzzyRules storage and exploitation. #835

Closed: mgautierfr closed this 10 months ago

mgautierfr commented 11 months ago

Fixes #825 Fixes #826

The FuzzyRule attribute follows the regexes used by the wabac service worker (see https://github.com/openzim/warc2zim/pull/113).

This follows what has been done in the https://github.com/kiwix/libkiwix/tree/kiwix_no_sw POC. However, the API and algorithm are not fully fixed and may evolve while implementing and testing the new warc2zim (without service worker).
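For readers unfamiliar with the idea, here is a minimal sketch of how a regex-based fuzzy rule could map a requested URL onto a stored one. It is not the API introduced by this PR, nor the actual wabac rule set: the `FuzzyRule` struct layout, `applyFuzzyRules`, `exampleRules`, and the `example.com` pattern are all hypothetical names chosen for illustration, and the "first matching rule wins" policy is an assumption.

```cpp
#include <cassert>  // only needed for the usage checks described below
#include <regex>
#include <string>
#include <vector>

// Hypothetical rule shape (an assumption, not the libzim API):
// a pattern tested against the requested URL, plus a replacement
// in std::regex_replace format syntax ($1, $2, ...).
struct FuzzyRule {
    std::regex match;
    std::string replacement;
};

// Rewrite `url` with the first rule whose pattern matches.
// If no rule matches, the URL is returned unchanged.
std::string applyFuzzyRules(const std::vector<FuzzyRule>& rules,
                            const std::string& url) {
    for (const auto& rule : rules) {
        if (std::regex_search(url, rule.match)) {
            return std::regex_replace(url, rule.match, rule.replacement);
        }
    }
    return url;
}

// Illustrative rule only: keep the `v` query parameter of a video URL
// and drop every other parameter (timestamps, cache busters, ...).
std::vector<FuzzyRule> exampleRules() {
    return {
        { std::regex(R"(^(example\.com/video)\?(?:.*&)?v=([^&]+).*$)"),
          "$1?v=$2" }
    };
}
```

With these rules, `applyFuzzyRules(exampleRules(), "example.com/video?t=10&v=abc123&_=999")` yields `example.com/video?v=abc123`, while a URL that no rule matches, such as `example.com/other?x=1`, comes back unchanged. A reader would then look up the rewritten path in the archive instead of the original request.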

codecov[bot] commented 11 months ago

Codecov Report

Attention: 139 lines in your changes are missing coverage. Please review.

Comparison is base (4d80e13) 57.60% compared to head (b01be3d) 56.16%.

Additional details and impacted files

```diff
@@               Coverage Diff               @@
##           clone_entry     #835      +/-   ##
===============================================
- Coverage        57.60%   56.16%   -1.45%
===============================================
  Files               98      100       +2
  Lines             4593     4786     +193
  Branches          1924     2050     +126
===============================================
+ Hits              2646     2688      +42
- Misses             677      777     +100
- Partials          1270     1321      +51
```

| [Files](https://app.codecov.io/gh/openzim/libzim/pull/835?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openzim) | Coverage Δ | |
|---|---|---|
| [include/zim/archive.h](https://app.codecov.io/gh/openzim/libzim/pull/835?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openzim#diff-aW5jbHVkZS96aW0vYXJjaGl2ZS5o) | `94.82% <ø> (ø)` | |
| [include/zim/writer/creator.h](https://app.codecov.io/gh/openzim/libzim/pull/835?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openzim#diff-aW5jbHVkZS96aW0vd3JpdGVyL2NyZWF0b3IuaA==) | `100.00% <ø> (ø)` | |
| [src/tools.h](https://app.codecov.io/gh/openzim/libzim/pull/835?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openzim#diff-c3JjL3Rvb2xzLmg=) | `100.00% <ø> (ø)` | |
| [src/writer/creatordata.h](https://app.codecov.io/gh/openzim/libzim/pull/835?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openzim#diff-c3JjL3dyaXRlci9jcmVhdG9yZGF0YS5o) | `90.00% <ø> (ø)` | |
| [src/fileimpl.h](https://app.codecov.io/gh/openzim/libzim/pull/835?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openzim#diff-c3JjL2ZpbGVpbXBsLmg=) | `82.35% <0.00%> (-5.15%)` | :arrow_down: |
| [src/fileimpl.cpp](https://app.codecov.io/gh/openzim/libzim/pull/835?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openzim#diff-c3JjL2ZpbGVpbXBsLmNwcA==) | `47.17% <25.00%> (-0.78%)` | :arrow_down: |
| [src/writer/creator.cpp](https://app.codecov.io/gh/openzim/libzim/pull/835?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openzim#diff-c3JjL3dyaXRlci9jcmVhdG9yLmNwcA==) | `50.00% <0.00%> (-4.50%)` | :arrow_down: |
| [src/archive.cpp](https://app.codecov.io/gh/openzim/libzim/pull/835?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openzim#diff-c3JjL2FyY2hpdmUuY3Bw) | `46.64% <0.00%> (-2.13%)` | :arrow_down: |
| [src/fuzzy\_rules.h](https://app.codecov.io/gh/openzim/libzim/pull/835?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openzim#diff-c3JjL2Z1enp5X3J1bGVzLmg=) | `56.00% <56.00%> (ø)` | |
| [src/tools.cpp](https://app.codecov.io/gh/openzim/libzim/pull/835?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openzim#diff-c3JjL3Rvb2xzLmNwcA==) | `46.80% <0.00%> (-6.86%)` | :arrow_down: |
| ... and [1 more](https://app.codecov.io/gh/openzim/libzim/pull/835?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openzim) | | |


kelson42 commented 10 months ago

AFAIK we don’t need this code