google / mdbook-i18n-helpers

Translation support for mdbook. The plugins here give you a structured way to maintain a translated book.
Apache License 2.0

Keep Inline HTML tags in the translated text group while ignoring block level HTML tags #195

Closed michael-kerscher closed 1 month ago

michael-kerscher commented 2 months ago

Fixes #125 by keeping inline HTML tags in the translation text group while skipping block-level HTML tags. This is possible thanks to the upgrade of pulldown-cmark in #183.

codecov-commenter commented 1 month ago

Codecov Report

Attention: Patch coverage is 85.56701%, with 14 lines in your changes missing coverage. Please review.

Project coverage is 90.95%. Comparing base (2b14491) to head (bb97df5). Report is 2 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| i18n-helpers/src/lib.rs | 76.27% | 10 Missing and 4 partials :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #195      +/-   ##
==========================================
+ Coverage   90.27%   90.95%   +0.67%
==========================================
  Files          12       12
  Lines        3034     3085      +51
  Branches     3034     3085      +51
==========================================
+ Hits         2739     2806      +67
+ Misses        207      186      -21
- Partials       88       93       +5
```


mgeisler commented 1 month ago

Thanks @michael-kerscher for putting this up! Based on the test changes, it looks great.

@dyoo and @kdarkhan, I would love to have one of you look at this since you've both been working with this code recently.

kdarkhan commented 1 month ago

The fuzz step is failing due to some changes in Rust nightly. The failure seems to come from one of the dependencies.

Until the dependency is fixed, you can temporarily pin to a specific nightly version by changing this line to `rustup default nightly-2024-05-10`.

kdarkhan commented 1 month ago

The rules that determine whether a piece of HTML is block or inline surprised me. I naively assumed that HTML on a separate line is a block, and that it is inline HTML otherwise. Turns out I was wrong. The rules are defined in https://spec.commonmark.org/0.31.2/#html-blocks. Some tags like `p` or `div` on a single line are considered to be blocks, while `span` is inline.

Here is an example which demonstrates how they differ. Switch to the AST tab to view inlines vs. blocks.

I believe that even with the above rules, not splitting inline elements into separate groups is an improvement.
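
To make the distinction concrete, here is a rough sketch of just one of the CommonMark start conditions for HTML blocks (condition 6, the tag-name list). It is not how pulldown-cmark or this PR implements it. The function name `starts_block_html` and the `BLOCK_TAGS` constant are made up for illustration, and the tag list is a deliberately short subset of the one in the spec. The other start conditions are not handled.

```rust
/// Hypothetical, shortened subset of the tag names listed under start
/// condition 6 of the CommonMark HTML-block rules. The spec lists many more.
const BLOCK_TAGS: &[&str] = &[
    "address", "blockquote", "details", "div", "h1", "h2", "h3", "li", "ol", "p", "table", "ul",
];

/// Returns true if `line` (without its trailing newline) would start an HTML
/// block under condition 6, restricted to the subset above.
fn starts_block_html(line: &str) -> bool {
    // Up to three spaces of indentation are allowed before the `<`.
    let indent = line.len() - line.trim_start_matches(' ').len();
    if indent > 3 {
        return false;
    }
    let rest = &line[indent..];

    // The line must begin with `<` or `</`.
    let Some(rest) = rest.strip_prefix('<') else {
        return false;
    };
    let rest = rest.strip_prefix('/').unwrap_or(rest);

    // Tag names are ASCII here, so the char count equals the byte length.
    let name_len = rest.chars().take_while(|c| c.is_ascii_alphanumeric()).count();
    let (name, tail) = rest.split_at(name_len);

    // The name is matched case-insensitively against the list.
    let known = BLOCK_TAGS.iter().any(|t| t.eq_ignore_ascii_case(name));

    // The name must be followed by a space, a tab, end of line, `>`, or `/>`.
    let terminated = tail.is_empty()
        || tail.starts_with(|c: char| c == ' ' || c == '\t' || c == '>')
        || tail.starts_with("/>");

    known && terminated
}
```

Under these assumptions, `starts_block_html("<p>text</p>")` is true even though the whole element sits on one line, while `starts_block_html("<span>text</span>")` is false, so the `span` stays inside the surrounding paragraph as inline HTML.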