kiwix / libkiwix

Common code base for all Kiwix ports
https://download.kiwix.org/release/libkiwix/
GNU General Public License v3.0
112 stars 55 forks source link

Translation of error pages #1032

Closed veloman-yunkan closed 5 months ago

veloman-yunkan commented 7 months ago

Will fix #1019

codecov[bot] commented 7 months ago

Codecov Report

Attention: 77 lines in your changes are missing coverage. Please review.

Comparison is base (9c5f5c7) 38.95% compared to head (dc3960c) 39.34%.

Files Patch % Lines
src/server/response.cpp 50.44% 4 Missing and 52 partials :warning:
src/tools/otherTools.cpp 15.38% 1 Missing and 10 partials :warning:
src/server/internalServer.cpp 44.44% 0 Missing and 5 partials :warning:
src/server/request_context.cpp 70.00% 0 Missing and 3 partials :warning:
src/server/i18n.cpp 50.00% 0 Missing and 2 partials :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #1032 +/- ## ========================================== + Coverage 38.95% 39.34% +0.39% ========================================== Files 58 58 Lines 3958 4061 +103 Branches 2181 2237 +56 ========================================== + Hits 1542 1598 +56 - Misses 1086 1091 +5 - Partials 1330 1372 +42 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

veloman-yunkan commented 7 months ago

@mgautierfr This PR is still work in progress, however I would like to learn your feedback regarding the small enhancement that I made in mustache.hpp in order to reduce the amount of extra coding that I would need to do without it.

Conceptually, the backend part is complete (I only have to mark those error pages that should be dynamically-translatable and have KIWIX_RESPONSE_DATA in them). The frontend will only have to take advantage of the response parameter data included in the error pages and update the error page text accordingly.

kelson42 commented 7 months ago

@mgautierfr We need a feedback here

mgautierfr commented 7 months ago

I'm a bit puzzled with the idea to add new specific api feature to mustache. It would need that we ask packagers to use a patched mustache library and it would be probably (and legitimately) refused. We should at least propose a patch upstream (and make it accepted) before we go with this.

Why not using lambda (https://github.com/kiwix/libkiwix/issues/751#event-6459679079) ? It would make us:

Is there any traps I haven't seen ?


But I really like the separation of the response from the connection. It is a step in the direction of #740 (even if this may not be the purpose)

kelson42 commented 6 months ago

We definitly should rely on unmodified versions of external dependencies... but indeed, we can do upstream PR.

veloman-yunkan commented 6 months ago

Why not using lambda (#751 (comment)) ?

@mgautierfr

Generation of the error page content is a two step process.

There is a shared template for error pages with parameters such as the page title & heading. This is the mandatory minimum needed to build an error page. However, optionally, a list of additional detail lines can be provided.

  1. The error details are generated in C++ code by using parameterized message facilities. Currently (outside this PR) the translation of this additional info happens eagerly in the backend and the translated/instantiated string goes into step 2.
  2. The HTML of the error page is produced by instantiating a mustache template with values for the page title, heading and details text.

Consider it on an example:

    return HTTP404Response(request)
           + noSuchBookErrorMsg(bookName);

First, noSuchBookErrorMsg(bookName) is executed and instantiates a i18n message template for the given book name. Then the resulting std::string (with any explicit information about the book name erased from it) participates in another level of mustache template instantiation. The idea of this PR was to preserve all the data used to generate the response HTML and embed that data in the response so that it can be rerendered in the front-end.

This two-step nature of error response generation was missing from the HTTPErrorResponse unit tests but now I added a commit illustrating it.

When I started writing this reply I didn't at all see how lambdas could be used to address that problem. Now I can sense some solution relying on lambdas, however it seems artificial and limited (it won't handle additional levels involved in making up the final text; suppose that the bookName parameter in the example above is not a terminal string, but some text obtained as a result of translating a parameterized message: bookName = getBookNameMsg(book, name) ).

I will think further, but will appreciate if you hint at the solution that you had in mind if it really is capable of addressing the described challenge.

veloman-yunkan commented 6 months ago

I also implemented the translation of the error page in the front-end. In the worst case (if the modification to mustache.hpp is not accepted) I can switch our ParameterizedMessage from kainjow::mustache::data to a relatively small class implemented in kiwix code.

veloman-yunkan commented 6 months ago

I completed the draft version of the solution that doesn't need mustache.hpp to be patched. I will clean up the PR tomorrow.

veloman-yunkan commented 6 months ago

Clean-up is complete, yet there is still some work to be done on this PR. I will move forward once @mgautierfr approves the current solution. Note that to test it, you must use the last-but-one commit ("Demo of error page translation") - the extra commit was added to prove that this PR doesn't utterly break the CI.

veloman-yunkan commented 6 months ago

The PR is now functionally ready, however some changes may need to be made depending on the answers to the following questions:

  1. Is sending the template data with every response that may need to be translated in the frontend acceptable? That was a shortcut intended to achieve the desired functionality with the least effort and will also simplify fixing #1028.
  2. Is it OK that translation data is also embedded into error responses that are not currently expected to be displayed by the viewer (e.g. responses to invalid requests to /catalog, /raw, /suggest, etc, endpoints)?
kelson42 commented 5 months ago

@mgautierfr Review please?

mgautierfr commented 5 months ago

Is sending the template data with every response that may need to be translated in the frontend acceptable? That was a shortcut intended to achieve the desired functionality with the least effort and will also simplify fixing https://github.com/kiwix/libkiwix/issues/1028.

I think we are good on this. We may in the future introduce a endpoint error_template to get the template from an error code (and giving this error code in the response) but for now, we can go without it. Template is a not of data, and it send only for error.

Is it OK that translation data is also embedded into error responses that are not currently expected to be displayed by the viewer (e.g. responses to invalid requests to /catalog, /raw, /suggest, etc, endpoints)?

Yes also here. This double the error response size, but error response are pretty small and it is only for error, so pretty rare.

Also, the test cases against vulnerabilities in kiwix-serve seem to suggest that KIWIX_RESPONSE_DATA should be HTML-encoded too.

I totally agree.

veloman-yunkan commented 5 months ago

Also, the test cases against vulnerabilities in kiwix-serve seem to suggest that KIWIX_RESPONSE_DATA should be HTML-encoded too.

I totally agree.

@mgautierfr You better had objected in this case :smiley: . I figured out that the issue is rather an edge case and provided a simpler fix.

mgautierfr commented 5 months ago

@mgautierfr You better had objected in this case 😃 . I figured out that the issue is rather an edge case and provided a simpler fix.

I had missed that, your comment opens my eyes.


We are good (I've just rebase-fixup)

veloman-yunkan commented 5 months ago

(I've just rebase-fixup)

Thank you. Note, however, that one of the fixup commits implied (and that was stated in its commit message) that the description of the commit-to-be-amended had to be corrected too. I have now added that information as a commit comment.