openzim / gutenberg

Scraper for downloading the entire ebooks repository of project Gutenberg
https://download.kiwix.org/zim/gutenberg
GNU General Public License v3.0
130 stars 37 forks source link

Remove inline JavaScript to comply with some CSPs #145

Closed Jaifroid closed 1 year ago

Jaifroid commented 2 years ago

This is the same issue in Gutenberg that has been reported for PhET here: https://github.com/openzim/phet/issues/134 . To summarize, Content Security Policies of browser extensions forbid running inline JavaScript. Our ZIMs should avoid having JS inline, and should put required blocks in separate js files. For example, in the landing page of Gutenberg ZIMs, we have:

<body onload="init(); showBooks(); " class="pure-skin-gutenberg home">

This could easily be replaced with a much safer bit of code in the JS file (tools.js) where init() and showBooks() are defined. All that would be needed is something like:

document.querySelector('body').addEventListener('load', function() {
    init();
    showBooks();
});

For more info, see https://github.com/kiwix/kiwix-js/issues/789.

The effect of the current inline JS can be seen in our Chromium extension. Compare the left screenshot from the PWA version with the screenshot from the Chromium extension (both running in Service Worker mode, and both capable of running JS). The books are not shown in the extension because init() and showBooks() are blocked by the CSP.

image

eshellman commented 2 years ago

The base.html template fills the onload attribute this way: onload="init(); {% if show_books %} showBooks(); {% else %} populateFilters(); {% endif %} {%if bookshelf_home %} showBookshelfSearchResults(''); {%endif%} {%if bookshelf is defined%} showBookshelf({{ bookshelf }} ); {%endif%} so the show_books, bookshelf_home and bookshelf attributes would need to be exposed to the javascript somehow - perhaps using data- attributes?

rgaudin commented 2 years ago

Would you be able to submit a PR ?

What about:

https://github.com/openzim/gutenberg/blob/cb6a16c1e717d381a26c8593738ada45c98ca0fa/gutenbergtozim/templates/cover_article.html#L20

Also, it looks like there's an unused <script> in bookshelf.html.

Jaifroid commented 2 years ago

The base.html template fills the onload attribute this way: onload="init(); {% if show_books %} showBooks(); {% else %} populateFilters(); {% endif %} {%if bookshelf_home %} showBookshelfSearchResults(''); {%endif%} {%if bookshelf is defined%} showBookshelf({{ bookshelf }} ); {%endif%}

@eshellman Thanks for the quick reply. Are these templates that are filled at build time? Or are they processed client-side?

Jaifroid commented 2 years ago

@rgaudin That looks a bit like ReactJS to me, but it might be some other templating system... We realize it may not be possible to eliminate all inline scripting for ZIMs that rely heavily on custom UIs, especially those made with React. But if we can ensure a base level of accessibility for clients with restrictive CSPs, it would be really helpful.

rgaudin commented 2 years ago

I believe it's Jinja templates, rendered at build time. Fixing this onload seems doable. What about that link I mentioned above?

Jaifroid commented 2 years ago

What about that link I mentioned above?

This looks like the inline link that users can click (from any book cover) to see more books by the same author. It's also blocked by the Chrome Extension CSP, so again that could be a link added from within one of the attached JS files. Code needed would be to give that anchor an ID (let's say authorName), and then:

document.getElementById('authorName').addEventListener('click', function(e) {
    let author = e.target.innerHTML;
    if (author) goToAuthor(author);
});

Code untested of course!

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be now be reviewed manually. Thank you for your contributions.

kelson42 commented 1 year ago

With openedx an kolibri, this is the last of this kind in all our scrapers. Would be really great to have it fixes and allow KiwixJS to fully enjoy Gutenberg ZIM files.

kelson42 commented 1 year ago

This is a priority bug in case there is a doubt somewhere. Have pinned it.

Jaifroid commented 1 year ago

Just to say that. having experimentally (and successfully) ported the Kiwix JS chromium extension to the next-generation manifest v3 format,, the CSP block on unsafe-inline scripts (and unsafe-eval) continues to be highly relevant. This issue isn't going away, and I suspect it will become more generalized as security-conscious CSPs become more widespread.