kiwix / kiwix-tools

Command line Kiwix tools: kiwix-serve, kiwix-manage, ...
https://download.kiwix.org/release/kiwix-tools/
GNU General Public License v3.0
407 stars 79 forks source link

`mailto:` links are blocked on Chrome #680

Open benoit74 opened 1 month ago

benoit74 commented 1 month ago

ZIM: https://dev.library.kiwix.org/viewer#fas-military-medicine_en_2024-05/irp.fas.org/doddir/milmed/index.html Chrome: 125 OS: MacOS Sonoma 14.5

When clicking on Steve Aftergood at the bottom of the front page, we should open a mailto: link. This does not happen.

Looks like we have a security message:

image

Is there anything which can be done or is it just Chrome who is too aggressive?

Nota: opening link outside viewer, i.e. from https://dev.library.kiwix.org/content/fas-military-medicine_en_2024-05/irp.fas.org/doddir/milmed/index.html works as expected, clearly an "iframe" issue.

Jaifroid commented 1 month ago

@benoit74 see https://github.com/kiwix/libkiwix/issues/916 and https://github.com/kiwix/libkiwix/issues/943. A fix for the mailto: issue would just be an extension of the fixes for loading external content and PDFs in readers. External content violates the sandbox and CSP in all browsers, and PDFs are considered insecure by Chromium browsers. These were fixed (at least for Kiwix Serve) by checking the link clicked by the user and opening an external window if it's https or by serving a different CSP for PDFs: a variation of what we have discussed in https://github.com/openzim/warc2zim/issues/276. I have to do the same for in KJS. I know you would prefer this to be fixed in scrapers, but not all readers have the same sandboxed CSP, and to fix at scraper level would require some rewriting of mailto: links so they are redirected to a new window before being "opened". Maybe that's easy, not sure, or it could require custom JS in each scraper.

benoit74 commented 1 month ago

This time I would probably prefer to do it in readers, for the exact reason you mention ^^ The mailto: link is valid from a ZIM / HTML specification perspective and tweaking it for proper operation is a reader specific tweak (even it could probably be fairly identical in all readers). Contrary from https://github.com/openzim/warc2zim/issues/276 where the broken links are invalid, they lead to a missing ZIM entries and scraper should hence probably take care of it on its own.

Or we have to decide to amend the ZIM specification and clearly explain that mailto: links (and probably others) have to be tweaked inside the ZIM, but this has never been discussed so far.

Glad that the way to solve the issue in kiwix-server is known. Does it means the fix is already there in KJS (because AFAIK, the mailto links are working there)?

Jaifroid commented 1 month ago

@benoit74 In the browser extension filtering for mailto: is currently only working in JQuery mode (which has some limited Zimit support) -- see screenshot. This is because in that mode there is legacy code which checks whether the protocol of the article window is the same as the protocol of the clicked link, and it shows the external link dialogue box if that's the case. However, this isn't in the (mainstream) ServiceWorker code, where filters are implemented for specific cases, due to the fact that in Zimit1 the ReplayWorker handled a lot of this. In Zimit2 we no longer have ReplayWorker filtering stuff, so will have to handle edge cases like this ourselves.

image

Jaifroid commented 1 month ago

I've just thought that it might be an easy fix to add mailto: to the Kiwix Serve and KJS sandbox Content Security Policy. Worth a try at least.

UPDATE: According to my tests in the PWA, this can indeed be solved by relaxing the CSP for the content of the iframe (at least, if not for the whole app as well). That should apply in Kiwix Serve too, which uses a very similar CSP.

@kelson42 Should an issue for modifying the CSP be opened on libkiwix instead of here?

kelson42 commented 1 month ago

@Jaifroid Yes please. If this change has any other impacts. Please list them

Jaifroid commented 1 month ago

@Jaifroid Yes please. If this change has any other impacts. Please list them

@kelson42 I opened https://github.com/kiwix/libkiwix/issues/1090. It should have no other impacts, as the fix is specific to mailto:.

kelson42 commented 1 month ago

Sorry, the issue does not being much.. I though you would do a PR.

Jaifroid commented 1 month ago

Sorry, the issue does not being much.. I though you would do a PR.

Did you mean me? I could do a "blind" PR, but I don't have the ability to test it, as I can't compile Kiwix Serve. I'm also at work...

EDIT: I would if I could, e.g. if we had a way to build with a container (?), but I've done the next best thing which is to point out exactly where the CSP entry needs to be added AFAIK.

Jaifroid commented 1 month ago

One thing that occurs to me is that nowadays there are all sorts of custom protocols. Even things like zoomus: (for a Zoom meeting), or magnet: (for a Magnet link), geo:, maps:, etc. -- anyone can define such a protocol / URI. While mailto: seems like one we should natively support via the CSP, I wonder if we need a more generic way to detect that a clicked link with such a URL needs to be opened outside the iframe / webview? In non-Zimit ZIMs, this can be achieved simply by comparing the protocol of the article's window to the protocol of the link's href, and decide its external if they don't match. But with Zimit, it's a little more complicated to do this at ZIM level due to need to add support for Wombat rewriting (and Python rewriting in Zimit2 when converting the WARC).

EDIT: I'm not sure what Wombat does with such protocols. Hopefully it leaves them alone?

kelson42 commented 1 month ago

@Jaifroid We have a shortlist in zimwriterfs of schemes we shozld not rewrite. geo: and tel: are indeed interesting cases.

benoit74 commented 1 month ago

In Zimit it is quite simple now.

Any link with a protocol has not been rewritten, because result of rewriting is always a relative link because there is no need to specify of protocol once inside the ZIM, we just want to continue to use whatever has been used to load the HTML (starting with http if ZIM is served on http, https if it is served on https, but also the hostname, the base path, ... plus we could have other readers using other protocols).

So at reader level, any relative link is de facto already rewritten, and expected to point to a resource existing inside the ZIM. If corresponding item is missing in the ZIM (for any reason) it is always replaced by an absolute link with original http or https protocol used for crawling, including hostname and base path.

Finally FYI, warc2zim rewrites only relative links and http/https absolute links. All other protocols are simply ignored and kept as-is.

From my PoV (and this is true not only for Zimit but all ZIMs), all links with a protocol specified should be able to escape the sandbox since we know for sure they do not point to something inside the ZIM. Or at least they shouldn't. Not only mailto: and tel:, any absolute link.

benoit74 commented 1 month ago

We have a shortlist in zimwriterfs of schemes we should not rewrite

Why only a shortlist? I don't get it

kelson42 commented 1 month ago

@benoit74 Sounds like you want ro make a C++ PR ;)

Jaifroid commented 1 month ago

OK, thanks, both -- apart from mailto:, geo: and tel: links in Wikivoyage ZIMs (last two already dealt with, e.g. adding the World icon for geo: and telephone icon for tel:), then readers should just compare the protocols, and offer to open a new browser window/tab in the case of any other protocol, so the browser can deal with it however it wants. I'll reference this in https://github.com/kiwix/libkiwix/issues/1090.

benoit74 commented 1 month ago

Oh, of course, each reader is welcomed to handle any scheme he knows about with custom behaviors, but generally it should fallback to opening it in the browser / system and let it decide