edrlab / thorium-reader

A cross platform desktop reading app, based on the Readium Desktop toolkit
https://www.edrlab.org/software/thorium-reader/
BSD 3-Clause "New" or "Revised" License
1.75k stars 153 forks

Thorium behind a proxy : not working #1904

Closed Welsh44 closed 4 months ago

Welsh44 commented 1 year ago

Thorium v2.2.0 on a company laptop W10/W11

We use a proxy to connect to the internet.

I open an LCPL file with Thorium and the error message is:

The download of 9782100846962.epub failed: [[AssertionError [ERR_ASSERTION]: http GET error: FetchError: request to https://lcp.numilog.com/publication/70858e1c-db44-4de3-ba89-9ac37b4b5c7f failed, reason: connect EACCES 81.93.2.27:443 (undefined) [https://lcp.numilog.com:443/publication/70858e1c-db44-4de3-ba89-9ac37b4b5c7f]]]

If I open the link https://lcp.numilog.com:443/publication/70858e1c-db44-4de3-ba89-9ac37b4b5c7f in Firefox or Edge, the file is downloaded.

If I understand issue #1314 correctly, Thorium ignores proxy settings: there is no option in the UI, and the system environment variables are not used either.

Welsh44 commented 1 year ago

How can I use Thorium behind a company proxy?

danielweck commented 1 year ago

Hello, we will need to create an environment where we can reproduce the behaviour you are observing. Otherwise we won't be able to track the source of the problem down.

Welsh44 commented 1 year ago

Hello, I understand. Is there a way to get logs from Thorium while it is running?

llemeurfr commented 10 months ago

A second comment from a system administrator: one of the users has mentioned a problem with the Thorium Reader app. Whenever they try to access a book (EPUB file), they get an error mentioning that their connection failed because the websocket disconnected before a TLS connection could be established, which we presume is most likely because the application is being denied by the proxy we have in place.

In this case, the proxy uses Active Directory login information through NTLMSSP. On the user settings side, this would mean entering the proxy address, the port, and then the user's username and password.

Not sure "request" handles NTLM, a pure Microsoft protocol.
Looking for node.js ntlm implementations, I stumbled on https://github.com/SamDecrock/node-http-ntlm.

atomotic commented 10 months ago

I did some experiments on using an outgoing proxy. This is what I tried:

I started a local HTTP proxy (I tried Burp and mitmproxy, and also installed their CA certificates; be careful, this allows MITM interception).

Then I tried starting Thorium in the following ways, and nothing passed through the proxy:

1) ./Thorium --proxy-server=127.0.0.1:8080
2) https_proxy=127.0.0.1:8080 ./Thorium

However, I learned about NODE_DEBUG, which logs HTTP traffic to the console:

NODE_DEBUG=http,http2 ./Thorium

To add more confusion: MLOL has its own desktop reader app, based on Thorium code, and with that I can run

./MlolEbookReader --proxy-server=127.0.0.1:8080

This allows me to see traffic from webviews and other private APIs, but not the traffic related to LCP servers (license download, publication download, license status, etc.).

danielweck commented 9 months ago

From @NachoParra

https://github.com/edrlab/thorium-reader/issues/2049

Hi,

Yesterday I started to play around with Thorium so I can play licensed audiobooks from my public library, and it looks great as a default ebook reader app for the laptop.

I have a Calibre library on my NAS, which is served over HTTPS with COPS and Calibre-web.

I tried to add these two as catalogs, as well as the Gutenberg OPDS feed, and I always get the same error: [image]

I suspect that the problem lies with my corporate transparent MITM proxy. On my corporate laptops we have a MITM proxy that re-signs all HTTPS connections with its own corporate certificate. The corporate certificate has been added to W11 and Firefox, so there is no problem there, but it has not been added inside Thorium and its Chromium browser, so whenever I try to connect to any OPDS library, I get the error.

Is there any way to add root certificates to Thorium? If not, can we somehow tell Chromium not to validate any HTTPS certificates, or to skip validation for a given list of them?

Thanks!

NachoParra commented 9 months ago

Yeah, I was about to post it here, but because mine is more of a certificate problem (I think), I created a new ticket...

How can I see the full logs in the compiled version? Or is it better to set up a dev environment to do so?

danielweck commented 7 months ago

Possible technical solution: https://github.com/TooTallNate/proxy-agents/tree/main/packages/proxy-agent

danielweck commented 7 months ago

node-fetch is used internally in Thorium, so a custom agent could be used to pass through a transparent proxy or to handle self-signed certificates: https://github.com/node-fetch/node-fetch#custom-agent

Note that some LCP/LSD networking operations are performed from code outside of Thorium, using request instead, so we would need to refactor the network stack across the different application layers in order to factor out the common agent configuration.
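
A minimal sketch of what factoring out a common agent configuration could look like, using only Node's standard library. `createSharedAgents` and `selectAgent` are hypothetical names and not Thorium code, and the extra CA parameter is an assumption for the corporate-certificate case raised earlier in the thread. The node-fetch README documents that its `agent` option may be a function that receives the parsed URL and returns an agent, which is the shape `selectAgent` is meant to fit.

```typescript
import * as http from "node:http";
import * as https from "node:https";
import * as tls from "node:tls";

// Hypothetical shared configuration (not Thorium code): one place that
// builds the agents every HTTP client in the application would reuse.
export interface SharedAgents {
  httpAgent: http.Agent;
  httpsAgent: https.Agent;
}

export function createSharedAgents(extraCaPem?: string): SharedAgents {
  return {
    httpAgent: new http.Agent({ keepAlive: true }),
    httpsAgent: new https.Agent({
      keepAlive: true,
      // An explicit `ca` list replaces Node's default trust store, so the
      // bundled roots are kept alongside the extra certificate.
      ca: extraCaPem ? [...tls.rootCertificates, extraCaPem] : undefined,
    }),
  };
}

// Picks the agent that matches the protocol of the requested URL.
export function selectAgent(agents: SharedAgents, url: URL): http.Agent {
  return url.protocol === "http:" ? agents.httpAgent : agents.httpsAgent;
}
```

With node-fetch this could be wired in as `agent: (parsedUrl) => selectAgent(agents, parsedUrl)`. A proxy-aware agent, such as one from the proxy-agent package mentioned above, would slot into the same place.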

danielweck commented 6 months ago

Developer notes: in the latest Electron revisions (including version 29, which Thorium is now based on) there are changes and additions related to proxy management:

https://github.com/electron/electron/blob/main/docs/api/structures/proxy-config.md

https://github.com/electron/electron/blob/main/docs/api/app.md#appsetproxyconfig
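
As a rough illustration, the sketch below maps the conventional proxy environment variables onto an object with the fields documented in Electron's ProxyConfig structure (`mode`, `proxyRules`, `proxyBypassRules`). `proxyConfigFromEnv` and `ProxyConfigLike` are hypothetical names; the interface is a local copy of the documented fields so that the example compiles without the electron package. In the main process the result would be handed to the proxy API linked above, or to `session.defaultSession.setProxy(config)`. Passing NO_PROXY through unchanged is an assumption, since the two syntaxes are not guaranteed to match entry for entry.

```typescript
// Local copy of fields documented in Electron's ProxyConfig structure,
// so the sketch compiles without the electron package.
interface ProxyConfigLike {
  mode: "direct" | "auto_detect" | "pac_script" | "fixed_servers" | "system";
  proxyRules?: string;
  proxyBypassRules?: string;
}

type Env = Record<string, string | undefined>;

// Hypothetical helper: builds a proxy configuration from environment variables.
export function proxyConfigFromEnv(env: Env): ProxyConfigLike {
  const httpProxy = env.HTTP_PROXY || env.http_proxy;
  const httpsProxy = env.HTTPS_PROXY || env.https_proxy;
  const noProxy = env.NO_PROXY || env.no_proxy;

  // proxyRules uses "scheme=proxy" entries separated by semicolons.
  const rules: string[] = [];
  if (httpProxy) rules.push(`http=${httpProxy}`);
  if (httpsProxy) rules.push(`https=${httpsProxy}`);

  // No proxy variables set: defer to the operating system settings.
  if (rules.length === 0) return { mode: "system" };

  const config: ProxyConfigLike = {
    mode: "fixed_servers",
    proxyRules: rules.join(";"),
  };
  // Assumption: NO_PROXY entries are forwarded as-is as bypass rules.
  if (noProxy) config.proxyBypassRules = noProxy;
  return config;
}
```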

danielweck commented 6 months ago

Just side thoughts about "node fetch": there are well-known memory leak issues in server-side native Fetch and Undici (which do not occur in the same way in the client-side fetch web API), even in the most recent version of the official NodeJS distribution. This is related to Garbage Collection and finalizers when the responses to HTTP requests are not consumed (for example when only the headers are examined).

https://github.com/node-fetch/node-fetch/issues?q=is%3Aissue+is%3Aopen+leak

https://github.com/nodejs/undici/issues?q=is%3Aissue+is%3Aopen+leak

I still think the choice of the node-fetch NPM package is sensible in Thorium, alongside the cookie management layer. But when future versions of Electron (and therefore NodeJS) are released with improved memory / stream management in the "fetch" interface, it would be good to review technical debt and adopt the most reliable option. At this very moment in time though, it looks like HTTP agent usage in the node-fetch API is necessary to make Thorium work correctly behind proxy servers.

danielweck commented 6 months ago

The OpenAI folks are discussing migrating from node-fetch to Undici with a view to easing the transition to native NodeJS Fetch when the API becomes stable (at which point Undici could just be a lightweight shim):

https://github.com/openai/openai-node/issues/392

danielweck commented 6 months ago

Note: although global-agent does not officially support node-fetch, it apparently works fine:

https://github.com/gajus/global-agent?tab=readme-ov-file#supported-libraries

For example (this would need to handle HTTP_PROXY, HTTPS_PROXY and NO_PROXY):

import * as globalAgent from 'global-agent';
process.env["GLOBAL_AGENT_HTTP_PROXY"] = process.env.HTTP_PROXY || "http://proxy.com:1234";
globalAgent.bootstrap();
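
The handling of HTTP_PROXY, HTTPS_PROXY and NO_PROXY mentioned above could be sketched as a small resolver, shown here with the standard library only. `resolveProxy` and `isBypassed` are hypothetical helpers, and the NO_PROXY matching is a simplified subset chosen for illustration (comma-separated entries, `*` for everything, exact host or subdomain match); real tools differ in the details.

```typescript
type Env = Record<string, string | undefined>;

// True when `host` is covered by a NO_PROXY-style list. Simplified rules
// (an assumption): comma-separated entries, "*" matches everything, and an
// entry matches the exact host or any subdomain of it.
function isBypassed(host: string, noProxy: string): boolean {
  return noProxy
    .split(",")
    .map((entry) => entry.trim().toLowerCase().replace(/^\*?\./, ""))
    .filter((entry) => entry.length > 0)
    .some((entry) => entry === "*" || host === entry || host.endsWith(`.${entry}`));
}

// Hypothetical resolver: returns the proxy URL to use for `target`, or
// undefined when the request should go out directly.
export function resolveProxy(target: string, env: Env): string | undefined {
  const url = new URL(target);
  const noProxy = env.NO_PROXY || env.no_proxy || "";
  if (isBypassed(url.hostname.toLowerCase(), noProxy)) return undefined;

  const proxy =
    url.protocol === "https:"
      ? env.HTTPS_PROXY || env.https_proxy
      : env.HTTP_PROXY || env.http_proxy;
  return proxy || undefined;
}
```

The value returned for a given URL could then feed `GLOBAL_AGENT_HTTP_PROXY` as in the snippet above, or the constructor of a proxy agent.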

danielweck commented 6 months ago

Relevant search areas:

https://github.com/node-fetch/node-fetch/issues?q=is%3Aissue+is%3Aopen+proxy+

https://github.com/nodejs/undici/issues?q=is%3Aissue+is%3Aopen+proxy+

danielweck commented 4 months ago

Simple implementation: https://github.com/delvedor/hpagent

Example usage:

https://github.com/laurent22/joplin/blob/4978a473a16ecbbee7e028fda7134975ca513e1e/packages/lib/shim-init-node.ts#L52-L60

https://github.com/laurent22/joplin/blob/4978a473a16ecbbee7e028fda7134975ca513e1e/packages/lib/shim-init-node.ts#L696-L714

danielweck commented 4 months ago

Will be fixed via https://github.com/edrlab/thorium-reader/pull/2108