ankitects / anki

Anki's shared backend and web components, and the Qt frontend
https://apps.ankiweb.net

Support for system proxy settings is incomplete #1215

Closed Chaoses-Ib closed 3 years ago

Chaoses-Ib commented 3 years ago

When I set up the system proxy this way:

[image]

a network error occurs when starting Anki:

[image]

When I point the system proxy at Fiddler's proxy port (Fiddler is a debugging proxy server tool), Fiddler captures no requests from Anki. But when I change the system proxy like this (prefixing "http=" to the address):

[image]

Anki works fine and the error does not occur. This means the problem lies not with the proxy server itself but with Anki's parsing of the system settings. With this setting, Fiddler also captures Anki's requests to localhost (such as http://127.0.0.1:10512/_anki/css/webview.css).

Debug info:

Anki 2.1.44 (b2b3275f) Python 3.8.6 Qt 5.14.2 PyQt 5.14.2
Platform: Windows 10
Flags: frz=True ao=False sv=2
Add-ons, last update check: 2021-06-01 21:41:51

===Add-ons (active)===
(add-on provided name [Add-on folder, installed at, version, is config changed])

===IDs of active AnkiWeb add-ons===

(Inactive add-ons are removed for brevity)

Chaoses-Ib commented 3 years ago

Another workaround is to set the HTTPS_PROXY (or HTTP_PROXY) environment variable:

Set HTTPS_PROXY=http://127.0.0.1:1080
"C:\Program Files\Anki\anki.exe"

Anki also works fine in this situation.

Chaoses-Ib commented 3 years ago

I monitored Anki's registry operations with Process Monitor and found that Anki reads the proxy server setting (HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Internet Settings\ProxyServer) four times during startup. The first two reads came from python38.dll, the third from rsbridge.pyd, and the last from Qt5WebEngineCore.dll.

Chaoses-Ib commented 3 years ago

I searched this repository, but found only one piece of code related to reading the proxy setting in qt/aqt/__init__.py:

    # proxy configured?
    from urllib.request import getproxies, proxy_bypass

    disable_proxies = False
    try:
        if "http" in getproxies():
            # if it's not set up to bypass localhost, we'll
            # need to disable proxies in the webviews
            if not proxy_bypass("127.0.0.1"):
                disable_proxies = True
    except UnicodeDecodeError:
        # proxy_bypass can't handle unicode in hostnames; assume we need
        # to disable proxies
        disable_proxies = True

    if disable_proxies:
        print("webview proxy use disabled")
        proxy = QNetworkProxy()
        proxy.setType(QNetworkProxy.NoProxy)
        QNetworkProxy.setApplicationProxy(proxy)

Here getproxies() presumably accounts for one of the two Python reads. When I set the proxy address to "127.0.0.1", getproxies() returns {'http': 'http://127.0.0.1:1080', 'https': 'https://127.0.0.1:1080', 'ftp': 'ftp://127.0.0.1:1080'}; when I set the address to "http=127.0.0.1", it returns {'http': 'http://127.0.0.1:1080'}. So there seems to be nothing wrong with getproxies().
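
The two results above are consistent with a parser that treats a value containing "=" as a per-protocol list and otherwise applies the bare address to http, https and ftp. Below is a minimal sketch of that behaviour, assuming the port is already part of the ProxyServer string. The name `parse_proxy_server` is hypothetical, and this is a simplified model for illustration, not CPython's actual implementation.

```python
import re


def parse_proxy_server(proxy_server: str) -> dict:
    """Simplified model of mapping a ProxyServer string to a proxies dict.

    Hypothetical helper for illustration; not the real urllib code.
    """
    proxies = {}
    if "=" in proxy_server:
        # Per-protocol form, e.g. "http=127.0.0.1:1080;https=127.0.0.1:1080".
        for entry in proxy_server.split(";"):
            if "=" not in entry:
                continue  # skip empty or malformed entries
            protocol, address = entry.split("=", 1)
            # Prepend "<protocol>://" only when the address has no scheme.
            if not re.match(r"[^/:]+://", address):
                address = f"{protocol}://{address}"
            proxies[protocol] = address
    elif re.match(r"[^/:]+://", proxy_server):
        # Assumption: a bare value that already carries a scheme is kept
        # as the HTTP proxy only.
        proxies["http"] = proxy_server
    else:
        # Bare "host:port": the same address is applied to every protocol.
        for protocol in ("http", "https", "ftp"):
            proxies[protocol] = f"{protocol}://{proxy_server}"
    return proxies


print(parse_proxy_server("127.0.0.1:1080"))
print(parse_proxy_server("http=127.0.0.1:1080"))
```

Note that in this model the bare form maps "https" to an "https://" proxy URL, which differs from the reqwest code quoted later in this thread, where both protocols default to "http://".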

dae commented 3 years ago

Proxies are handled in the Rust backend, via https://github.com/seanmonstar/reqwest/blob/master/src/proxy.rs. If you believe it's doing something non-standard and can contribute a change upstream, Anki will be able to take advantage of the fix.

Chaoses-Ib commented 3 years ago

fn parse_registry_values_impl(
    registry_values: RegistryProxyValues,
) -> Result<SystemProxyMap, Box<dyn Error>> {
    let (proxy_enable, proxy_server) = registry_values;

    if proxy_enable == 0 {
        return Ok(HashMap::new());
    }

    let mut proxies = HashMap::new();
    if proxy_server.contains("=") {
        // per-protocol settings.
        for p in proxy_server.split(";") {
            let protocol_parts: Vec<&str> = p.split("=").collect();
            match protocol_parts.as_slice() {
                [protocol, address] => {
                    // If address doesn't specify an explicit protocol as protocol://address
                    // then default to HTTP
                    let address = if extract_type_prefix(*address).is_some() {
                        String::from(*address)
                    } else {
                        format!("http://{}", address)
                    };

                    insert_proxy(&mut proxies, *protocol, address);
                }
                _ => {
                    // Contains invalid protocol setting, just break the loop
                    // And make proxies to be empty.
                    proxies.clear();
                    break;
                }
            }
        }
    } else {
        if let Some(scheme) = extract_type_prefix(&proxy_server) {
            // Explicit protocol has been specified
            insert_proxy(&mut proxies, scheme, proxy_server.to_owned());
        } else {
            // No explicit protocol has been specified, default to HTTP
            insert_proxy(&mut proxies, "http", format!("http://{}", proxy_server));
            insert_proxy(&mut proxies, "https", format!("http://{}", proxy_server));
        }
    }
    Ok(proxies)
}

reqwest appears to handle both forms of ProxyServer properly. I thought the bug might be somewhere else, so I hooked RegQueryValueExW (using Cheat Engine and IDA) and modified the returned values at run time to locate it. But then I realized that the problematic call was indeed the one from rsbridge.pyd, or more precisely, from get_from_registry_impl().
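
To make it easier to trace what the Rust function quoted above should return for a given ProxyServer value, here is a rough Python transcription of its branches. The helpers `extract_type_prefix` and `insert_proxy` are not shown in the quoted code, so their behaviour here is an assumption: the first is modelled as "return the scheme before ://", and the second as a plain dictionary assignment with no validation.

```python
from typing import Dict, Optional


def extract_type_prefix(address: str) -> Optional[str]:
    """Return the scheme before "://", or None (assumed helper behaviour)."""
    index = address.find("://")
    if index <= 0:
        return None
    prefix = address[:index]
    # A ":" or "/" inside the prefix means it is not a plain scheme.
    if ":" in prefix or "/" in prefix:
        return None
    return prefix


def parse_registry_values(proxy_enable: int, proxy_server: str) -> Dict[str, str]:
    """Python transcription of the branches in parse_registry_values_impl."""
    if proxy_enable == 0:
        return {}

    proxies: Dict[str, str] = {}
    if "=" in proxy_server:
        # Per-protocol settings, separated by ";".
        for part in proxy_server.split(";"):
            pieces = part.split("=")
            if len(pieces) != 2:
                # Invalid entry: the Rust code clears the map and stops.
                return {}
            protocol, address = pieces
            # Default to an http:// proxy URL when no scheme is given.
            if extract_type_prefix(address) is None:
                address = f"http://{address}"
            proxies[protocol] = address  # stands in for insert_proxy
    else:
        scheme = extract_type_prefix(proxy_server)
        if scheme is not None:
            # Explicit scheme: register the proxy for that scheme only.
            proxies[scheme] = proxy_server
        else:
            # Bare address: use it as an http:// proxy for both protocols.
            proxies["http"] = f"http://{proxy_server}"
            proxies["https"] = f"http://{proxy_server}"
    return proxies


for value in ("127.0.0.1:1080", "http=127.0.0.1:1080", "https=127.0.0.1:1080"):
    print(value, "->", parse_registry_values(1, value))
```

Under this reading, a bare address and an "https=" entry without a scheme would both end up with an "http://" proxy URL.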

To figure out what was going on, I tested various scenarios (T = works, F = fails):

| ProxyServer | Anki | Edge (HTTP) | Edge (HTTPS) |
| --- | --- | --- | --- |
| 127.0.0.1:1080 | F (EOF) | T | T |
| http://127.0.0.1:1080 | T | T | T |
| https://127.0.0.1:1080 | F (dns error) | F (err proxy) | F (err proxy) |
| http=127.0.0.1:1080 | T | T | F (timed out) |
| https=127.0.0.1:1080 | F (EOF) | F (connection reset) | T |
| https=https://127.0.0.1:1080 | F (EOF) | F (connection reset) | F (err proxy) |
| https=http://127.0.0.1:1080 | T | F (connection reset) | T |
| http=127.0.0.1:1080;https=127.0.0.1:1080 | F (EOF) | T | T |
| http=http://127.0.0.1:1080;https=127.0.0.1:1080 | F (EOF) | T | T |
| http=127.0.0.1:1080;https=http://127.0.0.1:1080 | T | T | T |

| HTTP_PROXY | HTTPS_PROXY | Anki |
| --- | --- | --- |
| 127.0.0.1:1080 | / | F (EOF) |
| http://127.0.0.1:1080 | / | T |
| / | 127.0.0.1:1080 | F (EOF) |
| / | http://127.0.0.1:1080 | T |
| 127.0.0.1:1080 | 127.0.0.1:1080 | F (EOF) |
| http://127.0.0.1:1080 | 127.0.0.1:1080 | T |
| 127.0.0.1:1080 | http://127.0.0.1:1080 | T |

For HTTP_PROXY and HTTPS_PROXY this is fine, because the manual says you need to add the "http://" prefix. For ProxyServer, however, the results are very strange: Anki works with the "http=" form without a scheme, but not with a bare address or with the "https=" form without a scheme. reqwest's code suggests that this should not be the case. I have no experience with Rust development, so I can't be sure whether this is a bug in reqwest.

The Windows settings panel presents ProxyServer as "Address" and "Port" fields, so Anki should support at least the bare address form for ordinary use. Although there are several workarounds for getting the proxy to work, it would be better to solve the problem directly.

dae commented 3 years ago

Thank you for your detailed investigation. https://github.com/seanmonstar/reqwest/commit/7595dcb3f749268e1c332e40d3f4fa72c2dba3fe was added about 7 months ago; Anki 2.1.44 is using an older release of reqwest that was made before that change was introduced. Please try the latest Anki alpha version, which is using the latest reqwest.

https://betas.ankiweb.net/

Chaoses-Ib commented 3 years ago

I had only checked the main branch's rslib/Cargo.toml and saw that it uses the latest version of Anki's reqwest fork, without realizing that the stable branch still uses the Cargo.toml from March 1...

Anyway, the Alpha version works fine and I'm going to close this issue. Thanks for your help.

dae commented 3 years ago

Thank you for alerting me to the fact that this wasn't working properly! If I see any more reports in the near future, I'll recommend people try the alpha.

weeshinwang commented 3 years ago

@dae Thanks! The latest beta works for me. Previously I resolved this issue by setting the environment variable HTTP_PROXY = http://ip:port, as suggested by @Chaoses-Ib. Now the AnkiWeb service works well without setting the environment variable.