ooni / probe

OONI Probe network measurement tool for detecting internet censorship
https://ooni.org/install
BSD 3-Clause "New" or "Revised" License

Generic time out is marked as http-failure #2457

Open arky opened 1 year ago

arky commented 1 year ago

Describe the bug

HTTP connectivity issues cause a false positive: a generic timeout is reported as blocking=http-failure.

https://explorer.ooni.org/m/20230419090111.600304_KH_webconnectivity_5796fe766a5e6460

To Reproduce

$ ./miniooni web_connectivity@v0.5 -i https://www.reddit.com
[      0.000024] <info> Current time: 2023-04-19 16:03:33 +07
[      0.000049] <info> miniooni home directory: $HOME/.miniooni
[      0.000155] <info> Looking up OONI backends; please be patient...
[      0.288520] <info> sessionresolver: https://cloudflare-dns.com/dns-query... ok
[      1.299407] <info> session: using probe services: {Address:https://api.ooni.io Type:https Front:}
[      1.299446] <info> Looking up your location; please be patient...
[      1.299518] <info> iplookup: using stun_google
[      2.001150] <info> - country: KH
[      2.001189] <info> - network: COGETEL Co., Ltd (AS23673)
[      2.001201] <info> - resolver's IP: 103.22.200.5
[      2.001212] <info> - resolver's network: Cloudflare, Inc. (AS13335)
[      2.001334] <info> [1/1] running with input: https://www.reddit.com
[      2.061450] <info> [#1] lookup www.reddit.com using 8.8.4.4:53... ok
[      2.113012] <info> [#3] lookup www.reddit.com using system... ok
[      2.225573] <info> [#2] lookup www.reddit.com using https://mozilla.cloudflare-dns.com/dns-query... ok
[      2.347183] <info> DNS whoami for 8.8.4.4:53/udp resolver: [{Address:172.217.43.129}]
[      2.568020] <info> DNS whoami for system resolver: [{Address:103.22.200.5}]
[      2.568176] <info> using resolved addrs: [{Addr:151.101.193.140 Flags:7} {Addr:151.101.1.140 Flags:7} {Addr:151.101.129.140 Flags:7} {Addr:151.101.65.140 Flags:7}]
[      2.568237] <info> prioritySelector: create with [{Addr:151.101.193.140 Flags:7} {Addr:151.101.1.140 Flags:7} {Addr:151.101.129.140 Flags:7} {Addr:151.101.65.140 Flags:7}]
[      2.630139] <info> prioritySelector: conn 151.101.65.140:443: granted permission: true
[      2.640563] <info> prioritySelector: conn 151.101.1.140:443: denied permission: timed out sending
[      2.640601] <info> [#5] GET https://www.reddit.com using 151.101.1.140:443... stop after TLS handshake
[      2.641701] <info> prioritySelector: conn 151.101.193.140:443: denied permission: timed out sending
[      2.641729] <info> [#4] GET https://www.reddit.com using 151.101.193.140:443... stop after TLS handshake
[      2.652823] <info> prioritySelector: conn 151.101.129.140:443: denied permission: timed out sending
[      2.652836] <info> [#6] GET https://www.reddit.com using 151.101.129.140:443... stop after TLS handshake
[      2.870632] <info> sessionresolver: https://cloudflare-dns.com/dns-query... ok
[      3.069186] <info> control for https://www.reddit.com using https://2.th.ooni.org... in progress
[      3.069201] <info> [#7] GET https://www.reddit.com using 151.101.65.140:443... in progress
[     12.630702] <info> [#7] GET https://www.reddit.com using 151.101.65.140:443... generic_timeout_error
[     13.679738] <info> control for https://www.reddit.com using https://2.th.ooni.org... ok
[     13.679795] <info> additional addrs discovered by the TH: []
[     13.679896] <info> DNSConsistency: consistent
[     13.679919] <warn> HTTP: unexpected failure generic_timeout_error for 151.101.65.140:443 (see #7)
[     13.679941] <warn> ANOMALY: flags=8, accessible=false, blocking=http-failure
[     13.684190] <info> submitting measurement to OONI collector; please be patient...
[     13.885657] <info> New reportID: 20230419T090346Z_webconnectivity_KH_23673_n1_vzwJEfhl6FojEjON
[     14.486504] <info> saving measurement to disk
[     14.487948] <info> experiment: recv   0.00  byte, sent   0.00  byte
[     14.488474] <info> sessionresolver: [{"URL":"https://cloudflare-dns.com/dns-query","Score":1},{"URL":"http3://cloudflare-dns.com/dns-query","Score":0.9999000010000001},{"URL":"http3://mozilla.cloudflare-dns.com/dns-query","Score":0.9999000010000001},{"URL":"https://dns.google/dns-query","Score":0},{"URL":"https://dns.quad9.net/dns-query","Score":0},{"URL":"http3://dns.google/dns-query","Score":0},{"URL":"https://mozilla.cloudflare-dns.com/dns-query","Score":0},{"URL":"system:///","Score":0}]
[     14.488649] <info> whole session: recv   7.24 kbyte, sent  44.50 kbyte

bassosimone commented 9 months ago

I started working on a patch for this. The underlying conceptual problem is that Web Connectivity does not flag TLS failures as "tls" but instead uses "http-failure" for TLS as well. So, the fix is to introduce the "tls" blocking value for Web Connectivity, process it in the pipeline, and handle it correctly inside OONI Explorer. I already wrote the first two patches (see above PRs) and asked for some clarifications regarding how to properly do this for Explorer.
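
To illustrate the distinction described above, here is a minimal sketch that maps the stage at which a measurement failed to a blocking value. The stage names and the `classify_blocking` helper are hypothetical and are not taken from the OONI code base; the only point is that a failure during the TLS handshake gets its own "tls" value instead of being folded into "http-failure".

```python
from typing import Optional

# Hypothetical stage names used only in this sketch (not OONI identifiers).
# The values on the right are blocking strings; "tls" is the new value
# proposed in this issue.
STAGE_TO_BLOCKING = {
    "dns": "dns",
    "tcp": "tcp_ip",
    "tls": "tls",
    "http": "http-failure",
}


def classify_blocking(failed_stage: Optional[str]) -> Optional[str]:
    """Return the blocking value for the stage that failed.

    Returns None when no stage failed. Raises KeyError for a stage
    name that this sketch does not know about.
    """
    if failed_stage is None:
        return None
    return STAGE_TO_BLOCKING[failed_stage]


if __name__ == "__main__":
    # A failure during the TLS handshake is reported as "tls" ...
    print(classify_blocking("tls"))
    # ... while a failure during the HTTP round trip stays "http-failure".
    print(classify_blocking("http"))
```

Under the behavior reported in this issue, both of these cases would end up as "http-failure"; the sketch only shows the intended separation.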

arky commented 9 months ago

@bassosimone Thank you!

bassosimone commented 9 months ago

Moving forward with this issue without changing the pipeline to automatically flag "tls" in case of TLS failure is tricky because v0.4 measurements would report "http-failure" and v0.5 measurements would report "tls". In turn, this flapping between two values would confuse people. In trainings, we typically suggest that results that oscillate between two values are to be investigated further.
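
The flapping concern can be sketched as a normalization step that a pipeline might apply so that v0.4 and v0.5 measurements of the same failure yield the same value. This is not the fast path pipeline's code. The `normalized_blocking` helper is hypothetical, the rule it applies (treat "http-failure" as "tls" when every recorded TLS handshake failed) is a deliberate simplification, and the `test_keys.blocking` and `test_keys.tls_handshakes[].failure` field names are assumed to follow the OONI measurement format.

```python
from typing import Any, Dict


def normalized_blocking(measurement: Dict[str, Any]) -> Any:
    """Return a blocking value that does not depend on the probe version.

    Assumption for this sketch: if the probe reported "http-failure"
    and every recorded TLS handshake failed, the failure is treated
    as "tls". Any other value is returned unchanged.
    """
    test_keys = measurement.get("test_keys") or {}
    blocking = test_keys.get("blocking")
    handshakes = test_keys.get("tls_handshakes") or []
    all_tls_failed = bool(handshakes) and all(
        h.get("failure") is not None for h in handshakes
    )
    if blocking == "http-failure" and all_tls_failed:
        return "tls"
    return blocking


if __name__ == "__main__":
    # Illustrative, made-up measurements (not real OONI data).
    v04 = {
        "test_version": "0.4.2",
        "test_keys": {
            "blocking": "http-failure",
            "tls_handshakes": [{"failure": "generic_timeout_error"}],
        },
    }
    v05 = {
        "test_version": "0.5.0",
        "test_keys": {
            "blocking": "tls",
            "tls_handshakes": [{"failure": "generic_timeout_error"}],
        },
    }
    # Both versions normalize to the same value, so results do not flap.
    print(normalized_blocking(v04), normalized_blocking(v05))
```

Without a step of this kind, the same underlying failure would show up as "http-failure" from v0.4 probes and "tls" from v0.5 probes.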

Writing a more complex patch for the existing fast path pipeline therefore seems the correct way to fix the issue comprehensively and avoid flapping. However, we are also racing with ooni/data here, and ooni/data may land before I write and test a patch for the fast path pipeline to do this. (Which would be intrinsically difficult, given that the fast path pipeline has not been designed to do this, so I would need to write specific code and perform lots of testing.)

For this reason, I am removing this issue from my sprint, since it's not possible for me to complete it before the end of this two-week sprint (in 2.5 days), and I am moving it back to the backlog.