ooni / probe

OONI Probe network measurement tool for detecting internet censorship
https://ooni.org/install
BSD 3-Clause "New" or "Revised" License
749 stars 142 forks source link

QA: LTE "orig" HTTP failure code broken with redirects #2641

Closed bassosimone closed 4 months ago

bassosimone commented 7 months ago

With LTE "orig" analysis engine, if there's a DNS, TCP or TLS failure during a redirect, and the previous request was HTTPS, we end up saying the website is accessible. This occurs because we assume there's always a request, while LTE code only produces a request when it has a viable connections to use. I considered adding a request earlier, however, doing this would break the composability of the code primitives and would hamper switching to a DSL.

This means that a future LTE implementation using the DSL would need to jump through hoops to produce the same measurement JSON as before, or would need to break the emitted JSON compared to the current LTE. I chose to avoid doing this, in the interest of trying to keep future engineering effort under check.

It's also worth noting that v0.4 always produces an HTTP request. However, I think it's justified for LTE to break this convention for three reasons:

  1. v0.4 produces the HTTP request in all cases because it creates HTTP request first and then tries TCP, etc. while in LTE the order of operation is inverted and we only create the HTTP request at the end
  2. ooni/data is able to reconstruct the redirect chains and reach a conclusion nonetheless
  3. ooni/backend does not seem to care

Ultimately, because adding a request would have caused future engineering stress, I ended up writing the minipipeline and introduced the depth and fetch_body tags in LTE. Using the minipipeline allows us to keep the code composable while it also allows us to correctly reconstruct the redirect chains.

I opened this issue to document the bug proper and to explain my designing reasons. I also needed an issue number to reference inside the codebase of the "orig" analysis code, to be sure I remember about this bug.

bassosimone commented 4 months ago

Because I removed the "orig" engine, this issue is no longer relevant.