With LTE "orig" analysis engine, if there's a DNS, TCP or TLS failure during a redirect, and the previous request was HTTPS, we end up saying the website is accessible. This occurs because we assume there's always a request, while LTE code only produces a request when it has a viable connections to use. I considered adding a request earlier, however, doing this would break the composability of the code primitives and would hamper switching to a DSL.
This means that a future LTE implementation using the DSL would need to jump through hoops to produce the same measurement JSON as before, or would need to break the emitted JSON compared to the current LTE. I chose to avoid doing this, in the interest of trying to keep future engineering effort under check.
It's also worth noting that v0.4 always produces an HTTP request. However, I think it's justified for LTE to break this convention for three reasons:
v0.4 produces the HTTP request in all cases because it creates HTTP request first and then tries TCP, etc. while in LTE the order of operation is inverted and we only create the HTTP request at the end
ooni/data is able to reconstruct the redirect chains and reach a conclusion nonetheless
ooni/backend does not seem to care
Ultimately, because adding a request would have caused future engineering stress, I ended up writing the minipipeline and introduced the depth and fetch_body tags in LTE. Using the minipipeline allows us to keep the code composable while it also allows us to correctly reconstruct the redirect chains.
I opened this issue to document the bug proper and to explain my designing reasons. I also needed an issue number to reference inside the codebase of the "orig" analysis code, to be sure I remember about this bug.
With LTE "orig" analysis engine, if there's a DNS, TCP or TLS failure during a redirect, and the previous request was HTTPS, we end up saying the website is accessible. This occurs because we assume there's always a request, while LTE code only produces a request when it has a viable connections to use. I considered adding a request earlier, however, doing this would break the composability of the code primitives and would hamper switching to a DSL.
This means that a future LTE implementation using the DSL would need to jump through hoops to produce the same measurement JSON as before, or would need to break the emitted JSON compared to the current LTE. I chose to avoid doing this, in the interest of trying to keep future engineering effort under check.
It's also worth noting that v0.4 always produces an HTTP request. However, I think it's justified for LTE to break this convention for three reasons:
Ultimately, because adding a request would have caused future engineering stress, I ended up writing the minipipeline and introduced the depth and fetch_body tags in LTE. Using the minipipeline allows us to keep the code composable while it also allows us to correctly reconstruct the redirect chains.
I opened this issue to document the bug proper and to explain my designing reasons. I also needed an issue number to reference inside the codebase of the "orig" analysis code, to be sure I remember about this bug.