ooni / data

OONI Data CLI and Pipeline v5
https://docs.ooni.org/data

Fix bug in reading TLS network_events #34

Closed hellais closed 2 months ago

hellais commented 1 year ago

Web Connectivity 0.5 changed how network_events are sorted. There is no longer any guarantee that all events belonging to the same transaction are adjacent. find_tls_handshake_network_events (https://github.com/ooni/data/blob/main/oonidata/transforms/nettests/measurement_transformer.py#L294) relied on that assumption when reconstructing the network events for a particular TLS handshake, so the assumption is now wrong.

The events do, however, carry a transaction_id, which is a much cleaner way of working out which network events belong to a particular transaction.

We should refactor find_tls_handshake_network_events to prefer the transaction_id-based lookup and fall back to the older method only when transaction_id is missing.
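A minimal sketch of that lookup order, not the actual oonidata code. It assumes network events are plain dicts with `operation`, `t` and (optionally) `transaction_id` keys. The names `find_tls_events` and `legacy_contiguous_lookup` are hypothetical, and the fallback here is only a simplified stand-in for the older adjacency-based method:

```python
from typing import Any, Dict, List, Optional

# Assumption: each network event is a plain dict, as in the raw measurement JSON.
NetworkEvent = Dict[str, Any]


def legacy_contiguous_lookup(events: List[NetworkEvent]) -> List[NetworkEvent]:
    """Hypothetical stand-in for the older method.

    Collects the adjacent run of events from the first tls_handshake_start
    up to and including the next tls_handshake_done.
    """
    collected: List[NetworkEvent] = []
    in_handshake = False
    for ev in events:
        op = ev.get("operation")
        if op == "tls_handshake_start":
            in_handshake = True
        if in_handshake:
            collected.append(ev)
            if op == "tls_handshake_done":
                break
    return collected


def find_tls_events(
    events: List[NetworkEvent], transaction_id: Optional[int] = None
) -> List[NetworkEvent]:
    """Prefer the transaction_id lookup; fall back when it cannot be used."""
    if transaction_id is not None:
        matched = [
            ev for ev in events if ev.get("transaction_id") == transaction_id
        ]
        if matched:
            # Events of one transaction may be interleaved with others,
            # so order them by their relative timestamp.
            return sorted(matched, key=lambda ev: ev.get("t", 0.0))
    # transaction_id is missing (older measurements): use the legacy path.
    return legacy_contiguous_lookup(events)
```

With this shape, interleaved events from other transactions no longer end up in the handshake's event list, while measurements without transaction_id keep their current behaviour.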

Here are some sample measurements to use as test cases.

Web Connectivity 0.5: https://explorer.ooni.org/m/20230817084207.519697_JO_webconnectivity_b62797f5aef7a7d6

This measurement currently leads to the last operation being erroneously flagged as write_2, when it should be write_1.

Sample measurement without transaction_id: https://explorer.ooni.org/m/20230901130157.590570_US_webconnectivity_b5534b1c5b446826