ooni / explorer

OONI Explorer: uncover evidence of internet censorship worldwide
https://explorer.ooni.org
BSD 3-Clause "New" or "Revised" License

MAT shows more measurements than Search tool for the same day and country #834

Open sloncocs opened 1 year ago

sloncocs commented 1 year ago

Expected Behavior

MAT should show the same measurements as the Search tool when the same filters are applied.

Actual Behavior

MAT shows two measurements for 'www.bbc.com' collected in Chad on 7 May 2018, whereas Search shows only one measurement for 'www.bbc.com' for the same day and country.

[Screenshots: 2023-01-24 at 10:54:40 and 2023-01-24 at 10:55:02]

hellais commented 1 year ago

It looks like there are duplicate entries in the database: both rows share the same measurement_uid and differ only in their timestamps (test_start_time and measurement_start_time are each off by 2h):

┌─measurement_uid──────────────────────────┬─report_id────────────────────────────────────────────────────────────────────┬─input───────────────────┬─probe_cc─┬─probe_asn─┬─test_name────────┬─────test_start_time─┬─measurement_start_time─┬─filename─┬─scores────────────────────────────────────────────────────────────────────────────────────────────────────────┬─platform─┬─anomaly─┬─confirmed─┬─msm_failure─┬─domain──────┬─software_name─────┬─software_version─┬─control_failure─┬─blocking_general─┬─is_ssl_expected─┬─page_len─┬─page_len_ratio─┬─server_cc─┬─server_asn─┬─server_as_name─┐
│ 0120180507eca3e781063a3c61cfde97aa5e62da │ 20180507T102820Z_AS327802_c4232xOzyQUDCzLMG3Cu8QRn9Jx2m39jxLtnJyIFVCIltwVSar │ http://www.bbc.com/news │ TD       │    327802 │ web_connectivity │ 2018-05-07 08:28:17 │    2018-05-07 08:29:17 │          │ {"blocking_general":0.0,"blocking_global":0.0,"blocking_country":0.0,"blocking_isp":0.0,"blocking_local":0.0} │          │ f       │ f         │ f           │ www.bbc.com │ ooniprobe-android │ 1.3.0            │                 │                0 │               0 │        0 │              0 │           │          0 │                │
└──────────────────────────────────────────┴──────────────────────────────────────────────────────────────────────────────┴─────────────────────────┴──────────┴───────────┴──────────────────┴─────────────────────┴────────────────────────┴──────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────┴──────────┴─────────┴───────────┴─────────────┴─────────────┴───────────────────┴──────────────────┴─────────────────┴──────────────────┴─────────────────┴──────────┴────────────────┴───────────┴────────────┴────────────────┘
┌─measurement_uid──────────────────────────┬─report_id────────────────────────────────────────────────────────────────────┬─input───────────────────┬─probe_cc─┬─probe_asn─┬─test_name────────┬─────test_start_time─┬─measurement_start_time─┬─filename─┬─scores────────────────────────────────────────────────────────────────────────────────────────────────────────┬─platform─┬─anomaly─┬─confirmed─┬─msm_failure─┬─domain──────┬─software_name─────┬─software_version─┬─control_failure─┬─blocking_general─┬─is_ssl_expected─┬─page_len─┬─page_len_ratio─┬─server_cc─┬─server_asn─┬─server_as_name─┐
│ 0120180507eca3e781063a3c61cfde97aa5e62da │ 20180507T102820Z_AS327802_c4232xOzyQUDCzLMG3Cu8QRn9Jx2m39jxLtnJyIFVCIltwVSar │ http://www.bbc.com/news │ TD       │    327802 │ web_connectivity │ 2018-05-07 10:28:17 │    2018-05-07 10:29:17 │          │ {"blocking_general":0.0,"blocking_global":0.0,"blocking_country":0.0,"blocking_isp":0.0,"blocking_local":0.0} │          │ f       │ f         │ f           │ www.bbc.com │ ooniprobe-android │ 1.3.0            │                 │                0 │               0 │        0 │              0 │           │          0 │                │
└──────────────────────────────────────────┴──────────────────────────────────────────────────────────────────────────────┴─────────────────────────┴──────────┴───────────┴──────────────────┴─────────────────────┴────────────────────────┴──────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────┴──────────┴─────────┴───────────┴─────────────┴─────────────┴───────────────────┴──────────────────┴─────────────────┴──────────────────┴─────────────────┴──────────┴────────────────┴───────────┴────────────┴────────────────┘
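
As a rough illustration of the duplication shown in the query output, the following sketch groups rows by measurement_uid and reports any uid that appears with more than one measurement_start_time. The `rows` list is copied by hand from the two result rows (relevant columns only), and the helper name `find_duplicate_uids` is hypothetical; it is not part of the OONI pipeline or API.

```python
from collections import defaultdict

# Two rows copied from the query output (only the columns needed here).
rows = [
    {
        "measurement_uid": "0120180507eca3e781063a3c61cfde97aa5e62da",
        "input": "http://www.bbc.com/news",
        "probe_cc": "TD",
        "measurement_start_time": "2018-05-07 08:29:17",
    },
    {
        "measurement_uid": "0120180507eca3e781063a3c61cfde97aa5e62da",
        "input": "http://www.bbc.com/news",
        "probe_cc": "TD",
        "measurement_start_time": "2018-05-07 10:29:17",
    },
]


def find_duplicate_uids(rows):
    """Return {measurement_uid: sorted start times} for uids seen more than once."""
    start_times_by_uid = defaultdict(list)
    for row in rows:
        start_times_by_uid[row["measurement_uid"]].append(
            row["measurement_start_time"]
        )
    return {
        uid: sorted(times)
        for uid, times in start_times_by_uid.items()
        if len(times) > 1
    }


duplicates = find_duplicate_uids(rows)
for uid, times in duplicates.items():
    print(uid, times)
```

A tool that counts rows would report two measurements here, while a tool that keys on measurement_uid would report one, which is one possible explanation for the mismatch between MAT and Search.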

I suspect this is an artefact of an issue we had in the past, where measurement_start_time was interpreted in the pipeline's local timezone rather than in UTC. (I can't find that issue right now; if you find it, please link it.)
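
To make the suspected mechanism concrete, here is a minimal sketch of how a 2h shift can arise when a UTC timestamp string is parsed as local time and then converted to UTC. The fixed UTC+2 offset in `PIPELINE_TZ` is an assumption chosen to match the observed difference; the actual pipeline timezone and parsing code are not known from this thread, and both function names are hypothetical. The `Z` suffix in the report_id prefix (20180507T102820Z) suggests the 10:xx values are the ones in UTC.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the pipeline host ran in a zone that was UTC+2 at the time.
PIPELINE_TZ = timezone(timedelta(hours=2))
FMT = "%Y-%m-%d %H:%M:%S"


def parse_as_utc(raw: str) -> datetime:
    """Intended behaviour: treat the reported timestamp as UTC."""
    return datetime.strptime(raw, FMT).replace(tzinfo=timezone.utc)


def parse_as_local_then_utc(raw: str) -> datetime:
    """Suspected bug: treat the string as pipeline-local time, then convert to UTC."""
    local = datetime.strptime(raw, FMT).replace(tzinfo=PIPELINE_TZ)
    return local.astimezone(timezone.utc)


raw = "2018-05-07 10:29:17"
good = parse_as_utc(raw)
bad = parse_as_local_then_utc(raw)

print(good.strftime(FMT))  # 2018-05-07 10:29:17
print(bad.strftime(FMT))   # 2018-05-07 08:29:17
print(good - bad)          # 2:00:00
```

If the same measurement was ingested once by each code path, the result would be two rows with identical measurement_uid and timestamps 2h apart, as seen in the table.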