Open FedericoCeratto opened 4 years ago
Status update: a prototype built on CatBoost fetches data from ClickHouse. It learns to predict the value of the "status" column from the following columns: "report_id, input, probe_cc, probe_asn, test_name, platform, control_failure, is_ssl_expected, page_len, page_len_ratio, server_cc, server_asn, server_as_name"
It then runs predictions, sorts the output by certainty, and shows the measurements where the ML model and the fastpath disagree or where the ML model is less certain. It seems to easily spot broken tests, bugs in the measurement scoring, and cases where the scoring is not smart enough.
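As a rough illustration of the review step described above, here is a minimal sketch and not the prototype's actual code. It assumes per-class probabilities are already available (for example from a classifier's `predict_proba`). The status labels, the `0.8` certainty threshold, the ordering of the two groups, and the names `Flagged` and `rank_for_review` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Flagged:
    report_id: str
    fastpath_status: str
    ml_status: str
    certainty: float
    disagrees: bool


def rank_for_review(rows, classes, threshold=0.8):
    """Pick measurements worth a manual look.

    rows: iterable of (report_id, fastpath_status, probabilities), where
    probabilities is a sequence aligned with `classes`.
    threshold: hypothetical cutoff below which a prediction counts as uncertain.
    """
    flagged = []
    for report_id, fastpath_status, probs in rows:
        # The predicted class is the one with the highest probability.
        best = max(range(len(classes)), key=lambda i: probs[i])
        ml_status = classes[best]
        certainty = probs[best]
        disagrees = ml_status != fastpath_status
        # Keep only disagreements and low-certainty predictions.
        if disagrees or certainty < threshold:
            flagged.append(
                Flagged(report_id, fastpath_status, ml_status, certainty, disagrees)
            )
    # Assumed ordering: disagreements first, most certain on top;
    # then agreeing but uncertain predictions, least certain on top.
    flagged.sort(
        key=lambda f: (not f.disagrees, -f.certainty if f.disagrees else f.certainty)
    )
    return flagged


# Hypothetical status labels and made-up probabilities, for illustration only.
CLASSES = ["ok", "anomaly", "blocked"]
ROWS = [
    ("r1", "ok", [0.95, 0.03, 0.02]),
    ("r2", "ok", [0.05, 0.90, 0.05]),
    ("r3", "blocked", [0.30, 0.25, 0.45]),
    ("r4", "anomaly", [0.55, 0.40, 0.05]),
    ("r5", "ok", [0.60, 0.30, 0.10]),
]

if __name__ == "__main__":
    for item in rank_for_review(ROWS, CLASSES):
        print(item)
```

Putting confident disagreements at the top is one way to surface likely scoring bugs first; the uncertain agreements that follow are the cases where the scoring may simply lack enough signal.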
Attempts at manually handling fingerprints are going to become less effective as the volume and diversity of measurements increase.