HSLdevcom / transitdata

transitdata publishes GTFS Realtime interfaces
European Union Public License 1.2

Add random noise to passenger count data #274

Open mjaakko opened 1 year ago

mjaakko commented 1 year ago

Add random noise to passenger count data to better preserve passenger privacy. Currently, passenger count data is published as an enum value (EMPTY, FEW_SEATS_AVAILABLE, etc.). When the value changes, it is possible to detect an individual passenger boarding or alighting. To prevent this, add random noise (e.g. ±5%) to the raw vehicle load ratio and calculate the enum value from the noisy ratio.
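A minimal sketch of the idea, not the transitdata implementation. It assumes that "±5%" means a uniform relative perturbation of the load ratio, and the threshold values and all function names here are hypothetical; the issue does not state the real thresholds. The status names follow the GTFS Realtime OccupancyStatus enum.

```python
import random

# Hypothetical upper bounds on the load ratio for each status. The real
# thresholds used by transitdata are not given in this issue.
THRESHOLDS = [
    (0.05, "EMPTY"),
    (0.50, "MANY_SEATS_AVAILABLE"),
    (0.80, "FEW_SEATS_AVAILABLE"),
    (1.00, "STANDING_ROOM_ONLY"),
]
FALLBACK_STATUS = "FULL"


def add_noise(load_ratio, max_relative_noise=0.05, rng=random):
    """Scale the ratio by a uniform random factor in [1 - noise, 1 + noise]."""
    factor = 1.0 + rng.uniform(-max_relative_noise, max_relative_noise)
    # Clamp at zero so the result is never a negative load.
    return max(0.0, load_ratio * factor)


def occupancy_status(load_ratio):
    """Map a load ratio to the first status whose upper bound covers it."""
    for upper_bound, status in THRESHOLDS:
        if load_ratio <= upper_bound:
            return status
    return FALLBACK_STATUS


def noisy_occupancy_status(load_ratio, max_relative_noise=0.05, rng=random):
    """Perturb the raw load ratio first, then derive the enum value from it."""
    return occupancy_status(add_noise(load_ratio, max_relative_noise, rng))
```

With these assumed thresholds, a raw ratio of 0.79 can be published as either FEW_SEATS_AVAILABLE or STANDING_ROOM_ONLY, so a change in the published value near a threshold no longer implies that a specific passenger boarded or alighted.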

This change should be made in such a way that Reittiloki displays the same passenger count values.

mjaakko commented 1 year ago

Some work was already done:
https://github.com/HSLdevcom/transitdata-common/tree/feat/randomized_passenger_count
https://github.com/HSLdevcom/transitdata-hfp-parser/tree/feat/randomized_load_ratio
https://github.com/HSLdevcom/transitlog-apc-sink/tree/feat/randomized_load_ratio
https://github.com/HSLdevcom/transitdata-vehicleposition-processor/tree/feat/random_passenger_count

mjaakko commented 1 year ago

Waltti might have a more robust solution for this, developed in cooperation with the University of Helsinki. See the #topic-waltti-apc channel in Slack for discussion.

teemu8655 commented 1 year ago

The University of Helsinki anonymizer is done: https://github.com/DPBayes/apc-anonymizer