google / exposure-notifications-server

Exposure Notification Reference Server | Covid-19 Exposure Notifications
https://www.google.com/covid19/exposurenotifications/
Apache License 2.0

EXPOSURE Service crashes if STATS_UPLOAD_MINIMUM envar is set to less than 10. #1535

Closed ae-wehealth closed 3 years ago

ae-wehealth commented 3 years ago

TL;DR

The exposure service crashes if the STATS_UPLOAD_MINIMUM env var is set to less than the default value. I couldn't set it to a lower value without modifying the code.

Expected behavior

A working exposure service was expected.

  1. A single service's initialization failure should not render the entire API unusable. Other endpoints should still work.
  2. The stats should be calculated daily even for zero values or very low values if enabled.

Observed behavior

The container doesn't start; it crashes because the verify routine fails: https://github.com/google/exposure-notifications-server/blob/main/internal/publish/config.go#L111

Reproduction

Set the STATS_UPLOAD_MINIMUM to 1 and start the exposure service. It will fail due to the code conditions.
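For readers who want to see the shape of the failure, here is a minimal sketch of a startup check like the one described above. It is not the code in `internal/publish/config.go`; the function names, the constant name, and the error text are hypothetical, and the floor of 10 and the fallback to it when the variable is unset are assumptions based on this issue's title and TL;DR.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// minStatsUploadMinimum is the floor discussed in this issue (10).
// The constant name is hypothetical, not taken from the server's source.
const minStatsUploadMinimum = 10

// validateStatsUploadMinimum rejects values below the floor, which is the
// kind of check that makes the service refuse to start.
func validateStatsUploadMinimum(v int64) error {
	if v < minStatsUploadMinimum {
		return fmt.Errorf("STATS_UPLOAD_MINIMUM must be >= %d, got %d", minStatsUploadMinimum, v)
	}
	return nil
}

// loadStatsUploadMinimum reads the env var and validates it. Falling back to
// the floor when the variable is unset is an assumption for this sketch.
func loadStatsUploadMinimum() (int64, error) {
	raw, ok := os.LookupEnv("STATS_UPLOAD_MINIMUM")
	if !ok {
		return minStatsUploadMinimum, nil
	}
	v, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("parsing STATS_UPLOAD_MINIMUM: %w", err)
	}
	if err := validateStatsUploadMinimum(v); err != nil {
		return 0, err
	}
	return v, nil
}

func main() {
	// Reproduce the setting from this report.
	os.Setenv("STATS_UPLOAD_MINIMUM", "1")
	if _, err := loadStatsUploadMinimum(); err != nil {
		fmt.Println("startup would fail:", err)
	}
}
```

With a check of this kind, a value of 1 produces an error at configuration time, before any endpoint is served.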

Additional information

I was working on enabling key server stats on the verification server.

mikehelmick commented 3 years ago

This is working as intended to prevent potential tracking of individuals when there is low throughput.

Stats for days that don’t meet the threshold are released after 48 hours.

ae-wehealth commented 3 years ago

How can individuals be tracked this way if the stats are pulled by the verification server and served from the verification server? There is no identifiable info in the https://admin.verification.api.wehealth.org/stats/realm/key-server.json response.

ae-wehealth commented 3 years ago

btw I had to set STATS_EMBARGO_PERIOD to 0 too because we do want the most up-to-date statistics.

sethvargo commented 3 years ago

This minimum embargo is required to maintain privacy in the system. With a small stats window, a PHA could issue codes at specific times and then correlate whether those codes had been redeemed back to individual users. This was Mike's point about "potential tracking of individuals when there is low throughput". The identifiable information is the timestamp.

Statistics are designed to be retroactive and in-aggregate. The statistics are intentionally not in-real-time to preserve privacy.

mikehelmick commented 3 years ago

Setting STATS_EMBARGO_PERIOD to 0 means that days below the STATS_UPLOAD_MINIMUM will never be released (it disables the feature altogether).

The values here are the absolute minimums recommended by Google and authorized by our internal review process.