ukhsa-collaboration / covid19-app-system-public

COVID19 app backend

What is riskThreshold? #31

Closed sourcejedi closed 3 years ago

sourcejedi commented 3 years ago

The documentation states a threshold of 15mins at 2 metres. However the "API Mode 2" code has never looked consistent with that. It's as if there's either a weirdly simple bug in the app, or the public documentation is quite misleading.

Am I missing something here?

https://covid19.nhs.uk/risk-scoring-algorithm.html

For each encounter, a score is calculated as follows:

  • Estimated within 1m of the other device: total time spent within 1m
  • Estimated 1m or beyond from the other device: sum of [total time at each distance / distance squared]

The risk threshold for the app has been set to identify high-risk encounters based on [...] where an individual has been within 2 metres of someone who has tested positive for Coronavirus for at least 15 minutes.

(15 min * 60) / (2 m * 2 m) = 225 points. But the risk threshold is 100 points, not 225:

  "v2RiskCalculation": {
    "daysSinceOnsetToInfectiousness": [0,0,0,0,0,0,0,0,0,1,1,1,2,2,2,2,2,2,1,1,1,1,1,1,0,0,0,0,0],
    "infectiousnessWeights": [0.0,0.4,1.0],
    "reportTypeWhenMissing": 1,
    "riskThreshold": 100
  }

https://github.com/nihp-public/covid19-app-system-public/blob/b0871e684c526/src/static/exposure-configuration.json#L86

I've chased through the public app code. I can't judge how it determines the distance, but I can look at what it does with the result. It looks very consistent with the algorithm above.

(Note: the above applies during the 6 peak days where someone is considered "100% infectious". During the other 9 days of the window, they are considered "40% infectious", so the threshold would be about 17 minutes instead of about 7.)
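
The arithmetic above can be checked with a short sketch. It assumes the documented formula is applied literally and that the infectiousness weight scales the score linearly, as the note above implies. The function names are hypothetical, and this is not the code the app runs.

```python
def documented_score(seconds_at_distance):
    """Score one encounter using the formula quoted from the documentation.

    seconds_at_distance maps an estimated distance in metres to the
    total seconds spent at that distance.
    """
    score = 0.0
    for distance_m, seconds in seconds_at_distance.items():
        if distance_m < 1.0:
            score += seconds  # within 1 m: time counts in full
        else:
            score += seconds / distance_m ** 2  # 1 m or beyond: inverse square
    return score


def minutes_to_reach(threshold, distance_m, infectiousness_weight=1.0):
    """Minutes at a fixed distance (>= 1 m) needed to reach the threshold.

    Assumption: the infectiousness weight multiplies the score.
    """
    seconds = threshold * distance_m ** 2 / infectiousness_weight
    return seconds / 60


print(documented_score({2.0: 15 * 60}))  # 15 minutes at 2 m
print(minutes_to_reach(100, 2.0, 1.0))   # "100% infectious" days
print(minutes_to_reach(100, 2.0, 0.4))   # "40% infectious" days
```

Under these assumptions, 15 minutes at 2 m scores 225 points, and a threshold of 100 is reached at 2 m after roughly 7 minutes at weight 1.0 or roughly 17 minutes at weight 0.4.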

nhs-covid19 commented 3 years ago

Thanks for your interest in the NHS Covid-19 project.

Determining whether an encounter was or was not risky is a binary classification problem. To determine an appropriate discrimination threshold, field tests were conducted in which several participants with devices were spaced at certain distances for a certain duration. With the distance and duration of each encounter measured manually, the encounter can be classified as risky or not risky (the ground truth) using the duration (seconds) / distance^2 formula: if the ground-truth score is >= 225, the encounter is classified as risky.

The risk score library (Android, iOS) takes the EN API Exposure Window data and calculates a score using an Unscented Kalman Smoother. The calculated score and the ground truth classification (risky/not risky) can be used to generate a ROC curve which measures the performance of the classification. The threshold is chosen based on what is deemed to be an acceptable rate of true and false positives.
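
To make the calibration procedure concrete, here is a minimal sketch of labelling encounters with the ground-truth formula and sweeping a candidate threshold over the library's scores to obtain ROC points. The field-test records are invented purely for illustration, and the function names are hypothetical. None of this is taken from the project's actual calibration code.

```python
GROUND_TRUTH_CUTOFF = 225  # 15 min at 2 m: (15 * 60) / 2**2


def is_risky(duration_s, distance_m):
    """Ground-truth label from manually measured duration and distance."""
    return duration_s / distance_m ** 2 >= GROUND_TRUTH_CUTOFF


def roc_points(records):
    """Return (threshold, false_positive_rate, true_positive_rate) tuples.

    Each record is (duration_s, distance_m, library_score). An encounter is
    predicted risky when its library score is >= the candidate threshold.
    Assumes at least one risky and one non-risky record.
    """
    labels = [is_risky(d, m) for d, m, _ in records]
    scores = [s for _, _, s in records]
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
        points.append((threshold, fp / negatives, tp / positives))
    return points


# Invented records: (measured seconds, measured metres, library score).
field_tests = [
    (900, 2.0, 140.0),   # risky by ground truth (score 225)
    (1200, 1.0, 310.0),  # risky (score 1200)
    (600, 3.0, 95.0),    # not risky (score about 66.7)
    (300, 2.0, 40.0),    # not risky (score 75)
    (900, 4.0, 20.0),    # not risky (score 56.25)
]

for threshold, fpr, tpr in roc_points(field_tests):
    print(f"threshold={threshold:6.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

With these made-up numbers one threshold happens to separate the two classes cleanly. Real field data would presumably overlap, which is why the choice comes down to an acceptable trade-off between true and false positives.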

Note: the app multiplies the risk score output by the library by a factor of 60.

The docs may be unintentionally slightly incorrect, in that the app doesn't really use this duration/distance^2 formula; it uses the much more complicated algorithm in the risk score library. The threshold, however, was calibrated against this formula. Thanks for bringing this to our attention; we will correct this in a future release.

sourcejedi commented 3 years ago

Thanks for your great response, this was bothering me. It would be nice to have docs that rule out this suspicion. I thought it might be something like you say, except I saw minimumDistance is 1.0, consistent with "1 metre" in the ground truth test.

So I'm curious, has it been considered whether reducing minDistance improves the ROC curve? (increases area under the curve?) Or does reducing it break something?

(I did notice the divide by 60 in the library, and the multiply by 60 in the app. I didn't think it was interesting enough to mention. I was convinced the two should cancel out, and there's a "TODO" comment on the former. Maybe I missed something again. Or maybe it's a case of "if it ain't broke, don't fix it".)