Validating the accuracy of acquisition predictions

Problem

We need to verify that the predicted date and time of the next acquisition is accurate to within some margin of error (e.g. 5% or 10%).

Approaches

The first approach is to enumerate all possible path and row combinations and assert that, for each one, the earlier data accurately predicts the most recently recorded acquisition time.
For example, suppose that for path p and row r we have the acquisition times [tₙ, tₙ₋₁, ..., t₃, t₂, t₁], where t₁ is the most recent acquisition time. We check whether we can accurately predict t₁ given only t₂ through tₙ, and we assert this for every combination of p and r.
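
The hold-out check described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `predict_next` is a hypothetical stand-in (last time plus mean interval) and not the backend's actual prediction logic, the margin is assumed to be a fraction of the predicted interval, and the times are assumed to be sorted oldest first, so the last element plays the role of t₁.

```python
from datetime import datetime, timedelta
from statistics import mean


def predict_next(times: list[datetime]) -> datetime:
    """Hypothetical stand-in predictor: last time plus the mean interval."""
    intervals = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return times[-1] + timedelta(seconds=mean(intervals))


def backtest(times: list[datetime], margin: float = 0.05) -> bool:
    """Hold out the most recent time and check the prediction against it.

    Assumes `times` is sorted oldest first and that `margin` is a fraction
    of the predicted interval (an assumption; the note leaves this open).
    """
    if len(times) < 3:
        raise ValueError("need at least three acquisition times")
    history, actual = times[:-1], times[-1]
    predicted = predict_next(history)
    expected_interval = (predicted - history[-1]).total_seconds()
    error = abs((actual - predicted).total_seconds())
    return error <= margin * expected_interval


def failing_combinations(
    history: dict[tuple[int, int], list[datetime]], margin: float = 0.05
) -> list[tuple[int, int]]:
    """Return every (path, row) key whose held-out prediction misses the margin."""
    return [key for key, times in history.items() if not backtest(times, margin)]
```

Returning the list of failing combinations, rather than stopping at the first failed assertion, makes it easier to see whether inaccuracies cluster around particular paths or rows.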

Once the first approach passes, we can move on to live-testing predictions together with the notification system.
One way to do this is to create a worker service that calls the prediction logic from the backend and asserts that each new acquisition image is taken within some margin of error of the predicted date. If a path and row combination has n correctly predicted values, we can say that the predictions for that combination are accurate enough. We try to assert this for all possible combinations of paths and rows.
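
The bookkeeping such a worker would need can be sketched as below. The class name `LivePredictionTracker`, its methods, and the use of an absolute time tolerance are all hypothetical choices made for illustration; the sketch also counts hits cumulatively, since the note does not say whether the n correct predictions must be consecutive.

```python
from datetime import datetime, timedelta


class LivePredictionTracker:
    """Counts correct live predictions per (path, row) combination."""

    def __init__(self, required_hits: int, tolerance: timedelta) -> None:
        self.required_hits = required_hits  # the "n" from the text
        self.tolerance = tolerance          # assumed absolute margin of error
        self.hits: dict[tuple[int, int], int] = {}

    def record(self, path: int, row: int,
               predicted: datetime, actual: datetime) -> bool:
        """Record one observed acquisition; return True if it was a hit."""
        hit = abs(actual - predicted) <= self.tolerance
        if hit:
            key = (path, row)
            self.hits[key] = self.hits.get(key, 0) + 1
        return hit

    def is_accurate(self, path: int, row: int) -> bool:
        """True once the combination has at least `required_hits` hits."""
        return self.hits.get((path, row), 0) >= self.required_hits
```

The worker would call `record` each time a new acquisition arrives for a combination it had a prediction for, and `is_accurate` to decide whether that combination can be considered validated.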

Dealing with the confidence value

In addition to the predicted acquisition and publish dates, we also return a confidence value in the range [0, 1]. It represents how confident we are in the predicted date, based on the variance between previous dates (that is, how consistent the intervals between dates are).

Current calculation

We calculate and normalize the timespan intervals, compute their variance, and clamp it to some maximum allowed variance. We then calculate the confidence value using the formula $1 - \left(\dfrac{\text{variance}}{0.65 \cdot \text{maxVariance}}\right)^5$.

(Figure: graph of the confidence value as a function of the variance.)
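
A minimal Python sketch of this calculation follows. Several details are assumptions, because the note does not specify them: intervals are normalized by dividing by their mean, the population variance is used, and the final result is clamped to [0, 1]. The last step is needed under this reading because the raw expression reaches 0 at variance = 0.65 · maxVariance and becomes negative beyond that point.

```python
from statistics import mean, pvariance


def confidence(intervals: list[float], max_variance: float) -> float:
    """Confidence in [0, 1] from the consistency of past intervals.

    `intervals` are the gaps between consecutive acquisitions in any
    consistent unit. Normalizing by the mean is an assumption.
    """
    avg = mean(intervals)
    normalized = [x / avg for x in intervals]
    # Clamp the variance to the maximum allowed variance.
    variance = min(pvariance(normalized), max_variance)
    raw = 1 - (variance / (0.65 * max_variance)) ** 5
    # Assumed final clamp so the result stays within [0, 1].
    return max(0.0, min(1.0, raw))
```

Perfectly regular intervals give a variance of 0 and therefore a confidence of 1. Because of the fifth power, the value stays close to 1 for small variances and falls off steeply as the variance approaches 0.65 · maxVariance.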

Problems

We need to determine the variance bound beyond which a prediction should be considered inaccurate. Given this bound, we can make the confidence value fall off faster as the variance approaches it.

As of right now, a confidence value above 90% is considered accurate. The corresponding margin of error is still unknown; the predictions only appear accurate on visual inspection.

However, we do not yet know how large the prediction error actually is for any given confidence value.
