Add outage tracking

lboeman commented 2 years ago

Adds outage tracking for the system and reports.

Whole system outages are recorded using the admin cli.

Report outages are added by posting to a /report/<report_id>/outages endpoint.

Outages are recorded as simple objects with a unique identifier and start and end parameters.
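As a rough illustration of adding a report outage, the sketch below builds (but does not send) a POST request for the endpoint described above. The base URL, the placeholder report id, the JSON field names `start` and `end`, and the ISO 8601 encoding are assumptions made for illustration; authentication is omitted, and the real API may expect a different payload.

```python
import json
from datetime import datetime, timezone
from urllib.request import Request


def build_outage_request(base_url, report_id, start, end):
    """Build a POST request for /report/<report_id>/outages (not sent)."""
    # Assumed payload shape: ISO 8601 strings under "start" and "end".
    body = json.dumps({"start": start.isoformat(), "end": end.isoformat()})
    return Request(
        f"{base_url}/report/{report_id}/outages",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and report id, for illustration only.
request = build_outage_request(
    "https://api.example.com",
    "REPORT-ID",
    datetime(2021, 1, 1, 5, tzinfo=timezone.utc),
    datetime(2021, 1, 1, 7, tzinfo=timezone.utc),
)
```

Passing the prepared request to `urllib.request.urlopen` would send it; that step is left out here because the host is a placeholder.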

lboeman commented 2 years ago

@wholmgren This is getting close. I think a good way to help shake out bugs is to implement the dashboard/core side of things and try to get the whole thing working.

Before I move on to that, I wanted to get your thoughts on a few things:

I've added an exclude_system_outages property to the report parameters. It takes a boolean that determines whether data from system outages is excluded during computation. I also added an "outages" key to the base report object that lists the report-specific outages. Do you think that if the report_parameters have exclude_system_outages set to True, we should automatically include the system outages in the outages key?

Each outage object is very simple and has the form:

{
    "report_id": <uuid>,  # report outages only
    "outage_id": <uuid>,
    "start": <datetime>,
    "end": <datetime>,
    "created_at": <datetime>,
    "modified_at": <datetime>,
}
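One way to picture this object in Python is a small frozen dataclass. This is only a sketch: the class name `Outage`, the `contains` helper, the start-before-end check, and the inclusive bounds are assumptions for illustration, not the project's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional
from uuid import UUID, uuid4


@dataclass(frozen=True)
class Outage:
    outage_id: UUID
    start: datetime
    end: datetime
    created_at: datetime
    modified_at: datetime
    report_id: Optional[UUID] = None  # set for report outages only

    def __post_init__(self):
        # Assumed sanity check: an outage must have positive duration.
        if self.end <= self.start:
            raise ValueError("outage end must be after start")

    def contains(self, when: datetime) -> bool:
        # Assumption: both bounds are treated as inclusive.
        return self.start <= when <= self.end


now = datetime.now(timezone.utc)
outage = Outage(
    outage_id=uuid4(),
    start=datetime(2021, 1, 1, 5, tzinfo=timezone.utc),
    end=datetime(2021, 1, 1, 7, tzinfo=timezone.utc),
    created_at=now,
    modified_at=now,
    report_id=uuid4(),
)
```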

I think the algorithm for excluding the data will be something like:

- determine a set of unique forecast parameters for a report
- for each set of forecast parameters, build an all-False boolean series indexed by the forecast submission times spanning the whole report
- for each outage period, flag the submission times that fall between its bounds
- for each flagged submission time, store a (start, end) tuple to drop from analyses based on the forecast lead time and run length
- use the report's cumulative set of (start, end) forecast periods to drop observation data from the analysis, and report totals in preprocessing results

Do you think that's a reasonable way to go about it?
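The flagging steps of the proposal could look roughly like the pandas sketch below. It assumes submission times are evenly spaced by the run length and that outage bounds are inclusive; `flag_submission_times` and the sample values are hypothetical, not code from the PR.

```python
import pandas as pd


def flag_submission_times(issue_times, outages):
    """Return a boolean series, True where a submission time is in an outage."""
    # Start from an all-False series indexed by submission time.
    flagged = pd.Series(False, index=issue_times)
    for start, end in outages:
        # Assumption: outage bounds are inclusive on both ends.
        in_outage = (issue_times >= start) & (issue_times <= end)
        flagged.loc[in_outage] = True
    return flagged


# Submissions every 6 hours starting at midnight UTC (illustrative values).
issue_times = pd.date_range(
    "2021-01-01 00:00", periods=4, freq=pd.Timedelta(hours=6), tz="UTC"
)
outages = [
    (pd.Timestamp("2021-01-01 05:00", tz="UTC"),
     pd.Timestamp("2021-01-01 07:00", tz="UTC")),
]
flagged = flag_submission_times(issue_times, outages)
```

Here only the 06:00 submission falls inside the outage, so it is the only flagged entry.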

wholmgren commented 2 years ago

Do you think that if the report_parameters have exclude_system_outages set to True, we should automatically include the system outages in the outages key?

Is the idea that if exclude_system_outages=True, the report compute process would automatically pull the latest record of system outages? I thought we decided against implementing that for now, so I wonder if we really need that key at this point.

determine a set of unique forecast parameters for a report

In practice it might be easier to just loop through the parameters for each forecast in a report instead of trying to be more efficient and only creating the flagged series once for each parameter set. This would be more consistent with how preprocessing currently works.

for each flagged submission time store a (start,end) tuple to drop from analyses based on the forecast lead time and run length.

I thought we'd eventually need to get to a boolean series with this index:

https://github.com/SolarArbiter/solarforecastarbiter-core/blob/master/solarforecastarbiter/metrics/preprocessing.py#L52-L55

Unclear to me if that's part of your plan in the last 2 bullets.

Use the report's cumulative set of (start,end) forecast period to drop observation data from the analysis, and report totals in preprocessing results.

We might need to drop both observation and forecast data. Dropping the observation would be sufficient to ensure the metrics are not computed. But the time series graph should not contain either for periods that have been removed.

To the extent possible, we should avoid overlap between the removed points and the rest of preprocessing results. For example, don't report nighttime values removed for a day that's not analyzed. This might be tricky so consider follow up work for this.

lboeman commented 2 years ago

Is the idea that if exclude_system_outages=True, the report compute process would automatically pull the latest record of system outages? I thought we decided against implementing that for now, so I wonder if we really need that key at this point.

Yes, that was my idea. It seemed like it would be easier to add system and report outage tracking all in one shot, so I went ahead and added both of them in terms of storage and such. I can remove this parameter for now though, and we can use just report outages to start.

for each flagged submission time store a (start,end) tuple to drop from analyses based on the forecast lead time and run length.

I thought we'd eventually need to get to a boolean series with this index:

https://github.com/SolarArbiter/solarforecastarbiter-core/blob/master/solarforecastarbiter/metrics/preprocessing.py#L52-L55

Unclear to me if that's part of your plan in the last 2 bullets.

Yes. That is the last step of the plan. The overall process is: outage ranges -> forecast submission times to drop -> corresponding forecast values to drop. The intermediate list of (start, end) tuples is the step from forecast submission times to the corresponding time ranges to exclude. For example, for a forecast submission time of 06:00 that falls within an outage period, with a lead time of 1 hour and a run length of 3 hours, we end up with a (07:00, 10:00) start/end for that day. We'd also have to consider the interval label here.
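The 06:00 example can be checked with a few lines of pandas. The helper name `excluded_range` is hypothetical, and the sketch ignores the interval label.

```python
import pandas as pd


def excluded_range(issue_time, lead_time, run_length):
    """Map a flagged submission time to the (start, end) of data to drop."""
    start = issue_time + lead_time
    return start, start + run_length


start, end = excluded_range(
    pd.Timestamp("2021-01-01 06:00", tz="UTC"),  # flagged submission time
    pd.Timedelta(hours=1),                       # lead time
    pd.Timedelta(hours=3),                       # run length
)
```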

After getting all the forecast data ranges to drop, we'd create a boolean series with the index you linked, where the values between each (start, end) pair are flagged to be dropped.
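A sketch of that last step, under the assumption that the interval label decides which bound of each range is closed: with "beginning" labels the start is included and the end excluded, and with "ending" labels the reverse. The function name `outage_mask` and the hourly index are made up for illustration.

```python
import pandas as pd


def outage_mask(index, ranges, interval_label="beginning"):
    """Return a boolean series over index, True for points to drop."""
    mask = pd.Series(False, index=index)
    for start, end in ranges:
        if interval_label == "ending":
            # A point labels the interval that ends at its timestamp.
            in_range = (index > start) & (index <= end)
        else:
            # A point labels the interval that begins at its timestamp.
            in_range = (index >= start) & (index < end)
        mask = mask | in_range
    return mask


# Hourly data from 06:00 to 11:00 UTC and the (07:00, 10:00) range.
index = pd.date_range(
    "2021-01-01 06:00", periods=6, freq=pd.Timedelta(hours=1), tz="UTC"
)
ranges = [
    (pd.Timestamp("2021-01-01 07:00", tz="UTC"),
     pd.Timestamp("2021-01-01 10:00", tz="UTC")),
]
beginning_mask = outage_mask(index, ranges, "beginning")
ending_mask = outage_mask(index, ranges, "ending")
```

With beginning labels the 07:00, 08:00, and 09:00 points are flagged; with ending labels the flagged points shift to 08:00, 09:00, and 10:00.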

We might need to drop both observation and forecast data. Dropping the observation would be sufficient to ensure the metrics are not computed. But the time series graph should not contain either for periods that have been removed.

To the extent possible, we should avoid overlap between the removed points and the rest of preprocessing results. For example, don't report nighttime values removed for a day that's not analyzed. This might be tricky so consider follow up work for this.

It sounds like dropping both observation and forecast data prior to preprocessing will do this for us? I haven't looked carefully at where this would need to fit in in core, so I'm probably underestimating the complexity of implementing this.
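To make the idea concrete, here is a minimal sketch of dropping outage points from both series before any other preprocessing runs. The function name, the sample data, and the `is_night` flag (standing in for a later preprocessing check) are all hypothetical, and the sketch assumes every series shares the same index.

```python
import pandas as pd


def drop_outage_points(observation, forecast, outage_mask):
    """Drop masked points from both series and report how many were removed."""
    keep = ~outage_mask
    return observation[keep], forecast[keep], int(outage_mask.sum())


index = pd.date_range(
    "2021-01-01 05:00", periods=6, freq=pd.Timedelta(hours=1), tz="UTC"
)
observation = pd.Series([0.0, 0.0, 10.0, 20.0, 30.0, 40.0], index=index)
forecast = pd.Series([0.0, 1.0, 12.0, 18.0, 33.0, 38.0], index=index)
outage_mask = pd.Series([False, True, True, True, False, False], index=index)
# Stand-in for a later preprocessing flag such as a nighttime check.
is_night = pd.Series([True, True, False, False, False, False], index=index)

obs_kept, fx_kept, n_outage = drop_outage_points(
    observation, forecast, outage_mask
)
# Count nighttime points only among the data that survived the outage drop.
n_night = int(is_night.loc[obs_kept.index].sum())
```

Because the outage points are removed first, the nighttime count covers only the remaining data (one point here instead of two), which avoids the overlap between removed points and other preprocessing totals.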

SolarArbiter / solarforecastarbiter-api

Add outage tracking #325