refactoring for probabilistic forecasts in reports

wholmgren commented 4 years ago

We need to extend/refactor a handful of things for probabilistic forecasts. Let's keep things simple and restrict reports to only probabilistic or only deterministic.

A to do list that is almost certainly incomplete:

[x] Update report json metadata schema and api.process_report_dict parsing to allow for probabilistic forecasts. See #425
~ForecastObservation and ForecastAggregate need to be modified or copied for probabilistic forecasts. They currently have an attribute forecast: Forecast. Could be forecast: Union[Forecast, ProbabilisticForecast, ProbabilisticForecastConstantValue]. Trivial.~ ProbabilisticForecast, ProbabilisticForecastConstantValue are Forecast, so I don't think the attribute spec needs to change.
[x] reports.main.get_data_for_report assumes deterministic forecast. Trivial since api.get_values now understands probabilistic forecast types.
[x] api.get_values allows for ProbabilisticForecastConstantValues but not ProbabilisticForecast. Consider modifying. In fact, there is no convenience method to get all forecast values for all ProbabilisticForecastConstantValues within a ProbabilisticForecast. Probably will want that in api.
[x] metrics.preprocessing needs to support probabilistic forecasts. (it might already but it's not tested)
[x] solarforecastarbiter/metrics/calculator.py (almost complete in #224) needs to support probabilistic forecasts
[x] modify or create new template for probabilistic forecast reports. Modification probably gets us there for initial report.
[x] report forecast links need to go to /cdf/ or /cdf/single/
[x] simple plot with lines for each ProbabilisticForecastConstantValues within a report. Might require labels like in https://github.com/SolarArbiter/solarforecastarbiter-dashboard/pull/238/files#diff-ec34245434455b3669a7ee1e5f3187feR894-R898
[x] same as above, but for each ProbabilisticForecastConstantValues within a ProbabilisticForecast.
[x] simple plot with lines for ProbabilisticForecastConstantValues with axis='x'. Requires modification to at least plot y label.

related but address in separate issues:

plot with progressively shaded areas for matched probability bands (e.g. 0/100, 10/90, 20/80, etc. percentiles)
reliability diagram
rank histogram

let me know what I'm missing.

@alorenzo175 any opinions before I get started?

alorenzo175 commented 4 years ago

I agree with most of the to do list and will have to think on it more. I do want to refactor reports at some point, but shouldn't wait for that now. For the report metadata, lets add a type key or similar that can be probabilistic or deterministic and then metric validation etc can use that.

wholmgren commented 4 years ago

Do we also need a probabilistic_constant_value type? Assume that in the database a probabilistic forecast contains multiple percentiles and consider some cases:

user wants to know the Brier score for a single percentile.
user wants to know the Brier score for each percentile.
user wants to know the CRPS for the full forecast.

1 and 2 might be best specified by indicating

"forecast_type": "probabilistic_constant_value", 
"object_pairs: [{"forecast": uuid_of_constant_value, "observation"...}, ...]

and 3 by

"forecast_type": "probabilistic", 
"object_pairs: [{"forecast": uuid_of_full_probabilistic_forecast, "observation"...}, ...]

wholmgren commented 4 years ago

@awig @dplarson

awig commented 4 years ago

Summary of the metrics i/o requirements:

Metric	Obs	Ref (fx,fx_prob)	fx	fx_prob	Total # metrics
BS	yes		yes	yes	N_i or n
REL	yes		yes	yes	N_i or n
RES	yes		yes	yes	N_i or n
UNC	yes		yes	yes	N_i or n
CRPS	yes		yes	yes	1
BSS	yes	yes	yes	yes	N_i or n
SH			2		1 (or 1/2 N_i or n)

As you can see there are only a few special cases. Just to be sure we are on the same page:

Should Sharpness (SH) be calculated only for the furthest pair of probabilistic forecasts or should it be calculated for each symmetric pair of forecasts (10/90, 20/80, 30/70, ...)? Other ideas?
As discussed before, we agree that all probabilistic forecasts in a report have the same intervals and axis
As discussed before, we agree to drop any nan timestamps for all probabilistic forecasts
As discussed before, we agree that the reference probabilistic forecasts must have the same probabilistic intervals and axis as the non-reference probabilistic forecasts

wholmgren commented 4 years ago

N_i = Number of ProbabilisticForecastConstantValue in a ProbabilisticForecast? If so, I think there should be only 1 CRPS metric.

What's BD?

Users might want to know the Brier score for every ProbabilisticForecastConstantValue in a ProbabilisticForecast.

Users might want to know the Brier score for a similar ProbabilisticForecastConstantValue from many ProbabilisticForecast.

I don't think it's likely that people are going to want to know the score for every ProbabilisticForecastConstantValue from many ProbabilisticForecast. The multiplicity of that scenario is horrendous so it may be that the dashboard needs to make that evaluation tedious to request.

awig commented 4 years ago

@wholmgren Yeah, I believe you are correct for the CRPS.

BD is the Briar Decomposition which just makes up the REL, RES, UNC that in turn makes up the Briar Score. I'll remove it from the list since it isn't really a metric itself.

Regarding the comments about the BS, I'm not using the correct term as it can be $N_i$ or $n$ depending on what axis is given for the report. Does that make sense? (Noting that the axis can't be both in the report that is your last point. Am I understanding correctly?

@dplarson let me know if you agree, please

dplarson commented 4 years ago

The notation across probabilistic metrics isn't the best in general, so I'll avoid that in my replies:

I think you've got the main parts correct:

CRPS is computed based on all intervals at once (so one CRPS value per ProbabilisticForecast)
Brier Score, REL, RES, and UNC is computed per interval (e.g. 5 intervals ==> 5 Brier Score values)
Sharpness is computed between intervals, but the interval pairs do not need to be symmetric

Regarding Sharpness (SH): while computing this for each symmetric interval pair makes sense, there is currently no guarantee that the intervals provided will have symmetric pairs (e.g. the intervals could be 10, 25, 43, 91, 94, 95, 99). Therefore, one approach would be to have the user select the interval range they want SH computed over and only allowing one interval pair to be selected.

For Brier Score, REL, RES, and UNC, it should go something like this:

results = []
for each ProbabilisticForecastConstantValue in ProbabilisticForecast:
    obs, fx, fx_prob = transform(ProbabilisticForecastConstantValue)
    bs = brier_score(obs, fx, fx_prob)
    rel, res, unc = brier_decomposition(obs, fx, fx_prob)
    interval_name = ...   # how we're keeping track of the intervals
    results.append({"interval_name": interval_name, "bs": bs, "rel": rel, ...})

In this setup, if there are 10 intervals, then there should be 10 Brier Score values, 10 REL values, etc. And the report should then have figures show the 10 values per metric (e.g. a bar chart with one bar per metric value/interval).

And I agree with Will's point about challenges with multiple ProbabilisticForecast. If we're having the user set the interval pair for the Sharpness metric, maybe we can do the same for Brier Score, REL, etc. for multiple ProbabilisticForecast (i.e. if comparing multiple ProbabilisticForecast, then the user has to select one interval for comparing Brier Score, REL, etc.).

awig commented 4 years ago

Thanks, David. I think this gets to heart of the trouble I've having determining how to implement. This discussion on cleared up a few things about what is capable, but not about what is practical and what the user would intend to be implemented.

So, a couple of points are still bothering me:

What are the combinations of the ProbabilisticForecast and ProbabilisticForecastConstantValue that are possible.
Does the axis parameter (as I thought but now I'm a bit confuse) determine how the many-one or one-many relationship how to perform the calculations based with regards to theProbabilisticForecast and ProbabilisticForecastConstantValue ? So, with regards to the last question, I thought the axis parameter was to address David's last point about multiple ProbabilisticForecast and we don't provide any many-many relationship.
IMHO I think the sharpness index should be forced to be calculated on symmetric percentages only, otherwise I think we are asking the user to provide to much input and I can't really see any value of applying it to non-symmetric %s.

dplarson commented 4 years ago

Does the axis parameter (as I thought but now I'm a bit confuse) determine how the many-one or one-many relationship how to perform the calculations based with regards to theProbabilisticForecast and ProbabilisticForecastConstantValue ? So, with regards to the last question, I thought the axis parameter was to address David's last point about multiple ProbabilisticForecast and we don't provide any many-many relationship.

My understanding has been that the axis parameter tells you whether the forecast values in the ProbabilisticForecastConstantValue are a) the probabilities [%] for some event (e.g. probability that the power is less than or equal to 10 MW: prob(obs(t) <= 10 MW)) or b) the physical values that have a set probability (e.g. forecast the X MW at time t that has a probability of 40%). It does not refer to the index direction of a 2D array (i.e. axis="x" does not mean compute metrics down the rows and axis="y" does not mean computer metrics down the columns). But either way, the forecast in each ProbabilisticForecastConstantValue is still a time-series of timestamp,value (where the value is either a probability or a physical quantity).

Examples:

# forecast the probability for <= 10 MW; 
# values=probabilities
timestamp,value
2020-01-01 00:00:00,10  # probability of 10%
2020-01-01 01:00:00,15  # probability of 15%

# forecast the physical value corresponding to 
# a probability of 35%; values=physical values
timestamp,value
2020-01-01 00:00:00,732  # 732 W/m^2
2020-01-01 01:00:00,642  # 642 W/m^2

EDIT: in terms of the code, fx is the physical units forecasts (e.g. MW or W/m^2) while fx_prob is the corresponding probabilities. So either ProbabilisticForecastConstantValue will provide a) a vector of fx values corresponding to a single/constant fx_prob value or b) a vector of fx_prob values corresponding to a single/constant fx value.

awig commented 4 years ago

Thanks, but there is obviously some confusion still here. I did not at all mean that axis determines how to compute the index direction. I meant it was to determine the relationship between mapping ProbabilisticForecast to ProbabilisticForecastConstantValue. What I mean is that the relationship manyProbabilisticForecast to individual ProbabilisticForecastConstantValue with axis='x' or many ProbabilisticForecastConstantValue to inidvidual ProbabilisticForecast with axis='y'. But, we don't support many ProbabilisticForecast to many ProbabilisticForecastConstantValue from a report. I do understand how fx and fx_prob should map.

dplarson commented 4 years ago

Yeah, I agree I seem to have misunderstood your questions. Maybe this will be easier to work through during the call today.

awig commented 4 years ago

Ugh, I think I'm realizing my confusion is really the relationship of the ProbabilisticForecast to ProbabilsiticForecastConstantValue and how that relates with axis. I think I'm clear when there is only one value for the constant value, but not when there are many. I'll probably have to writeup a few pseudocode examples and have you guys review.

awig commented 4 years ago

OKAY

After having it beat into my skull, I understand now. Each ProbabilisticForecast will still have its timeseries of forecast_values once the data is pulled and to which it can have one or more ProbabilisticForecastConstantValue indicating its specific unit/% with the axis parameter. The report can still support multiple ProbabilisticForecasts of course.

Typical function:

Iterate through the ProbabilisticForecasts
Iterate through the ProbabilsiticForecastConstantValues
Compute score (except CPRS) using forecast_values and constant_values (transforming constant_values to timeseries as necessary) with mapping assigned by axis to actual compute function.

Also discussed,

all ProbabilsiticForecastConstantValue must have the same axis
Do not support sharpness, SH at this time
CRPS requires stacked 2-d inputs of both
RES requires getting unique forecast probabilities using _unique_forecasts via binning, may need some tuning

awig commented 4 years ago

@wholmgren @alorenzo175 Just to be sure, does a ProcessedForecastObservation with the timeseries forecast_value have to be associated with a unique constant_value in a ForecastObservation? Assuming I'm correct then. How do I know which constant_value to use? Right now I'm just assuming its in order but if there is a better way please let me know.

alorenzo175 commented 4 years ago

Yes, each contant_value will have an associated timeseries. I don't think we've written the code to get all the constant values timeseries yet, but I think something like a dict of constant_value (decimal): timeseries is reasonable.

wholmgren commented 4 years ago

@awig I'm going to nitpick your comment above so that we can be sure we're on the same page:

Each ProbabilisticForecast will still have its timeseries of forecast_values once the data is pulled

The "timeseries" data of a ProbabilisticForecast is two dimensional. @alorenzo175 suggested a dict. I think @dplarson might have done some prototyping in the fall with DataFrames. The metrics.probabilistic.crps function requires a 2D array.

and to which it can have one or more ProbabilisticForecastConstantValue

can --> must

indicating its specific unit/% with the axis parameter

The ProbabilisticForecast also has a variable and axis attribute. All ProbabilisticForecastConstantValue within a ProbabilisticForecast must have the same variable and axis attributes.

(transforming constant_values to timeseries as necessary)

I don't understand this. The constant_value (e.g. 10% or 10 MW) is not a time series.

Sorry for all the confusion. We should have written a doc page with examples for these objects. I can work on that later today if it would be helpful.

awig commented 4 years ago

Perfect thanks. I'll use this for my testing and for now will modify the code so that it creates this dict and can be substituted once that has been implemented.

awig commented 4 years ago

@wholmgren Glad to have it nitpicked. Important to avoid more confusion. Allow me to nitpick the nitpick.

The "timeseries" data of a ProbabilisticForecast is two dimensional. @alorenzo175 suggested a dict.

In my mind, ProbabilisticForecast does not have any (instead of "timeseries" ill use "series" of data). In the datamodel only ProcessedForecastObservationhas series. In particular forecast_values is listed as a pandas.Series. I don't see how it could be 2-d. What am I missing?

class ProcessedForecastObservation(BaseModel):
   ...
    forecast_values: Union[pd.Series, str, None]

I could change it to be pandas.DataFrame or a Tuple[Union[pd.Series, str, None]] but I don't enough to know that would work with the other models and api.

I think @dplarson might have done some prototyping in the fall with DataFrames. The metrics.probabilistic.crps function requires a 2D array

I've been transforming the data into a np.array(n,d) before sending to crps by gathering all ProcessedForecastObservation forecast_values where the original.forecast.id match. This shouldn't be an issue as long as I can plan for how the arrays come in.

can --> must

I'll add a check for >1, is 2 sufficient?

The ProbabilisticForecast also has a variable and axis attribute. All ProbabilisticForecastConstantValue within a ProbabilisticForecast must have the same variable and axis attributes.

Understood regarding the axis and variable. I had axis but not variable checks, so I'll add checks for both that all values for all ProbabilisticForecast have the same values for all ProbabilisticForecastConstantValues.

I don't understand this. The constant_value (e.g. 10% or 10 MW) is not a time series.

In order to perform the calculations it requires arrays of the same size as the forecast_values. E.g., if my forecast values are a series of [< 3.2MW, < 4.1MW, < 4.6MW] for a constant value of 10%, I need to create a matching series [10%, 10%, 10%] in order to pass to the probabilistic metric functions. @dplarson can correct if I'm mistaken.

Sorry for all the confusion. We should have written a doc page with examples for these objects. I can work on that later today if it would be helpful.

I say don't worry about it as I've already got most of the functionality in place I think and mainly working on the tests now. If there are still confusion after a get the PR, then maybe that would be a good time. The main issue I've struggled with is how to "extract" the right data from the appropriate datamodels to pass the calculations.

wholmgren commented 4 years ago

Ok, so ProcessedForecastObservation will only reference ProbabilisticForecastConstantValue objects?

I've been transforming the data into a np.array(n,d) before sending to crps by gathering all ProcessedForecastObservation forecast_values where the original.forecast.id match. This shouldn't be an issue as long as I can plan for how the arrays come in.

Sounds reasonable to me. @alorenzo175?

can --> must I'll add a check for >1, is 2 sufficient?

I was trying to say >= 1 instead of >= 0. But it turns out ProbabilisticForecast can be constructed without any constant_values so 0 is actually an option. Not sure if that's a bug.

Understood regarding the axis and variable. I had axis but not variable checks, so I'll add checks for both that all values for all ProbabilisticForecast have the same values for all ProbabilisticForecastConstantValues.

The ProbabilisticForecast constructor checks axis but not variable. Maybe better to add the check there?

alorenzo175 commented 4 years ago

Sounds reasonable to me. @alorenzo175?

Yes, transforming is reasonable.

Ok, so ProcessedForecastObservation will only reference ProbabilisticForecastConstantValue objects?

Will we be supporting comparing each ProbabilisticForecastConstantValue with a different observation? If not, it make make sense to make a separate ProcessedProbabilisticForecastObservation (with forecast_values either a dataframe or another model with the constant value and series) and avoid duplication

wholmgren commented 4 years ago

Will we be supporting comparing each ProbabilisticForecastConstantValue with a different observation?

I think this is required if we want to support probabilistic forecast comparisons across sites. So I'd say yes.

awig commented 4 years ago

Ok, so ProcessedForecastObservation will only reference ProbabilisticForecastConstantValue objects?

Not directly, but through the ProbabilisticForecast.

awig commented 4 years ago

I was trying to say >= 1 instead of >= 0. But it turns out ProbabilisticForecast can be constructed without any constant_values so 0 is actually an option. Not sure if that's a bug.

Okay so maybe I'll forego adding tests for those checks for now

wholmgren commented 4 years ago

Not directly, but through the ProbabilisticForecast.

Then I'm confused again. Here's a link a ProbabilisticForecast. The data associated with it is 2-D: one dimension is the constant value, the other is the time series for each constant value. So if a ProcessedForecastObservation references a ProbabilisticForecast then it is unclear what its forecast_values attribute actually refers to.

awig commented 4 years ago

Then I'm confused again. Here's a link a ProbabilisticForecast. The data associated with it is 2-D: one dimension is the constant value, the other is the time series for each constant value. So if a ProcessedForecastObservation references a ProbabilisticForecast then it is unclear what its forecast_values attribute actually refers to.

So it makes sense to me that each ProbabilisticForecastConstantValue woould have an associated series with it. But, I don't see any methods for pulling it so I believe you need to have ProcessedForecastObservation to realize the series based on the dates resolved from it. If there are helper functions to realize that, please let me know. Right now I don't see any probabilistic forecast in the reports that would give me an example of how this works, but if there is somewhere that would be great.

wholmgren commented 4 years ago

The API currently only supports fetching the values for a ProbabilisticForecastConstantValue https://github.com/SolarArbiter/solarforecastarbiter-core/blob/e193f5d88e5783b2e8ace381c7ccc67996bf58e8/solarforecastarbiter/io/api.py#L533-L565

Right now I don't see any probabilistic forecast in the reports that would give me an example of how this works, but if there is somewhere that would be great.

We need your code to make this possible.

awig commented 4 years ago

Fair enough.

After looking at api.get_values() I see that that above function is being called if the value is a ProbabilisticForecastConstantValue, so now I understand the "data" returned from main.get_data_for_report() will be populated with the Series from that function. Not sure how I'll handle all that yet. But, I will play around to see in what exact form the data exists so I can wrap my head around it, should have just done that at the beginning.

SolarArbiter / solarforecastarbiter-core

refactoring for probabilistic forecasts in reports #266