t3dixon opened this issue 3 months ago
Hey Taylor,
Let me try to reproduce your results locally so that I can provide some insight. A few answers in principle, though:
Regarding cross-pairing: for each pair in the `predicted` dataset, there must be a prediction within the `baseline` dataset that has a corresponding valid time, and vice versa; otherwise, that prediction is removed. Additionally, when declaring `exact` cross-pairing, the forecast reference times must match exactly. When declaring `fuzzy` cross-pairing, the nearest reference times are used. This is useful, for example, when pairing RFC operational forecasts that are issued at non-synoptic times and comparing them to a baseline whose forecasts are issued at synoptic times. In this case, there will be an identical number of pairs for the `predicted` and `baseline` datasets after cross-pairing, even though the forecast issue times differ. More on cross-pairing here: https://github.com/NOAA-OWP/wres/wiki/Declaration-Language#15-are-there-any-other-options

My guess is that there are some peculiarities with your dataset that are leading to these odd results, such as all ensemble members consistently having the same value. But I will try to reproduce locally as a starting point...
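Coming back to the cross-pairing declaration: in the YAML declaration language, fuzzy cross-pairing can be requested with something like the sketch below. Treat it as illustrative and confirm the exact option name against the wiki page linked above.

```yaml
# Illustrative sketch: request cross-pairing of the predicted and baseline
# datasets using the nearest forecast reference times rather than exact matches.
# The option name 'cross_pair' is assumed here; see the wiki link above.
cross_pair: fuzzy
```

Declaring `exact` instead would require the forecast reference times to match exactly, as described above.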
Cheers,
James
Reproduced.
The first thing I notice is these two warnings:
- The evaluation declares a 'time_scale', but the 'time_scale' associated with the 'observed' dataset is undefined. Unless the data source for the 'observed' dataset clarifies its own time scale, it is assumed that the dataset has the same time scale as the evaluation 'time_scale' and no rescaling will be performed. If this is incorrect or you are unsure, it is best to declare the 'time_scale' of the 'observed' dataset.
- The evaluation declares a 'time_scale', but the 'time_scale' associated with the 'predicted' dataset is undefined. Unless the data source for the 'predicted' dataset clarifies its own time scale, it is assumed that the dataset has the same time scale as the evaluation 'time_scale' and no rescaling will be performed. If this is incorrect or you are unsure, it is best to declare the 'time_scale' of the 'predicted' dataset.
In general, it is best either to declare the time scale of the data in-band (i.e., to use a format that supports this, such as the CSV format) or to declare the time scale of the data within the declaration itself. In the absence of either, an assumption will be made, as indicated in the warnings.
Such a declaration might look like this (adjusted for whatever the data actually represent):
```yaml
observed:
  sources: HOPC1.QME.csv
  variable: QME
  type: observations
  time_scale:
    function: mean
    period: 24
    unit: hours
predicted:
  sources: HOPC1.HEFS.tgz
  variable: QINE
  type: ensemble forecasts
  time_scale:
    function: mean
    period: 1
    unit: hours
```
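For context, the evaluation-level time scale of [PT24H,MEAN] referenced in the warnings below would correspond to a top-level `time_scale` in the declaration, along the lines of the sketch here (values inferred from the warning messages; adjust to whatever was actually declared):

```yaml
# Sketch of the evaluation-level time scale implied by [PT24H,MEAN] in the
# warnings: a 24-hour mean. Dataset-level time scales (as above) describe the
# source data, which are rescaled to this scale when they differ.
time_scale:
  function: mean
  period: 24
  unit: hours
```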
This is followed by warnings on the calculation of every single pool, along these lines:
2024-08-19T16:51:52.674+0000 WARN PoolReporter [1/15] Completed statistics for a pool in feature group 'HOPC1-HOPC1'. The time window was: ( Earliest reference time: -1000000000-01-01T00:00:00Z, Latest reference time: +1000000000-12-31T23:59:59.999999999Z, Earliest valid time: -1000000000-01-01T00:00:00Z, Latest valid time: +1000000000-12-31T23:59:59.999999999Z, Earliest lead duration: PT0S, Latest lead duration: PT24H ). However, encountered 730 evaluation status warning(s) when creating the pool. Of these warnings, 730 originated from 'RESCALING'. An example warning follows for each evaluation stage that produced one or more warnings. To review the individual warnings, turn on debug logging. Example warnings: {RESCALING=EvaluationStatusMessage[LEVEL=WARN,STAGE=RESCALING,MESSAGE=While inspecting a time-series dataset with a 'predicted' orientation, failed to discover the time scale of the time-series data. However, the evaluation requires a time scale of [PT24H,MEAN]. Assuming that the time scale of the data matches the evaluation time scale and that no rescaling is required. If that is incorrect, please clarify the time scale of the time-series data. The time-series metadata is: TimeSeriesMetadata[timeScale=<null>,referenceTimes={},variableName=QME,feature=Feature[name=HOPC1,description=,srid=0,wkt=],unit=CFSD]]}.
Next, and I think this is the crux of the problem, I see only one ensemble member in the paired data. Why? Well, it seems that the predictions use the column keyword `ensemblemember`, but the actual/effective keyword is `ensemblemember_id`:
https://github.com/NOAA-OWP/wres/wiki/Format-Requirements-for-CSV-Files
In other words, the software is interpreting every single forecast as a single-valued forecast because it is ignoring the ensemble information. The software is lenient about the presence of columns whose names do not coincide with expected keywords, but it will not use the information in those columns.
In short, I would start by fixing the keyword column in the header (`ensemblemember` --> `ensemblemember_id`), which should reintroduce the ensemble information.
Let me know if something above doesn't make sense.
Cheers,
James
Hi there!
I'm trying to mimic a typical HEFS evaluation, but some of my outputs seem to lack resolution. For example, the reliability, rank histogram, and ROC diagram plots only show two or maybe three points. Is there a way to increase the number of bins/points for these outputs?
Also, the cross-pair functionality does not seem to work when I run it. Can you help me better understand what this is doing? Pairing the forecasts with the observations appears to be working.
Note that I've commented out the baseline forecast information (and associated skill scores) because the ESP data were too large to attach here. Also, I've lowered the `minimum_sample_size` setting to 1 for testing, since I'm only using one year of forecast data.

Thank you!
Attachments: HOPC1.QME.csv, HOPC1.HEFS.tgz, wres-test-outputs.zip