As a user, I would like to declare each pool within an irregular sequence of pools for evaluation - Githubissues

epag commented 2 months ago

Author Name: James (James) Original Redmine Issue: 86646, https://vlab.noaa.gov/redmine/issues/86646 Original Date: 2021-01-08

Given the need to evaluate an irregular sequence of pools When I declare that in wres Then I would like to declare it in a single evaluation

Related issue(s): #78 Redmine related issue(s): 86840, 95229, 103868, 108096

epag commented 2 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-01-08T21:58:07Z

Currently we allow for the generation of a regular sequence of pools using @leadTimesPoolingWindow@ and @issuedDatesPoolingWindow@, which are used to generate the sequence of @TimeWindow@ (or outer pools) to evaluate. A @TimeWindow@ has six dimensions to be configured. When undefined, a dimension defaults to unlimited.

In keeping with the current declaration, the new declaration might look something like this (for one pool, repeated as many times as pools), but I think we can improve on this nomenclature in the same way that we can improve on the current nomenclature, i.e., this is purely to clarify the ticket, not an actual suggestion of declaration.

<pool>
        <leadHours minimum="0" maximum="18" />
        <dates earliest="2017-08-07T00:00:00Z" latest="2017-08-10T00:00:00Z" />
        <issuedDates earliest="2017-08-07T23:59:59Z" latest="2017-08-08T23:00:00Z" />
</pool>
</code>

For example, it should be clarified when a bookend is inclusive or exclusive.

edit: This would be the declaration at most, since undeclared boundaries default to unlimited. Again, repeated for up to N pools.

This declaration could not coincide with any other pool boundary definitions (edit: such as @leadTimesPoolingWindow@ - although, to be discussed, since this is more about simplicity than impossibility, i.e., you could have a regular sequence, followed by an irregular one, I suppose).

epag commented 2 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2022-06-13T16:18:38Z

This one came up again today in the context of a discussion about event detection. It would not involve a tremendous amount of effort but, equally, it may not expand our userbase to those who want to perform evaluations of hydrologic events and want to use WRES (the intersection of those two things seems quite small). Regardless, this is more generally useful, so bumping the priority a little.

epag commented 2 months ago

Original Redmine Comment Author Name: Hank (Hank) Original Date: 2022-06-13T16:34:12Z

Thanks for finding this. Maybe it can allow us to partially check a box, by allowing users to specify their own event periods for which to compute metrics. Then again, maybe not. More information about use cases would help. Still, as you said, this should be generally useful in other cases, so why not?

Hank

epag commented 2 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2022-06-13T17:07:04Z

Indeed.

epag commented 2 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2022-09-13T15:36:38Z

Bumping this to 6.8 since it keeps getting requested (or, rather, users keep asking for behavior that would be fixed by this).

That said, I think "8" may be a bit optimistic for handling the graphics etc., so the estimate may need to be reconsidered - this is not simply about declaration.

epag commented 2 months ago

Original Redmine Comment Author Name: Hank (Hank) Original Date: 2022-10-13T13:43:10Z

Slipping to 6.9.

I know you said you plan to work on the rescale lenience ticket, but this one might be a good candidate following that ticket.

Hank

epag commented 2 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2022-10-13T13:49:59Z

Agree.

epag commented 2 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2022-11-08T16:57:34Z

I think this is a good candidate for the next effort, probably not an especially significant one (although there's a risk I'm forgetting something, such as whether the existing graphics templates/styles can handle fully arbitrary pool sequences).

My main reservation is that it adds yet more to the declaration schema and will further propagate nomenclature that we probably want to dispense with, if only because consistency of the nomenclature will override accuracy/precision - when we change the nomenclature, we'll need to do it everywhere at once, I think.

I am tempted to make progress on a better evaluation-specific declaration language before implementing tickets that further complicate the existing language, but that is obviously a much bigger task and not something we've discussed or agreed as near-term work.

epag commented 2 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2022-12-09T13:46:07Z

Putting this on the backlog until progress is made with the declaration language.

epag commented 2 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2023-01-04T13:29:59Z

In practice, this will be a breaking change insofar as file names will change, which is part of the API because the time windows used in file names will need to be fully qualified, including for lead durations.

epag commented 2 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2023-06-05T16:54:29Z

Makes sense to target v6.15 for this one, as the new declaration language will be live.

epag commented 2 months ago

Original Redmine Comment Author Name: Hank (Hank) Original Date: 2023-10-24T16:24:07Z

I'm delaying this to 6.17, as I doubt it will be done by the end of the week.

Hank

epag commented 2 months ago

Original Redmine Comment Author Name: Evan (Evan) Original Date: 2023-11-15T16:19:03Z

I missed this one, is this in 6.17? Is there UAT?

epag commented 2 months ago

Original Redmine Comment Author Name: Hank (Hank) Original Date: 2023-11-15T18:04:36Z

James:

Do you want to tackle this as part of 6.18 or slip it to the backlog?

Hank

epag commented 2 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2023-11-15T18:17:51Z

In general, it's probably better to put everything on the backlog and then select from the backlog for the next release at the moment it can be nominated confidently, but it's OK to collapse that process in the really obvious cases. Otherwise, we're left performing musical chairs each time. In short, backlog.

epag commented 2 months ago

Original Redmine Comment Author Name: Hank (Hank) Original Date: 2023-11-15T18:26:55Z

Done,

Hank

james-d-brown commented 3 weeks ago

Looking at this one.

james-d-brown commented 3 weeks ago

First task is to determine the appropriate declaration. I am leaning towards the following in its most complete form (example with two pools):

time_pools:
  - lead_times:
      minimum: 1
      maximum: 6
      unit: hours
    reference_dates:
      minimum: 2551-03-17T00:00:00Z
      maximum: 2551-03-20T00:00:00Z
    valid_dates:
      minimum: 2551-03-18T00:00:00Z
      maximum: 2551-03-21T00:00:00Z  
  - lead_times:
      minimum: 7
      maximum: 12
      unit: hours
    reference_dates:
      minimum: 2551-03-21T00:00:00Z
      maximum: 2551-03-23T00:00:00Z
    valid_dates:
      minimum: 2551-03-22T00:00:00Z
      maximum: 2551-03-24T00:00:00Z

Or, in the more succinct flow style:

time_pools:
  - lead_times: { minimum: 1, maximum: 6, unit: hours }
    reference_dates: {minimum: 2551-03-17T00:00:00Z, maximum: 2551-03-20T00:00:00Z }
    valid_dates: { minimum: 2551-03-18T00:00:00Z, }, maximum: 2551-03-21T00:00:00Z  }
  - lead_times: { minimum: 7, maximum: 12, unit: hours }
    reference_dates: {minimum: 2551-03-21T00:00:00Z, maximum: 2551-03-23T00:00:00Z }
    valid_dates: { minimum: 2551-03-22T00:00:00Z, }, maximum: 2551-03-24T00:00:00Z  }

The repetition of the lead_time unit is a bit yucky. This is arguably a consequence of not using the ISO8601 standard for durations, which have the units embedded, but users find rather confusing (e.g., PT1H). Also, abstracting that out to a higher level or implementing a default may not be less confusing, since a time window is three-dimensional and this behavior is consistent with the overall declaration of lead_time bounds where the unit is required. Overall, I think we accept clarity at the expense of repetition for the lead duration unit.

james-d-brown commented 3 weeks ago

Each time pool is (up to) three-dimensional, with two bookends for each pool. When a dimension is undeclared, its defaults are taken from the relevant overall time bounds that are declared or the infinite interval, in that order.

Given the schema more generally, they do need to be declared in terms of a minimum and maximum, I think.

This will, however, introduce a slight complication insofar as the automatically generated pools are non-overlapping by design, so the lower bound of each pool is always exclusive. To ensure a consistent interpretation, regardless of origin, we'll need to adjust the declared minimum slightly, in code, to be the smallest possible temporal unit (that being a nanosecond) before the declared minimum. That way, the minimum is always inclusive, in practice, in keeping with the declaration, but the time windows have a consistent interpretation in code, regardless of whether they were explicitly declared or automatically generated.

james-d-brown commented 3 weeks ago

I will await any input and begin implementation tomorrow.

HankHerr-NOAA commented 3 weeks ago

James:

What you propose seems reasonable to me. Can you explain the interaction between the pools defined in time_pools and the top level pools? Would these pools just augment the list of pools created by the top level declaration?

Thanks, Hank

james-d-brown commented 3 weeks ago

I expect they will be additive to the automatically generated pools from lead_time_pools, reference_date_pools and valid_date_pools, but there will probably be a declaration warning. There may be situations where both make sense in the same evaluation, i.e., a regular sequence, together with an explicit pool or two, defined separately, so we probably shouldn't disallow.

HankHerr-NOAA commented 3 weeks ago

Sounds reasonable to me. Thanks!

Hank

james-d-brown commented 2 weeks ago

Each time pool is (up to) three-dimensional, with two bookends for each pool. When a dimension is undeclared, its defaults are taken from the relevant overall time bounds that are declared or the infinite interval, in that order.

Schema updated and some unit tests written for deserializing via the DeclarationFactory. Currently working on unit tests for the above part in the DeclarationInterpolator and will then make them pass.

james-d-brown commented 2 weeks ago

Have the failing tests, making them pass.

epag commented 2 weeks ago

Moving this to 6.28 unless its done and we end up needing an RC2

james-d-brown commented 2 weeks ago

Have the failing tests, making them pass.

Failing tests are now passing.

Adding a declaration validation warning and associated unit test when a user declares both a pool sequence and an explicit list of pools. This is allowed, but is potentially unintended.

james-d-brown commented 2 weeks ago

Adding a declaration validation warning and associated unit test when a user declares both a pool sequence and an explicit list of pools. This is allowed, but is potentially unintended.

Have that done. Will run system tests and then push progress.

james-d-brown commented 2 weeks ago

Question for the audience.

Consider the following declaration:

lead_times:
  minimum: 0
  maximum: 48
  unit: hours
lead_time_pools:
  period: 24
  unit: hours

This produces the following pool sequence: {(PT0S,PT24H],(PT24H,PT48H]}. In this notation, ( means that the boundary is excluded and ] means it is included. In other words, these two pools do not overlap, the first pool covers the first 24 hours and the second pool covers the second 24 hours - the 24th hour precisely is included in the lower pool, excluded from the upper pool.

Now, imagine the following declaration for explicit pools. Without thinking too carefully (i.e., rely on intuition, as a user probably would), how would you declare those two pools, replacing the metasyntactic variables with actual times (in hours)?

time_pools:
  - lead_times:
      minimum: foo
      maximum: bar
      unit: hours
  - lead_times:
      minimum: baz
      maximum: qux
      unit: hours

When answering, please don't look to see what anyone else wrote. Thanks!

HankHerr-NOAA commented 2 weeks ago

"metasyntatic"? Is that even a word?

In computer programming, a metasyntactic variable is a word or set of words used as a placeholder in source code.

Yes, it is. I don't recall learning that word in my computer science classes, though that was over 25 years ago. :)

If I'm understanding the question correctly...

time_pools:
  - lead_times:
      minimum: 0
      maximum: 24
      unit: hours
  - lead_times
      minimum: 24
      maximum: 48
      unit hours

Let me know if I misunderstood,

Hank

james-d-brown commented 2 weeks ago

"metasyntatic"? Is that even a word?

In computer programming, a metasyntactic variable is a word or set of words used as a placeholder in source code.

Yes, it is. I don't recall learning that word in my computer science classes, though that was over 25 years ago. :)

If I'm understanding the question correctly...
time_pools:
  - lead_times:
      minimum: 0
      maximum: 24
      unit: hours
  - lead_times
      minimum: 24
      maximum: 48
      unit hours
Let me know if I misunderstood,

Hank

Haha. Thanks, you understood correctly. I'll wait to see what Evan says but, for what it's worth, that is what I would intuitively have written too.

james-d-brown commented 2 weeks ago

Evan, could you take a look at this post when you get a minute, without reading any subsequent posts (other than this one :))?

https://github.com/NOAA-OWP/wres/issues/245#issuecomment-2429269334

Thanks!

epag commented 2 weeks ago

This produces the following pool sequence: {(PT0S,PT24H],(PT24H,PT48H]}. In this notation, ( means that the boundary is excluded and ] means it is included. In other words, these two pools do not overlap, the first pool covers the first 24 hours and the second pool covers the second 24 hours - the 24th hour precisely is included in the lower pool, excluded from the upper pool.

Now, imagine the following declaration for explicit pools. Without thinking too carefully (i.e., rely on intuition, as a user probably would), how would you declare those two pools, replacing the metasyntactic variables with actual times (in hours)?

I would think it something to be like the following, if I am understanding the question correctly (You want 2 pools, one going from 0-24 and one from 25-48):

time_pools:
  - lead_times:
      minimum: 0
      maximum: 24
      unit: hours
  - lead_times:
      minimum: 24
      maximum: 48
      unit: hours

I will say thought, if you had not included this description: {(PT0S,PT24H],(PT24H,PT48H]} the I may have been inclined to say the following:

time_pools:
  - lead_times:
      minimum: 0
      maximum: 24
      unit: hours
  - lead_times:
      minimum: 25
      maximum: 48
      unit: hours

This little snipped made me think that the min is not inclusive

james-d-brown commented 2 weeks ago

Thanks both. Again, FWIW, I would've done the same thing.

So here is the dilemma.

The words minimum and maximum were explicitly chosen throughout the declaration language to imply inclusive bounds. A minimum is the smallest possible value of a thing. The maximum is the largest possible value. Both are possible values.

However, when users declare pools, they generally don't want that interpretation, they want one of the bounds to be exclusive so that they can declare pools in the most intuitive way, which is what you both did. We cannot have data falling in two pools simultaneously, unless a user really intended that, and what both of you declared would cause this to occur, were we to take the strict definition of minimum and maximum used elsewhere.

This leaves a few possibilities, none of them ideal:

We strictly interpret the terms as inclusive. This is probably the worst option, given that it's contrary to intuitive use in this context.
We interpret the terms leniently in this context and document them. In other words, the minimum is exclusive and the maximum is inclusive. There are some advantages to this, namely it is consistent with the intuitive definition of a pool sequence and it is consistent with the way pools are generated elsewhere (which is why I included pool generation in the example). The downside is that the language of a minimum doesn't intuitively map to how it is interpreted.
We use different, more explicit terms. Specifically, we use minimum_exclusive and maximum. This is probably also a good option. It nudges users into thinking about what they are doing. The main downside is that it uses yet another term for a bookend and, so far, we have managed to use minimum and maximum everywhere. So it complicates the declaration language.
We retain the terms minimum and maximum and interpret them strictly, but we adjust the minimum value in code to be one instant lower than the declared value. This is probably not a good option, on reflection, because it leads to "magic", which can be confusing to users in edge cases where they think carefully and really do want their declared value to be inclusive.

Thoughts? I am leaning towards (2) or (3). For reasons of elegance, I lean slightly towards (2) of those options, but it does risk some confusion for those users who think about the meaning of the terms and how they are used elsewhere in the language. We are basically resorting to documentation of a counterintuitive feature for those users that are thinking hard. For the average user, it "just works". Option (3) has the virtue of being completely clear and straightforward, but it complicates the language and it is somewhat inelegant.

epag commented 2 weeks ago

Hmmmm, I see the issue. Personally im not a huge fan of min and max having difference inclusive and exclusive meaning. I can't think of anything better though so this is just me screaming into the void

james-d-brown commented 2 weeks ago

So you lean to option (3)?

epag commented 2 weeks ago

Id say very slightly. I totally get the added terms not being preferred, but one term being exclusive and one being inclusive feels a bit too tribal knowledge for my likes personally.

I don't have a super strong preference though to be clear

HankHerr-NOAA commented 2 weeks ago

I lean toward 2, though I'm not opposed to 3. I'm not as worried about the "tribal knowledge" aspect so long as its well documented. Perhaps if we get user feedback indicating at least one user was confused, I would think differently, but no one has voiced any confusion in the past. Though admittedly this new feature could lead to that confusion.

Thanks,

Hank

james-d-brown commented 2 weeks ago

Thanks both.

It's a close call, but I will start with (2), including documentation of how the minimum should be interpreted in this context, and will then codify this with (3) by calling it minimum_exclusive if users find it confusing. Fortunately, it would be a simple codification or clarification and not a breaking change because the interpretation with (2) is "minimum exclusive", but we retain the simpler language, initially, and migrate from documentation to more precise declaration language if needed.

james-d-brown commented 2 weeks ago

I think this one is done. Will perform some testing of various scenarios before merging in.

james-d-brown commented 2 weeks ago

This produces expected results, five windows of which the first three are generated from a sequence (lead_time_pools) and the last two are declared manually (time_pools). Obviously an artificial composition, but it works, and there is an expected warning too.

lead_times:
  minimum: 0
  maximum: 3
  unit: hours
lead_time_pools:
  period: 1
  unit: hours
time_pools:
  - lead_times:
      minimum: 3
      maximum: 4
      unit: hours
  - lead_times:
      minimum: 4
      maximum: 5
      unit: hours

2024-10-23T17:49:52.037+0000 23268 [main] WARN DeclarationValidator - Encountered 2 warning(s) when validating the declared evaluation: 
    - The declaration contained both an explicit list of 'time_pools' and an implicitly declared sequence of pools: ['lead_time_pools']. This is allowed, and the resulting pools from all sources will be added together. If this was not intended, please adjust your declaration.

DRRC2HSF_DRRC2HSF_HEFS_SAMPLE_SIZE

james-d-brown commented 2 weeks ago

Ah, needs some work to set the graphics options correctly to one of the pooling window types when there are explicit pooling windows whose date components (valid dates or issued dates) vary across pools. Currently, this is triggered by the presence of valid_date_pools and issued_date_pools alone.

james-d-brown commented 2 weeks ago

Better (in the presence of explicit time windows with varying valid dates and lead times):

DRRC2HSF_DRRC2HSF_HEFS_SAMPLE_SIZE

james-d-brown commented 2 weeks ago

Obviously, these plots are a bit limited, but that is nothing new and graphics are not the core output from WRES, rather numbers.

james-d-brown commented 1 week ago

This one is ready for UAT.

HankHerr-NOAA commented 1 week ago

I'm going to use the rest of my Friday to start testing this ticket. I'm going to begin with basic, in-memory evaluations of a simple RFC forecast for FROV2.

Hank

HankHerr-NOAA commented 1 week ago

A first evaluation attempt using just 'time_pools' was successful with two explicit pools. The output was as expected. However, when I combined it with the original declaration using 'lead_time_pools', I encountered a netCDF error. Here is the YAML snippet:

lead_time_pools:
  period: 6
  frequency: 6
  unit: hours
lead_times:
  minimum: 0
  maximum: 120
  unit: hours

time_pools:
  - lead_times:
      minimum: 12
      maximum: 18
      unit: hours
  - lead_times:
      minimum: 24
      maximum: 48
      unit: hours

Here is the exception:

2024-10-25T18:32:22.216+0000 3577936 [main] ERROR Main - Operation 'execute' completed unsuccessfully
wres.pipeline.InternalWresException: Could not complete project execution
        at wres.pipeline.Evaluator.evaluate(Evaluator.java:406)
        at wres.pipeline.Evaluator.evaluate(Evaluator.java:178)
        at wres.Functions.evaluate(Functions.java:135)
        at wres.Functions.evaluate(Functions.java:97)
        at wres.helpers.MainUtilities.call(MainUtilities.java:107)
        at wres.Main.completeExecution(Main.java:206)
        at wres.Main.main(Main.java:126)
Caused by: wres.pipeline.WresProcessingException: Encountered an error while processing evaluation '-X-aKK3-D2bEaf_Ds5StS4SLh1w': 
        at wres.pipeline.Evaluator.evaluate(Evaluator.java:835)
        at wres.pipeline.Evaluator.evaluate(Evaluator.java:377)
        ... 6 common frames omitted
Caused by: java.lang.IllegalStateException: Cannot write to /tmp/wres_evaluation_-X-aKK3-D2bEaf_Ds5StS4SLh1w/WRDS_AHPS_Streamflow_Forecasts_20241025T174044Z_48_HOURS.nc because it already exists.
        at wres.writing.netcdf.NetcdfOutputFileCreator2.create(NetcdfOutputFileCreator2.java:98)
        at wres.writing.netcdf.NetcdfOutputWriter.createBlobsAndBlobWriters(NetcdfOutputWriter.java:591)
        at wres.writing.netcdf.NetcdfOutputWriter.createBlobsForWriting(NetcdfOutputWriter.java:490)
        at wres.pipeline.EvaluationUtilities.createNetcdfBlobs(EvaluationUtilities.java:261)
        at wres.pipeline.Evaluator.evaluate(Evaluator.java:684)
        ... 7 common frames omitted

It appears to be due to one of the 'time_pools' ending at 48 hours while one of the lead_time_pools' also ends at 48 hours. I confirmed it by changing themaximumto 44 hours for the secondtime_pools` entry.

Just reporting the exception; I'm not positive what the best fix would be. If you need a fully reproduceable example, let me know. I'm hoping that you already have one because its late on a Friday. :) Thanks,

Hank

HankHerr-NOAA commented 1 week ago

This might have been commented on before (would need to look above), but its interesting how the declaration with two pools ending at 48 hours yields both points on the plot. I've attached the image, below. I think its correct, though I don't necessarily know how the WRES decides which 48 connects to 42 and which connects to the other 48. Maybe plotting the individual time_pools as disconnected points would be better? Maybe given the oddity of the evaluation, we shouldn't worry about it.

Regardless, the numbers and output look reasonable,

Hank

01631000_FROV2_WRDS_AHPS_Streamflow_Forecasts_SAMPLE_SIZE

HankHerr-NOAA commented 1 week ago

I'm done with my testing today. I need to pickup my kid and plan to do some email review and purging after I return. Next week, I'll add some different tests using different data, including multiple features, and test reference and valid date pools, as well.

Have a great weekend!

Hank

james-d-brown commented 1 week ago

I think the dots are connected when all the other attributes of the pool (and threshold) are identical, but the dots won't be joined if there are other things that vary. In any case, this is existing behavior, nothing special to time_pools. The netcdf error sounds like a bug, though, or rather a weakness in how we qualify that output type (insufficiently). Will require a more comprehensive file name, at least in these circumstances, I suppose.

NOAA-OWP / wres

As a user, I would like to declare each pool within an irregular sequence of pools for evaluation #245