Closed SorooshMani-NOAA closed 1 year ago
This is the result after the fix:
Note the difference with the previous plots. Here the start_date
and forecast_time
are the same. start_date
specifies the time to start perturbing, and forecast_time
specifies what advisory to look at.
@SorooshMani-NOAA I was going to ask about that difference between start_date
and forecast_time
. I don't understand what is meant by the time to start perturbing at, especially if start_date
is backwards in time from forecast_time
.
@WPringle I think we need to discuss how we want it to work. start_date
and end_date
are filters in stormevents
. The same thing is true for forecast_time
. The difference is that the first two filters based on forecasted time and the last one based on track start time. Now I just added forecast_time
to ensembleperturbation
by treating them like the other two, just as I did for stormevents
, but I didn't really realize the difference it can make in perturbations. I need to look deeper into the use of start_date
for the perturbation, maybe we just need to ensure that start_date
is always equal to forecast_time
whereever specified. What do you think?
Let me also play a little bit with the update and see if I find any new aspect of it.
@WPringle I took another look at the code, and I can summarize the impact of each date/time as follows.
As discussed previous comments, in stormevents.nhc.VortexTrack
:
start_date
and end_date
-> filter track-dataframe based on datetime
column (i.e. forecasted time)forecast_time
-> filters the dataframe based on track_start_time
columns (i.e. when forecast is issued)In ensembleperturbation
these date/times are passed to the stormevents.nhc.VortexTrack
object as is, so that the unperturbed track is filtered as described above. However, in addition to that the start_date
is used to calculate the validation_times
, taken to be self.track.data['datetime'] - self.track.start_date
, in other words the difference between datetime
column and the specified input start_time
.
With this description maybe from ensembleperturbation
point of view it might make sense to just get the start_date
as input and then use it as both start_date
and forecast_time
when passing to VortexTrack
from stormevents
to get the unperturbed track. What do you think?
Update
I also see a usecase for a perturb_time
input so that the perturbation doesn't start at the beginning of track. But that is not relevant directly to this current fix, just something to keep in mind as we think about adding arguments.
@SorooshMani-NOAA Thanks for the thorough investigation! I think that it would make sense that the validation_times
are computed from forecast_time
not start_date
, as that is the intended meaning. This would mean start_date
and end_date
could be used to filter a portion of the forecast from forecast_time
but would not be a necessary input if we just want the whole forecast.
Although I could see a use case of end_date
if we don't want to use all 168 forecast hours I can hardly think of a good use case of start_date
not being equal to forecast_time
, so I agree that the suggestion to just use start_date
equal to forecast_time
could be a good option. Then I suppose we don't actually need the forecast_time
kwarg.
I think these two paragraphs are a little in conflict, so I just agree that keep start_date
and end_date
and use start_date
to compute validation_times
and then all we need to do is just set forecast_time
= start_date
when getting the unperturbed track.
@WPringle I made the updates we discussed above, just one last thing. Should we still filter if no start_date
is passed? I think we should, but I just wanted to check with you to see if you can think of a test case where we shouldn't filter if it's not passed.
Not filtering results in the same issue we originally opened this ticket for
@SorooshMani-NOAA if no start_date
is passed then how would it know which advisory (forecast_time
) to use?
@SorooshMani-NOAA if no
start_date
is passed then how would it know which advisory (forecast_time
) to use?
I see, so I see you implemented the first possible start_date
. I guess this is fine, yeah. Or raise an error saying cannot do advisory a
deck without start_date
specified.
@WPringle I just noticed two major questions/issues with advisories here:
How should modified advisory files be handled in stormevents
and ensembleperturbation
? In the test_advisory_ensemble.py
test we use a modified advisory file (florence_advisory.22
) where the date tags for each forecast hour is different:
AL, 06, 2018091218, , OFCL, 0, 304N, 719W, 110, 943, HU, 34, NEQ, 170, 140, 100, 140, 1013, 0, 15, , , , , , 0, 0, , 1
AL, 06, 2018091218, , OFCL, 0, 304N, 719W, 110, 943, HU, 50, NEQ, 100, 80, 60, 80, 1013, 0, 15, , , , , , 0, 0, , 1
AL, 06, 2018091218, , OFCL, 0, 304N, 719W, 110, 943, HU, 64, NEQ, 60, 60, 40, 60, 1013, 0, 15, , , , , , 0, 0, , 1
AL, 06, 2018091221, , OFCL, 3, 309N, 725W, 105, 949, HU, 34, NEQ, 170, 140, 100, 140, 1013, 0, 15, , , , , ,314, 7, , 2
AL, 06, 2018091221, , OFCL, 3, 309N, 725W, 105, 949, HU, 50, NEQ, 100, 80, 60, 80, 1013, 0, 15, , , , , ,314, 7, , 2
AL, 06, 2018091221, , OFCL, 3, 309N, 725W, 105, 949, HU, 64, NEQ, 60, 60, 40, 60, 1013, 0, 15, , , , , ,314, 7, , 2
AL, 06, 2018091306, , OFCL, 12, 321N, 741W, 110, 943, HU, 34, NEQ, 170, 140, 100, 140, 1013, 0, 15, , , , , ,312, 6, , 3
AL, 06, 2018091306, , OFCL, 12, 321N, 741W, 110, 943, HU, 50, NEQ, 100, 80, 60, 80, 1013, 0, 15, , , , , ,312, 6, , 3
AL, 06, 2018091306, , OFCL, 12, 321N, 741W, 110, 943, HU, 64, NEQ, 60, 60, 40, 60, 1013, 0, 15, , , , , ,312, 6, , 3
AL, 06, 2018091318, , OFCL, 24, 334N, 759W, 110, 943, HU, 34, NEQ, 170, 140, 100, 140, 1013, 0, 15, , , , , ,311, 5, , 4
AL, 06, 2018091318, , OFCL, 24, 334N, 759W, 110, 943, HU, 50, NEQ, 100, 80, 60, 80, 1013, 0, 15, , , , , ,311, 5, , 4
AL, 06, 2018091318, , OFCL, 24, 334N, 759W, 110, 943, HU, 64, NEQ, 60, 60, 40, 60, 1013, 0, 15, , , , , ,311, 5, , 4
...
where as In the original ATCF file, all of them have the same datetag (i.e. track_start_time
) for each advisory issued. Most of the logic in stormevents
revolves around handling data downloaded from NHC, now the question is should we handle the modified ones (like the one above)? Ideally we want to be able to handle the modified files, so how about assuming .22
is modified and .dat
is unmodified?
When we output from an original (non-modified) advisory file (e.g. OFCL from Florence downloaded from NHC server) to a .22
format, we don't modify the date tags. For both SCHISM and ADCIRC, the date tags should be track_start_time + forecast_hours
, just like what we see for BEST
. This basically means we need to generate the modified advisory discussed in item 1. So we can follow the same suggested logic above and say for .22
files we output date tags that are modified (to comply with SCHISM and ADCIRC), but for .dat
we output datetags based on NHC convention.
This is a bit of a larger undertaking that I originally thought, but it is required in order to be able to use stormevents
and ensembleperturbation
for setting up non-BEST
track simulations.
As a side note, what I did in ondemand-storm-workflow
was to fake the tracks as being BEST
so that I don't run into the issues described above.
@WPringle I just noticed two major questions/issues with advisories here:
- How should modified advisory files be handled in
stormevents
andensembleperturbation
? In thetest_advisory_ensemble.py
test we use a modified advisory file (florence_advisory.22
) where the date tags for each forecast hour is different:AL, 06, 2018091218, , OFCL, 0, 304N, 719W, 110, 943, HU, 34, NEQ, 170, 140, 100, 140, 1013, 0, 15, , , , , , 0, 0, , 1 AL, 06, 2018091218, , OFCL, 0, 304N, 719W, 110, 943, HU, 50, NEQ, 100, 80, 60, 80, 1013, 0, 15, , , , , , 0, 0, , 1 AL, 06, 2018091218, , OFCL, 0, 304N, 719W, 110, 943, HU, 64, NEQ, 60, 60, 40, 60, 1013, 0, 15, , , , , , 0, 0, , 1 AL, 06, 2018091221, , OFCL, 3, 309N, 725W, 105, 949, HU, 34, NEQ, 170, 140, 100, 140, 1013, 0, 15, , , , , ,314, 7, , 2 AL, 06, 2018091221, , OFCL, 3, 309N, 725W, 105, 949, HU, 50, NEQ, 100, 80, 60, 80, 1013, 0, 15, , , , , ,314, 7, , 2 AL, 06, 2018091221, , OFCL, 3, 309N, 725W, 105, 949, HU, 64, NEQ, 60, 60, 40, 60, 1013, 0, 15, , , , , ,314, 7, , 2 AL, 06, 2018091306, , OFCL, 12, 321N, 741W, 110, 943, HU, 34, NEQ, 170, 140, 100, 140, 1013, 0, 15, , , , , ,312, 6, , 3 AL, 06, 2018091306, , OFCL, 12, 321N, 741W, 110, 943, HU, 50, NEQ, 100, 80, 60, 80, 1013, 0, 15, , , , , ,312, 6, , 3 AL, 06, 2018091306, , OFCL, 12, 321N, 741W, 110, 943, HU, 64, NEQ, 60, 60, 40, 60, 1013, 0, 15, , , , , ,312, 6, , 3 AL, 06, 2018091318, , OFCL, 24, 334N, 759W, 110, 943, HU, 34, NEQ, 170, 140, 100, 140, 1013, 0, 15, , , , , ,311, 5, , 4 AL, 06, 2018091318, , OFCL, 24, 334N, 759W, 110, 943, HU, 50, NEQ, 100, 80, 60, 80, 1013, 0, 15, , , , , ,311, 5, , 4 AL, 06, 2018091318, , OFCL, 24, 334N, 759W, 110, 943, HU, 64, NEQ, 60, 60, 40, 60, 1013, 0, 15, , , , , ,311, 5, , 4 ...
where as In the original ATCF file, all of them have the same datetag (i.e.
track_start_time
) for each advisory issued. Most of the logic instormevents
revolves around handling data downloaded from NHC, now the question is should we handle the modified ones (like the one above)? Ideally we want to be able to handle the modified files, so how about assuming.22
is modified and.dat
is unmodified?
- When we output from an original (non-modified) advisory file (e.g. OFCL from Florence downloaded from NHC server) to a
.22
format, we don't modify the date tags. For both SCHISM and ADCIRC, the date tags should betrack_start_time + forecast_hours
, just like what we see forBEST
. This basically means we need to generate the modified advisory discussed in item 1. So we can follow the same suggested logic above and say for.22
files we output date tags that are modified (to comply with SCHISM and ADCIRC), but for.dat
we output datetags based on NHC convention.This is a bit of a larger undertaking that I originally thought, but it is required in order to be able to use
stormevents
andensembleperturbation
for setting up non-BEST
track simulations.As a side note, what I did in
ondemand-storm-workflow
was to fake the tracks as beingBEST
so that I don't run into the issues described above.
If both SCHISM and ADCIRC require the dates to be updated then I guess it does make sense to have the fort.22
be the modified versions for compliance and the .dat
for the original style. Actually originally there is code inside ADCIRC Holland that should handle either BEST
or OFCL
in these different ways. Then with new holland model GAHM it only reads the forecast_hours
which is why I made that other PR for the best-track.
@WPringle thanks for testing. About the modified vs original format:
if both SCHISM and ADCIRC require the dates to be updated then I guess it does make sense to have the fort.22
I cannot speak for ADCIRC, but in SCHISM (PaHM) it seems that it doesn't care about either forecast_hours
or OFCL
vs BEST
it just reads the date tags. That's why I want to make the distinction to make it manageable. So in ADCIRC, if we have OFCL
but we also have differing date tags for different forecast hours within the same advisory (like BEST
) is it going to be a problem?
If it's a problem, we might need to make the formatting more explicit, e.g. argument for what to output, or something like that.
@SorooshMani-NOAA It may make sense to have a .22
option being for direct use in ADCIRC and .pahm
file or similar to be compliant with PaHM? Then we can be clear about which is compliant with which.
That makes sense. So should I leave .22
format as is and just update thing for .pahm
output?
@SorooshMani-NOAA Yeah if you think so. Does PaHM require a certain file suffix or can be any such as .pahm
?
No, the input file for PaHM needs to be specified in a separate file, in the PaHM-SCHISM case it is hardcoded to be hurricane-track.dat
. But .dat
is our original format which we don't want to touch.
By the way, I still need to modify the code to handle reading of .dat
vs .22
differently. So let's take a step back. In ADCIRC, when we have a modified track file, should the date tags be increasing (like BEST
) or it doesn't matter? Because without changing anything, right now we're running into issues (the currently failing test) with a modified .22
track file. So I have to add some code to handle different .22
cases as well (in terms of date tags, etc.)
@SorooshMani-NOAA The date tags should be increasing for BEST
and the date tags should be constant for OFCL
but with the forecast_hours
column filled out.
OK, so the advisory track that we have for this test is invalid then, right? https://github.com/noaa-ocs-modeling/EnsemblePerturbation/blob/13b58bb49af2c9b5a7de4775438695b9402cd656/tests/data/input/test_existing_advisory/florence_advisory.22#L1-L22
because we have both the increasing date/time tag as well as the forecast hours.
@SorooshMani-NOAA That's true it is invalid.
OK great! I think I can fix the file here and then merge this PR, then move back to stormevents
to add formatting needed for .pahm
, finally I'll modify ensembleperturbation
in a separate PR to replace .22
output for perturbations with .pahm
or something like that
Attention: Patch coverage is 96.49123%
with 2 lines
in your changes missing coverage. Please review.
Project coverage is 21.22%. Comparing base (
13b58bb
) to head (c67d045
). Report is 21 commits behind head on main.
Files | Patch % | Lines |
---|---|---|
ensembleperturbation/plotting/perturbation.py | 0.00% | 2 Missing :warning: |
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
The following issue will be addressed with this pull request: