Closed sentientc closed 1 month ago
The nominal times of the two input steps and of the output step will always be multiples of 6 hours, and consecutive steps will be separated by 6 hours.
E.g. the model may take midnight and 6 am as input, and will output 12 pm.
So you are saying the time variable in the output is not correct? From my understanding of the paper, the lead time in your example is 6 hours (12 pm - 6 am). To reproduce this in the notebook example, change the cell "Extract training and eval data" to:
train_inputs, train_targets, train_forcings = data_utils.extract_inputs_targets_forcings(
    example_batch, target_lead_times=slice("0h", "6h"),
    **dataclasses.asdict(task_config))
eval_inputs, eval_targets, eval_forcings = data_utils.extract_inputs_targets_forcings(
    example_batch, target_lead_times=slice("6h", "18h"),
    **dataclasses.asdict(task_config))
This results in prediction.time=6,12,18,... So is the prediction at 6 actually the one for 12? Note: the first output actually has 0 lead time, and it seems to also accept "0h", which is a negative lead time.
Changing the lead time doesn't seem to produce a prediction without a phase difference. Does the time in the prediction refer to the state (t) used as input rather than its target (t + 6 hours)?
Sorry, I see what you are getting at; the names of the variables are a bit confusing here.
The time coordinate (and also the values you pass to extract_inputs_targets_forcings) is a relative timedelta coordinate which refers to "time passed" since the initialization time.
When making a prediction, the input values for those should always be -6h and 0h, and then the output values will always be 6h, 12h, 18h, 24h, etc., regardless of the actual time of day.
How does this translate to an actual date and time? You mostly need to know which date you are making the forecast from (that is, which date this '0h' corresponds to), and then add the time timedelta coordinate to that value to know which datetime a given prediction corresponds to.
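As a rough illustration of that conversion, here is a minimal sketch using only the Python standard library. The initialization time and the helper name valid_times are made up for the example; they are not part of graphcast.

```python
from datetime import datetime, timedelta

def valid_times(init_time, lead_hours):
    """Add each relative lead time (in hours) to the initialization time."""
    return [init_time + timedelta(hours=h) for h in lead_hours]

# Hypothetical initialization time: the datetime that '0h' corresponds to.
init_time = datetime(2014, 7, 1, 6, 0)

# Relative lead times of the predicted steps, as in the 'time' coordinate.
lead_hours = [6, 12, 18, 24]

forecast_datetimes = valid_times(init_time, lead_hours)
for h, t in zip(lead_hours, forecast_datetimes):
    print(f"{h:>2}h -> {t.isoformat()}")
```

With this initialization time, the 6h prediction corresponds to 12:00 on the same day, and the 24h prediction to 06:00 on the following day.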
If this is not clear, please take some screenshots of your data sequences before and after you call extract_inputs_targets_forcings, and I will explain in the context of that.
Hi, thanks. I really appreciate your answers. I don't have a screenshot as I am away from my workstation; I hope this suffices. I originally had "target_lead_times=slice('0h','72h',None)", but after reading your paper I realised that a 0h lead time is not a reasonable setting for forecasting purposes, so I changed to the current setting. Also, "0h" will probably mess up the time, since it is a timedelta from "0h". I am guessing the "6h" of train_inputs (the source file's 6h?) is the "0h" of eval_inputs, so the first prediction time is 6 hours after "2014-07-01 06:00:00 +0000z".
train_inputs, train_targets, train_forcings = data_utils.extract_inputs_targets_forcings(
    example_batch, target_lead_times=slice('0h', '6h', None),
    **dataclasses.asdict(task_config))
# step=1: original timestep; step=2: 2 * original timestep
eval_inputs, eval_targets, eval_forcings = data_utils.extract_inputs_targets_forcings(
    example_batch, target_lead_times=slice('6h', '72h', None),
    **dataclasses.asdict(task_config))
Coordinates:
So typically you would always use input_duration="12h" to indicate that your model needs two inputs, and target_lead_times=slice('6h', f'{6*N}h'), where N is the number of steps that you want to predict.
For example, if you set N=4, that is slice('6h','24h'), then your model will require a sequence of length 6: 2 steps for the inputs, and 4 steps for the outputs.
So if you feed, e.g., a sequence that has 20 steps to extract_inputs_targets_forcings, it will first crop everything but the last 6 steps, and then out of those 6 steps it will return the first 2 as the inputs (with times -6h, 0h) and the remaining 4 as the targets (with times 6h, 12h, 18h, 24h).
In order to know which actual dates these correspond to, you would look at the "datetime" coordinate of what was returned by extract_inputs_targets_forcings.
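The cropping and splitting described above can be sketched in plain Python. This is only a simplified illustration of the index and relative-time bookkeeping, assuming a fixed 6-hour step; the function split_inputs_targets is hypothetical and is not how graphcast implements extract_inputs_targets_forcings.

```python
def split_inputs_targets(num_steps, n_targets, n_inputs=2, step_hours=6):
    """Return (inputs, targets) as lists of (original_index, relative_hours).

    Keeps only the last n_inputs + n_targets steps of the sequence and
    assigns relative times so that the last input step is at 0h.
    """
    needed = n_inputs + n_targets
    if num_steps < needed:
        raise ValueError(f"need at least {needed} steps, got {num_steps}")

    # Crop everything but the last `needed` steps.
    kept = list(range(num_steps - needed, num_steps))

    # Relative times in hours: the last input gets 0, earlier inputs are negative.
    rel_hours = [(i - (n_inputs - 1)) * step_hours for i in range(needed)]

    inputs = list(zip(kept[:n_inputs], rel_hours[:n_inputs]))
    targets = list(zip(kept[n_inputs:], rel_hours[n_inputs:]))
    return inputs, targets

# A 20-step sequence with N=4 target steps, i.e. slice('6h', '24h').
inputs, targets = split_inputs_targets(num_steps=20, n_targets=4)
print("inputs: ", inputs)
print("targets:", targets)
```

For the 20-step example, the inputs are the steps at original indices 14 and 15 (relative times -6h and 0h), and the targets are indices 16 to 19 (relative times 6h to 24h).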
Hi, Great. I will test that when I can. Cheers.
Hi, I tried the Jupyter notebook provided and the prediction seems to be one timestep (6 hours?) ahead of the source. In graphcast/rollout.py, line 159:
current_forcings = forcings.isel(time=target_slice)
current_forcings = current_forcings.assign_coords(time=targets_chunk_time)
For the first prediction, target_slice=slice(0,1) and targets_chunk_time=0, and in your paper the forcing terms need the forcings at t-1, t, t+1. Does this mean the prediction time is actually 0.5 ((0+1)/2), or 3am UTC?