Hi, I am trying to run the GraphCast operational model with my own data, and there seems to be a problem with the xarray object I build from operational data.
When I run a script that gets the input data from Google Cloud, it works just fine, and the data look like this:
(Pdb) eval_inputs
<xarray.Dataset>
Dimensions: (batch: 1, time: 2, lat: 721, lon: 1440,
level: 13)
Coordinates:
* lon (lon) float32 0.0 0.25 0.5 ... 359.5 359.8
* lat (lat) float32 -90.0 -89.75 ... 89.75 90.0
* level (level) int32 50 100 150 200 ... 850 925 1000
* time (time) timedelta64[ns] -1 days +18:00:00 00...
Dimensions without coordinates: batch
Data variables: (12/17)
2m_temperature (batch, time, lat, lon) float32 250.3 ... 2...
mean_sea_level_pressure (batch, time, lat, lon) float32 9.936e+04 ....
10m_v_component_of_wind (batch, time, lat, lon) float32 -0.4746 ......
10m_u_component_of_wind (batch, time, lat, lon) float32 -5.817 ... ...
temperature (batch, time, level, lat, lon) float32 238....
geopotential (batch, time, level, lat, lon) float32 1.98...
... ...
year_progress_sin (batch, time) float32 0.006986 0.01129
year_progress_cos (batch, time) float32 1.0 0.9999
day_progress_sin (batch, time, lon) float32 0.0 ... 1.0
day_progress_cos (batch, time, lon) float32 1.0 ... 0.004363
geopotential_at_surface (lat, lon) float32 2.735e+04 ... -0.07617
land_sea_mask (lat, lon) float32 1.0 1.0 1.0 ... 0.0 0.0 0.0
When I build my own xarray object, it looks like this:
(Pdb) input_data
<xarray.Dataset>
Dimensions: (lat: 721, lon: 1440, time: 2, level: 13,
batch: 1)
Coordinates:
* lat (lat) float64 -90.0 -89.75 ... 89.75 90.0
* lon (lon) float64 -180.0 -179.8 ... 179.5 179.8
* time (time) timedelta64[ns] -1 days +18:00:00 00...
* level (level) float64 50.0 100.0 ... 925.0 1e+03
* batch (batch) int64 1
Data variables: (12/16)
temperature (batch, time, lat, lon, level) float32 239....
u_component_of_wind (batch, time, lat, lon, level) float32 1.65...
v_component_of_wind (batch, time, lat, lon, level) float32 -14....
geopotential (batch, time, lat, lon, level) float32 1.98...
specific_humidity (batch, time, lat, lon, level) float32 3.09...
10m_v_component_of_wind (batch, time, lat, lon) float32 -0.6771 ......
... ...
mean_sea_level_pressure (batch, time, lat, lon) float32 9.939e+04 ....
toa_incident_solar_radiation (batch, time, lat, lon) float64 554.8 ... 0.0
year_progress_sin (batch, time) float64 -0.008601 0.0
year_progress_cos (batch, time) float64 1.0 1.0
day_progress_sin (batch, time, lon) float64 -1.0 -1.0 ... 0.0
day_progress_cos (batch, time, lon) float64 -1.837e-16 ... 1.0
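Judging from the two printouts, the datasets differ in several ways: the longitude range (0.0 to about 359.8 in the tutorial data, -180.0 to about 179.8 in mine), the dimension order (`level` comes before `lat`/`lon` in the tutorial data and after them in mine), the coordinate dtypes, and a `batch` coordinate that only my dataset has. A minimal sketch for listing such differences programmatically follows. `describe_differences` is a hypothetical helper name, not part of graphcast or xarray, and it only reports structure, not which difference (if any) causes the error.

```python
import numpy as np
import xarray as xr


def describe_differences(reference: xr.Dataset, candidate: xr.Dataset) -> list:
    """List structural differences between two datasets (hypothetical helper)."""
    notes = []

    # Variables present in only one of the two datasets.
    for name in sorted(set(reference.data_vars) - set(candidate.data_vars)):
        notes.append(f"missing variable: {name}")
    for name in sorted(set(candidate.data_vars) - set(reference.data_vars)):
        notes.append(f"extra variable: {name}")

    # Dimension order and dtype of the variables both datasets share.
    for name in sorted(set(reference.data_vars) & set(candidate.data_vars)):
        ref, cand = reference[name], candidate[name]
        if ref.dims != cand.dims:
            notes.append(f"{name}: dims {cand.dims} != {ref.dims}")
        if ref.dtype != cand.dtype:
            notes.append(f"{name}: dtype {cand.dtype} != {ref.dtype}")

    # Coordinates that exist in only one dataset (for example `batch`).
    for name in sorted(set(reference.coords) ^ set(candidate.coords)):
        notes.append(f"coordinate only in one dataset: {name}")

    # Dtype and values of the coordinates both datasets share.
    for name in sorted(set(reference.coords) & set(candidate.coords)):
        ref, cand = reference[name], candidate[name]
        if ref.dtype != cand.dtype:
            notes.append(f"coordinate {name}: dtype {cand.dtype} != {ref.dtype}")
        if ref.shape != cand.shape or not np.array_equal(ref.values, cand.values):
            notes.append(f"coordinate {name}: values differ")

    return notes
```

With the objects shown above, this would be called as `describe_differences(eval_inputs, input_data)` and the returned list printed line by line.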
The problem is that when I run the model with the rollout.chunked_prediction method using the eval_inputs data, it works just fine, but when I use my input_data, I get the following error:
Traceback (most recent call last):
File "/home/eloy.anguiano/repos/graphcast/1.get_data.py", line 342, in <module>
predictions = rollout.chunked_prediction(
File "/home/eloy.anguiano/repos/graphcast/graphcast/rollout.py", line 68, in chunked_prediction
for prediction_chunk in chunked_prediction_generator(
File "/home/eloy.anguiano/repos/graphcast/graphcast/rollout.py", line 164, in chunked_prediction_generator
predictions = predictor_fn(
File "/home/eloy.anguiano/repos/graphcast/1.get_data.py", line 199, in <lambda>
return lambda **kw: fn(**kw)[0]
File "/home/eloy.anguiano/miniconda3/envs/graphcast_iic/lib/python3.10/site-packages/haiku/_src/transform.py", line 456, in apply_fn
out = f(*args, **kwargs)
File "/home/eloy.anguiano/repos/graphcast/1.get_data.py", line 165, in run_forward
return predictor(inputs, targets_template=targets_template, forcings=forcings)
File "/home/eloy.anguiano/repos/graphcast/graphcast/autoregressive.py", line 163, in __call__
self._validate_targets_and_forcings(targets_template, forcings)
File "/home/eloy.anguiano/repos/graphcast/graphcast/autoregressive.py", line 103, in _validate_targets_and_forcings
raise ValueError(f'Target variable {name} must be time-dependent.')
ValueError: Target variable geopotential_at_surface must be time-dependent.
It seems a bit strange, as that variable is not time-dependent in either dataset, so I would like to know whether something else is wrong with the data that raises this error. Here is the problematic variable in both datasets:
Tutorial data
My data
Could the longitude values be triggering an unhandled error? Does anyone have any tips on how to proceed?
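One way to test the longitude hypothesis would be to bring my dataset closer to the tutorial layout before calling the model. The sketch below is an assumption about what might matter, not a confirmed fix: `match_tutorial_layout` is a hypothetical helper, it assumes the dataset has exactly the dimensions `batch`, `time`, `level`, `lat`, and `lon`, and it is not confirmed that any of these differences is what triggers the `ValueError`.

```python
import numpy as np
import xarray as xr


def match_tutorial_layout(ds: xr.Dataset) -> xr.Dataset:
    """Reshape a dataset toward the tutorial layout (hypothetical helper)."""
    # Map longitudes from [-180, 180) to [0, 360) and sort them ascending.
    ds = ds.assign_coords(lon=np.mod(ds["lon"].values, 360.0)).sortby("lon")

    # Make `batch` a dimension without coordinates, as in the tutorial data.
    if "batch" in ds.coords:
        ds = ds.drop_vars("batch")

    # Put `level` before `lat`/`lon`. Assumes all five dimensions exist;
    # variables that lack some of them keep only the ones they have.
    ds = ds.transpose("batch", "time", "level", "lat", "lon")

    # Use integer pressure levels, as in the tutorial data.
    ds = ds.assign_coords(level=ds["level"].values.astype(np.int32))

    # Cast float64 data variables to float32.
    ds = ds.assign(
        {
            name: var.astype(np.float32)
            for name, var in ds.data_vars.items()
            if var.dtype == np.float64
        }
    )
    return ds
```

If the error persists after `input_data = match_tutorial_layout(input_data)`, the longitude range and dimension order can probably be ruled out as the cause.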