Closed Arcomano1234 closed 3 months ago
Hi @Arcomano1234, it would be helpful if you could provide example code here. When I checked, I couldn't find any flipped latitude values for the snow_depth variable for the given year.
In the example below, the latitude values for 1980 and 1981 are the same.
import xarray as xr
ar_full_37_1h = xr.open_zarr('gs://gcp-public-data-arco-era5/ar/full_37-1h-0p25deg-chunk-1.zarr-v3/')[['snow_depth']]
print(ar_full_37_1h.sel(time='1981-03-16T00:00:00.000000000').latitude.values)
print(ar_full_37_1h.sel(time='1980-03-16T00:00:00.000000000').latitude.values)
Hi, thank you for your quick response! After investigating further, I realized the problem is more subtle than I first thought. The latitude coordinates themselves are not flipped, but the data is. Here is a minimal reproducible example of the bug. The code below produces a matplotlib imshow image for each of the two dates in your example code (I have also attached the images below). On the "flipped" dates I mentioned in my previous comment, the outlines of Antarctica and Greenland are in the wrong locations (i.e., mirrored across the equator).
import xarray as xr
import matplotlib.pyplot as plt
ar_full_37_1h = xr.open_zarr('gs://gcp-public-data-arco-era5/ar/full_37-1h-0p25deg-chunk-1.zarr-v3/')[['snow_depth']]
correct_orientation = ar_full_37_1h.sel(time='1980-03-16T00:00:00.000000000')['snow_depth'].values
incorrect_orientation = ar_full_37_1h.sel(time='1981-03-16T00:00:00.000000000')['snow_depth'].values
plt.imshow(correct_orientation)
plt.title('Correctly Orientated Data Example')
plt.show()
plt.imshow(incorrect_orientation)
plt.title('Flipped Data Example')
plt.show()
Hello @Arcomano1234, if possible, please share the script you used to find the flipped data, or run your script on all of the years (1940-2023), so we can fix all the flipped data in the dataset.
I admit this is not the best method for detecting the flipped data, but it is the quickest and simplest. After inspecting the data, I assumed that the top latitude band (i.e., the North Pole) has no snow depth, so the script flags any time step where there is snow depth in that band. This script was able to detect all of the problem data from 1979 - 2023; however, I cannot verify whether it works for older data.
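import numpy as np
import xarray as xr

years = np.arange(1979, 2022)  # 1979 through 2021 inclusive
for year in years:
    ds = xr.open_dataset(f'{year}.nc')  # one local file per year
    times = ds.time
    for i in range(len(times)):
        # First latitude row (expected to be the North Pole), all longitudes
        var = ds['snow_depth'][i, 0, :].values
        # Snow depth here suggests the field is flipped north-south
        if np.mean(var) > 0.1:
            print('Flipped data at', times[i].values)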
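For anyone adapting the check above, here is a minimal vectorized sketch of the same heuristic on a synthetic array, so it runs without the yearly .nc files. The function name find_flipped_steps, the (time, latitude, longitude) layout, and the toy data are assumptions made for illustration; only the 0.1 threshold and the "no snow in the first latitude row" assumption come from the script above.

```python
import numpy as np

def find_flipped_steps(snow_depth, threshold=0.1):
    """Return indices of time steps whose first latitude row has a mean
    snow depth above `threshold` (hypothetical helper, same heuristic
    as the loop above). Expects shape (time, latitude, longitude)."""
    top_row_mean = snow_depth[:, 0, :].mean(axis=1)
    return np.flatnonzero(top_row_mean > threshold)

# Toy data: 3 time steps, 4 latitude rows (north to south), 5 longitudes.
data = np.zeros((3, 4, 5))
data[:, -1, :] = 2.0                   # snow only in the southernmost row
data[1] = data[1, ::-1, :].copy()      # flip time step 1 north-south

flipped = find_flipped_steps(data)
print('Flipped time steps:', flipped)
```

The heuristic still relies on the first latitude row being snow-free in correctly oriented data, so it carries the same caveat as the original script.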
https://github.com/google-research/arco-era5/issues/71 is a duplicate of this issue. See here for my code to reproduce: https://github.com/google-research/arco-era5/issues/71#issuecomment-2106227810
It appears that every variable has this issue on these dates.
Thank you for the update. I just checked and yes all of the variables I have used from ARCO-ERA5 dataset are flipped at the dates mentioned in my original post.
@Arcomano1234 @shoyer we were able to identify the root cause of the issue. The discrepancy arises because certain variables have a resolution of 0.5 * 0.5 degrees on these dates, whereas all the other variables have a resolution of 0.25 * 0.25 degrees. Consequently, during the creation of the dataset, the latitude values are reversed, resulting in reversed data for that particular date.
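A minimal sketch of how mixing grids can reverse latitude, using synthetic arrays rather than the real files. It assumes the reversal comes from xarray's default outer join when variables with different latitude coordinates are combined into one Dataset, which sorts the combined index in ascending order; that mechanism is an assumption here, not something confirmed in this thread. The variable names "a" and "b" and the two-point longitude axis are placeholders.

```python
import numpy as np
import xarray as xr

# ERA5-style latitude axes, ordered north to south (descending).
lat_025 = np.arange(90, -90.25, -0.25)   # 0.25 degree grid, 721 points
lat_05 = np.arange(90, -90.5, -0.5)      # 0.5 degree grid, 361 points
lon = np.array([0.0, 0.5])               # tiny placeholder longitude axis

def make_field(lat):
    """Build a zero-filled (latitude, longitude) DataArray on the given grid."""
    return xr.DataArray(
        np.zeros((lat.size, lon.size)),
        coords={"latitude": lat, "longitude": lon},
        dims=("latitude", "longitude"),
    )

fine = make_field(lat_025)
coarse = make_field(lat_05)

# Same grid for both variables: latitude order is kept as-is.
same_grid = xr.Dataset({"a": fine, "b": fine})
# Mixed grids: the coordinates differ, so xarray aligns them with an outer join.
mixed_grid = xr.Dataset({"a": fine, "b": coarse})

print("same grid :", same_grid.latitude.values[[0, -1]])
print("mixed grid:", mixed_grid.latitude.values[[0, -1]])
```

If the combined data is then written by position into a store whose latitude coordinate runs from 90 to -90, the values would land mirrored across the equator, which matches the symptom reported above.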
The following variables have a spatial resolution of 0.5 * 0.5 degrees on the specified dates:
'1965-11-22' -> wave_spectral_kurtosis
'1981-03-16' -> mean_wave_period_based_on_first_moment_for_swell
'1982-04-06' -> benjamin_feir_index
'1985-12-11' -> benjamin_feir_index
'1987-11-30' -> mean_wave_period_of_second_swell_partition
'1990-03-05' -> mean_direction_of_wind_waves
'1990-04-02' -> period_corresponding_to_maximum_individual_wave_height
'1990-08-12' -> mean_period_of_total_swell
'1997-05-15' -> significant_wave_height_of_third_swell_partition
'2002-03-17' -> peak_wave_period
'2003-11-26' -> mean_direction_of_wind_waves
'2004-02-10' -> mean_wave_direction_of_first_swell_partition
'2006-04-12' -> mean_wave_period_based_on_first_moment_for_swell
'2007-06-19' -> v_component_stokes_drift
'2009-03-05' -> wave_spectral_directional_width_for_wind_waves
'2013-11-11' -> mean_wave_period_of_third_swell_partition
'2014-05-11' -> significant_wave_height_of_third_swell_partition
'2017-03-17' -> mean_wave_period
'2020-05-19' -> benjamin_feir_index
P.S.: I already updated the .nc file for the above date for the above variables so you can't see the old files now. 😄
Here is an example code snippet for creating the datasets:
import pathlib
import xarray as xr

# SINGLE_LEVEL_SUBDIR_TEMPLATE and _read_nc_dataset are not defined here;
# they are assumed to come from the arco-era5 codebase.
def fun():
    year, month, day = 1981, 3, 16
    root_path = pathlib.Path("gs://gcp-public-data-arco-era5/raw")  # GCP path
    output = {}
    for variable in ['mean_wave_period_based_on_first_moment_for_swell', "mean_wave_period_based_on_first_moment"]:
        relative_path = SINGLE_LEVEL_SUBDIR_TEMPLATE.format(year=year, month=month, day=day, variable=variable)
        output[variable] = _read_nc_dataset(root_path / relative_path)
        print(output[variable])
        print("-------------------------")
    # Combine the variables into one dataset
    final_answer = xr.Dataset(output)
    print("final answer : ", final_answer)
P.P.S.: It should be noted that the .nc files have already been updated, so using the above code will yield accurate results (data is not flipped).
Furthermore, we will update the data of the zarr file (gs://gcp-public-data-arco-era5/ar/full_37-1h-0p25deg-chunk-1.zarr-v3/) within the next 2-3 days.
@Arcomano1234 @shoyer I updated the data of the zarr file (gs://gcp-public-data-arco-era5/ar/full_37-1h-0p25deg-chunk-1.zarr-v3/). After a thorough examination, I have concluded that there are no flipped data points; therefore, I am closing this issue.
Please feel free to reopen the issue if you find any flipped data in the future.
Thanks @dabhicusp, really appreciated!
It seems that for a select few dates the snow_depth variable is flipped (e.g., the latitudes are reversed) in this dataset: gs://gcp-public-data-arco-era5/ar/full_37-1h-0p25deg-chunk-1.zarr-v3/. I only checked values in 6 hour chunks from 1979 to 2021, but I found the fields are flipped for these dates: