pik-copan / pyunicorn

Unified Complex Network and Recurrence Analysis Toolbox
http://pik-potsdam.de/~donges/pyunicorn/

Error when selecting temporal window #167

Closed rimajj closed 2 years ago

rimajj commented 2 years ago

When I set "time_min" or "time_max" to a value other than zero, I get a traceback in climate.ClimateData.Load(): ValueError: zero-size array to reduction operation minimum which has no identity

I tried many different NetCDF files and always got the same error. The attached code can be used to reproduce it. If needed, I am happy to provide the NetCDF files I used as well.

temporal_window_error.txt

zugnachpankow commented 2 years ago

Dear @rimajj,

Thanks for reporting this problem. I was able to reproduce the error with the code you provided, and also with the climate network tutorial in /pyunicorn/examples/tutorials/climate_network.py. Since you said this happens with many NetCDF files, I used air.mon.mean.nc, the NetCDF4 file from the tutorial.

The problem (at least for this NetCDF4 file) is that "time_min" = 10. and "time_max" = 50., as you specified them, lie outside the time range covered by the file, so presumably no time steps are selected, which would explain the zero-size array in the error message. Values such as "time_min" = 1918248. and "time_max" = 1921128. work perfectly fine instead. I am not an expert in the NetCDF4 file structure, but apparently time is a bit awkward to handle in this format: https://unidata.github.io/netcdf4-python/#dealing-with-time-coordinates

Please make sure that the time_min and time_max you define make sense in the context of the NetCDF4 files you are using, and see whether this resolves your error. Please let me know if this helped.
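
In case it is useful, here is a minimal sketch of how a calendar date could be turned into such a numeric value. It assumes the time axis is given in hours since 1800-01-01, which is only an assumption about this file; the actual reference date and unit are stored in the `units` attribute of the time variable and should be checked first. The helper `hours_since_epoch` is a hypothetical name, not part of pyunicorn. For arbitrary units and calendars, the `date2num` function of netCDF4-python (see the link above) is the more general tool.

```python
from datetime import datetime

# ASSUMPTION: the time axis counts hours since 1800-01-01 00:00.
# Read the real reference date from the time variable's `units` attribute.
EPOCH = datetime(1800, 1, 1)


def hours_since_epoch(date, epoch=EPOCH):
    """Return `date` as hours elapsed since `epoch` (hypothetical helper)."""
    return (date - epoch).total_seconds() / 3600.0


# Candidate window boundaries, expressed in the file's (assumed) time units.
time_min = hours_since_epoch(datetime(2018, 11, 1))
time_max = hours_since_epoch(datetime(2019, 3, 1))
print(time_min, time_max)
```

Under that assumption, 2018-11-01 and 2019-03-01 map to 1918248.0 and 1921128.0, the two values mentioned above.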

In any case, I will leave this issue open for now, since we might prevent this problem with better documentation or a more detailed description in the tutorial for the next release. The way time_min and time_max are set to 0. in the tutorial might be confusing. Under the hood, if time_min == time_max, the whole time range of the data is simply used. The value 0. might make it seem as if this were an index, i.e. as if time_min = 10. meant full_time[time_min:], whereas the values are actually compared against the time coordinate itself.
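
To illustrate the behaviour described above, here is a rough sketch in plain Python. It is not the actual pyunicorn code: the function name `select_time_window`, the inclusive bounds and the explicit error message are all assumptions made for illustration.

```python
def select_time_window(times, time_min=0.0, time_max=0.0):
    """Return indices of `times` inside the window (illustrative sketch only).

    `times` holds time coordinate values in the file's own units,
    not positions along the time axis.
    """
    # Equal bounds mean "no window": use the whole time range.
    if time_min == time_max:
        return list(range(len(times)))

    # ASSUMPTION: both bounds are inclusive.
    indices = [i for i, t in enumerate(times) if time_min <= t <= time_max]

    # An empty selection is what later surfaces as the zero-size array error;
    # failing early with a clearer message is one possible improvement.
    if not indices:
        raise ValueError(
            "No time steps between time_min and time_max; "
            "the bounds must be given in the units of the time coordinate."
        )
    return indices


# Hypothetical monthly time axis in hours since a reference date.
times = [1918248.0, 1918968.0, 1919712.0, 1920456.0, 1921128.0]
print(select_time_window(times))                        # whole range
print(select_time_window(times, 1918968.0, 1920456.0))  # three middle steps
```

Calling `select_time_window(times, 10.0, 50.0)` on this axis raises the ValueError, because 10. and 50. are treated as time values rather than as indices.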

Cheers, Max

rimajj commented 2 years ago

Dear @zugnachpankow, thank you for clarifying this. You are right, I was confused and somehow thought I could just enter an index into the time variable. But it makes total sense that I have to provide the time in the actual units used in the NetCDF file. It should be clear, but I agree that clarifying this briefly in the documentation would not hurt.