Closed: anolive closed this issue 4 years ago.
Sure, I know what's happening.
The sample data file covers 1991-2010. By default, the base period for the percentile calculations is 1961-1990, so you need to specify what your base period is for the percentile threshold. For example, to use whatever data you have in your input files, use the flag --base=1991-2010. However, 10 years is not a very good sample of the distribution of temperature; I recommend a period of 30 years. If your analysis period is not long enough, you can specify a separate set of files to calculate the percentile thresholds from, using some command line options.
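For intuition, the base-period selection amounts to something like the following numpy sketch (synthetic data; the variable names here are illustrative, not the module's actual code):

```python
import numpy as np

# Illustrative only: compute a 90th-percentile threshold from a chosen
# base period, then flag exceedances across the full record.
rng = np.random.default_rng(0)
years = np.repeat(np.arange(1991, 2011), 365)   # daily record, 1991-2010
tmax = 20 + 10 * rng.random(years.size)         # synthetic temperatures

base_start, base_end = 1991, 2010               # e.g. --base=1991-2010
in_base = (base_start <= years) & (years <= base_end)
threshold = np.percentile(tmax[in_base], 90)    # threshold from base period only

hot_days = tmax > threshold                     # exceedances in the full record
print(threshold, hot_days.mean())
```

With the default base period of 1961-1990, `in_base` would select nothing from a 1991-2010 file, which is exactly why the flag matters here.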
The documentation talks about a future simulation and a historical base period because that's what I was working on at the time, but the future simulation may not necessarily be model data or in the future.
--bpfx=FILE Indicates a future simulation, specifying a tmax file
containing the historical base period to be used
--bpfn=FILE Indicates a future simulation, specifying a tmin file
containing the historical base period to be used
Hi, Thank you for your so fast reply!
I used the flag and it worked just fine! I am actually not interested in 'climpact2.sampledata.gridded.1991-2010.nc'; I only used it to test the module because it is a faster file to run.
I am analyzing all of Europe using the EOBS dataset, and I have already computed the HW aspects with the R-based climpact2 (it took 12 days to compute them...), but I need daily outputs for additional research. Thanks again, I will run the module on EOBS and let you know how it goes!
Hi again, sorry to bother...
So I ran the mask script overnight (on Europe's EOBS dataset) and all went OK. I had to change the variable name to match my eobs.nc file (it's 'tx' instead of 'tmax'):
```python
# Load file
ncfile = nc.Dataset('climpact_europe_eobsv19.nc', 'r')
data = ncfile.variables['tx'][:]
```
Now, to run the EHF module, I also changed the variable names in ncio.py:
```python
def load_bp_data(options, timedata, variable='tx', mask=None):
    """load_bp_data loads the tmax or tmin data for the base period provided in options."""
    if variable == 'tx':
        if options.bpfx: files = options.bpfx
        else: files = options.tmaxfile
        varname = options.tmaxvname
    elif variable == 'tn':
        if options.bpfn: files = options.bpfn
        else: files = options.tminfile
        varname = options.tminvname
```
And also in ehfheatwaves.py:
```python
# Load the temperature data over the base period
if options.keeptave or options.keeptmax:
    tmax = ncio.load_bp_data(options, timedata, variable='tx', mask=mask)
if options.keeptave or options.keeptmin:
    tmin = ncio.load_bp_data(options, timedata, variable='tn', mask=mask)
if options.keeptave:
    tave_base = (tmax + tmin)/2.
```
Still, when I run the code, this is what I get:
```
python3 ehfheatwaves.py -x "climpact_europe_eobsv19.nc" -n "climpact_europe_eobsv19.nc" -m "mask.nc" --dailyonly
Traceback (most recent call last):
  File "ehfheatwaves.py", line 311, in <module>
    tmax = ncio.load_bp_data(options, timedata, variable='tx', mask=mask)
  File "/home/ana/Python1/ehfheatwaves-master/ncio.py", line 174, in load_bp_data
    temp = tempnc.variables[varname][(options.bpstart<=bpdates.year)&(bpdates.year<=options.bpend)]
KeyError: 'tmax'
```
I am not sure whether I made all the necessary changes, too few of them, or made them incorrectly.
The variable names, as read in Panoply, are:
```
short tx(time=25202, latitude=465, longitude=705);
  :standard_name = "air_temperature";
  :long_name = "maximum temperature";
short tn(time=25202, latitude=465, longitude=705);
  :standard_name = "air_temperature";
  :long_name = "minimum temperature";
```
Another issue is the units, EOBS is in Celsius instead of Kelvin.
Thanks again for your time. Ana
There shouldn't be a need to change the hard-coded default variable names; variable names can already be specified with command line options:
--vnamex=STR tmax variable name
--vnamen=STR tmin variable name
--vnamem=STR mask variable name
--vnamet=STR time variable name
Because the heatwave indices are all relative measures, the units of the input data shouldn't matter. The output would simply have the same units as the input. You just need to be aware of that when reading output metadata.
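A small sketch with synthetic data (not the module's code) shows why a constant unit offset such as Celsius-to-Kelvin leaves a percentile-based exceedance mask essentially unchanged:

```python
import numpy as np

# Percentile thresholds shift by the same constant as the data, so the
# set of days exceeding the threshold is the same in Celsius or Kelvin.
rng = np.random.default_rng(1)
t_celsius = 15 + 12 * rng.random(5000)   # synthetic daily temperatures, degC
t_kelvin = t_celsius + 273.15            # same data in Kelvin

exceed_c = t_celsius > np.percentile(t_celsius, 90)
exceed_k = t_kelvin > np.percentile(t_kelvin, 90)
print(exceed_c.sum(), exceed_k.sum())
```

Up to floating-point rounding at the threshold itself, the two masks pick out the same days, which is why relative indices are unit-agnostic.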
You're right, it's running. I will see how it goes and let you know.
Hi again. The script had been running for the past five days and finished today with the following message. I am not able to open the resulting netCDF file, although it is more than 40 GB in size! I will try other software to open and inspect the results. What do you think of the message?
```
ana@ana:~/Python1/ehfheatwaves-master$ python3 ehfheatwaves.py -x "climpact_europe_eobsv19.nc" -n "climpact_europe_eobsv19.nc" -m "mask.nc" --vnamex=tx --vnamen=tn --vnamet=time --dailyonly
/home/ana/.local/lib/python3.6/site-packages/pandas/compat/_optional.py:106: UserWarning: Pandas requires version '1.2.1' or newer of 'bottleneck' (version '1.2.0' currently installed).
  warnings.warn(msg, UserWarning)
Traceback (most recent call last):
  File "ehfheatwaves.py", line 439, in <module>
```
It's trying to write data to file but failing for some reason; without more information I can't say for sure. How large is your input file? Can you check that you have enough space to create large files on your hard disk? The daily output file will have 3 variables of the same size as the input file, so it will be larger than the input file by at least a third. You could also be running out of virtual/swap memory; how much do you have? You could try splitting your input file into smaller files and running ehfheatwaves.py on each of those (with the same base period). Be aware that the first 28 days of each output file would be missing.
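The splitting idea could be sketched like this (a hypothetical helper, not part of ehfheatwaves.py): overlap consecutive chunks by 28 days along the time axis, so the spin-up days missing from the start of each chunk's output are covered by the previous chunk.

```python
def chunk_ranges(n_days, chunk_len, overlap=28):
    """Yield (start, stop) index pairs that cover n_days time steps,
    with each chunk re-reading the last `overlap` days of the previous one."""
    ranges = []
    start = 0
    while start < n_days:
        stop = min(start + chunk_len, n_days)
        ranges.append((start, stop))
        if stop == n_days:
            break
        start = stop - overlap  # next chunk starts 28 days earlier
    return ranges

# e.g. split a 25202-day record into ~7000-day pieces
print(chunk_ranges(25202, 7000))
```

Each (start, stop) pair could then be extracted with a tool like `ncks -d time,start,stop-1` before running the module on the piece.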
The problem could also be a non-standard structure of the input data. Can you please show me the header of the input file (ncdump -h)?
Also, the mask creation script I provided is only intended as an example of how to create a mask. I don't know enough about your data to say whether or not it works. I would double-check that it is actually creating a land-sea mask at all, rather than simply 1s everywhere.
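A quick sanity check along those lines (a numpy sketch; it assumes the mask uses 1 for land and 0 for sea, and the array here is a stand-in for the real mask variable):

```python
import numpy as np

# Stand-in for the mask read from mask.nc; a real check would load it
# with netCDF4 and inspect the same statistic.
mask = np.zeros((465, 705))
mask[100:300, 200:500] = 1   # pretend this rectangle is land

land_fraction = mask.mean()
print(f"land fraction: {land_fraction:.2f}")
# A fraction of exactly 0 or 1 means the mask is degenerate and would
# either remove everything or mask nothing.
assert 0 < land_fraction < 1, "mask looks degenerate (all land or all sea)"
```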
Hi, thanks again for your help.
So, I checked the available space, and indeed that might have been the problem, since it was not much larger than the 50 GB of the input file. I am solving this to ensure 150 GB of free space.
Anyway, this is my free -m output:

```
Mem:   15971    1741   12684    119    1545   13792
Swap: 537871       0  537871
```
The second issue was the format, so this is the ncdump output:

```
ncdump -h climpact_europe_eobsv19.nc
netcdf climpact_europe_eobsv19 {
dimensions:
    time = UNLIMITED ; // (25202 currently)
    longitude = 705 ;
    latitude = 465 ;
variables:
    double time(time) ;
        time:standard_name = "time" ;
        time:long_name = "Time in days" ;
        time:units = "days since 1950-01-01 00:00" ;
        time:calendar = "standard" ;
        time:axis = "T" ;
    double longitude(longitude) ;
        longitude:standard_name = "longitude" ;
        longitude:long_name = "Longitude values" ;
        longitude:units = "degrees_east" ;
        longitude:axis = "X" ;
    double latitude(latitude) ;
        latitude:standard_name = "latitude" ;
        latitude:long_name = "Latitude values" ;
        latitude:units = "degrees_north" ;
        latitude:axis = "Y" ;
    short tx(time, latitude, longitude) ;
        tx:standard_name = "air_temperature" ;
        tx:long_name = "maximum temperature" ;
        tx:units = "Celsius" ;
        tx:add_offset = 0.f ;
        tx:scale_factor = 0.01f ;
        tx:_FillValue = -9999s ;
        tx:missing_value = -9999s ;
    short tn(time, latitude, longitude) ;
        tn:standard_name = "air_temperature" ;
        tn:long_name = "minimum temperature" ;
        tn:units = "Celsius" ;
        tn:add_offset = 0.f ;
        tn:scale_factor = 0.01f ;
        tn:_FillValue = -9999s ;
        tn:missing_value = -9999s ;
    short rr(time, latitude, longitude) ;
        rr:standard_name = "thickness_of_rainfall_amount" ;
        rr:long_name = "rainfall" ;
        rr:units = "kg m-2 d-1" ;
        rr:add_offset = 0.f ;
        rr:scale_factor = 0.1f ;
        rr:_FillValue = -9999s ;
        rr:missing_value = -9999s ;

// global attributes:
        :CDI = "Climate Data Interface version 1.9.6 (http://mpimet.mpg.de/cdi)" ;
        :history = "Tue May 07 17:33:12 2019: cdo merge tx_ens_mean_0.1deg_reg_v19.0e.nc tn_ens_mean_0.1deg_reg_v19.0e.nc rr_kg_grid.nc clim_europe_eobsv19.nc\nMon Feb 18 12:08:50 2019: ncks -O -d time,0,25201 /data4/Else/EOBSv19.0e/Grid_0.1deg/tx/tx_ensmean_master_untilJan2019.nc /data4/Else/EOBSv19.0e/Grid_0.1deg/tx/tx_ensmean_master.nc" ;
        :Conventions = "CF-1.4" ;
        :E-OBS_version = "19.0e" ;
        :References = "http://surfobs.climate.copernicus.eu/dataaccess/access_eobs.php" ;
        :NCO = "netCDF Operators version 4.7.5 (Homepage = http://nco.sf.net, Code = http://github.com/nco/nco)" ;
        :CDO = "Climate Data Operators version 1.9.6 (http://mpimet.mpg.de/cdo)" ;
}
```
Looks OK, right? Finally, the mask: I checked it in Panoply before running the ehfheatwaves script, and it was perfectly masking Europe's land surface versus the sea, so I don't think there is a problem there.
I also updated the bottleneck version to avoid that first warning.
So I am freeing up disk space and will re-run the script for the whole of Europe. Hopefully I will have good news in 5 days...
Thanks, Ana
Hi, I am a PhD student totally inexperienced with coding or python.
While searching for a method to calculate EHF values on a daily basis for netcdf files, I came across your module.
However, I have not been able to run it, as I receive the error message below after entering the following command (Linux 18.04, Python 3.6):
```
python3 ehfheatwaves.py -x "climpact2.sampledata.gridded.1991-2010.nc" -n "climpact2.sampledata.gridded.1991-2010.nc" -m "mask.nc" --dailyonly
Traceback (most recent call last):
  File "ehfheatwaves.py", line 311, in <module>
    tmax = ncio.load_bp_data(options, timedata, variable='tmax', mask=mask)
  File "/home/ana/Python1/ehfheatwaves-master/ncio.py", line 174, in load_bp_data
    temp = tempnc.variables[varname][(options.bpstart<=bpdates.year)&(bpdates.year<=options.bpend)]
  File "netCDF4/_netCDF4.pyx", line 4328, in netCDF4._netCDF4.Variable.__getitem__
  File "/usr/local/lib/python3.6/dist-packages/netCDF4/utils.py", line 467, in _out_array_shape
    c = count[..., i].ravel()[0] # All elements should be identical.
IndexError: index 0 is out of bounds for axis 0 with size 0
```
Do you have any insight on how to solve this, or where to begin? I hope it is an easy answer for you, but if it is too complex to reply to, thanks for your time anyway.
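(For later readers: the failure is the base-period mismatch explained in the answer at the top of this thread, and it can be reproduced conceptually with a small numpy sketch. With the module's default base period of 1961-1990 and a file spanning only 1991-2010, the time selection is empty, so the subsequent netCDF indexing fails.)

```python
import numpy as np

# Years of each daily time step in the sample file (1991-2010).
years = np.repeat(np.arange(1991, 2011), 365)
bpstart, bpend = 1961, 1990   # module defaults when --base is not given

# Same boolean selection as in ncio.py line 174; it matches no time steps.
selection = (bpstart <= years) & (years <= bpend)
print(selection.sum())  # 0 selected days -> empty index, hence the IndexError
```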