Closed michaelradhuber closed 8 years ago
Your suspicions could be correct! Sorry you are having trouble.
To start, does your script work for a single geometry? I'm not sure what your shapefile schema looks like, but could you try adding a geometry selection to the operations call?:
ops = OcgOperations(..., geom_select_uid=[<integer identifier>],
geom_uid=<string name of shapefile unique identifier>)
I would also be curious if an intersects operation returns the same error:
ops = OcgOperations(..., spatial_operation='intersects', aggregate=False)
While we're at it, could you send the metadata of one of your target netCDF files?
RequestDataset(uri, var).inspect()
Thanks!
Hi Ben! Thanks for the fast answer. I will check into 1) and 2) right away, meanwhile here are the metadata (I rewrote the NETCDF myself because the official one was not CF compliant - but I kept the globa attributes in there as they came, so the metadata became a bit longer).
Thanks again as I am completely stuck with this error and just don't see through it at all...
URI = snowdepth1999.nc
VARIABLE = snow
=== Temporal =============
Name = time Count = 365 Has Bounds = False Data Type = float64 Start Date = 1999-01-01 00:00:00 End Date = 1999-12-31 00:00:00 Calendar = 365_day Units = days since 1970-12-31 00:00:00 Resolution (Days) = 1
=== Spatial ==============
Spatial Reference = CFLambertConformal Proj4 String = +proj=lcc +lat_1=49 +lat_2=46 +lat_0=48 +lon_0=13.33 +x_0=400000 +y_0=400000 +ellps=WGS84 +units=m +no_defs Extent = (100450.0, 194912.0, 701450.0, 545912.0) Geometry Interface = SpatialGeometryPointDimension Resolution = 1002.26190476 Count = 210951
=== Level ================
No level dimension.
=== Metadata Dump ========
dimensions: time = ISUNLIMITED ; // 365 currently lon = 601 ; lat = 351 ;
variables: float64 time(time) ; time:units = "days since 1970-12-31 00:00:00" ; time:calendar = "365_day" ; float64 lon(lon) ; lon:units = "m" ; lon:long_name = "Longitude coordinates" ; lon:standard_name = "projection_x_coordinate" ; float64 lat(lat) ; lat:units = "m" ; lat:long_name = "Latitute coordinates" ; lat:standard_name = "projection_y_coordinate" ; float64 snow(time, lat, lon) ; snow:units = "cm" ; snow:missing_value = "9.96920996839e+36" ; snow:long_name = "Daily snow depth" ; snow:coordinates = "lat lon" ; snow:grid_mapping = "Lambert_Conformal" ; int32 Lambert_Conformal() ; Lambert_Conformal:grid_mapping_name = "lambert_conformal_conic" ; Lambert_Conformal:linear_unit = "Meters" ; Lambert_Conformal:false_easting = "400000" ; Lambert_Conformal:false_northing = "400000" ; Lambert_Conformal:longitude_of_central_meridian = "13.33" ; Lambert_Conformal:standard_parallel = "[49 46]" ; Lambert_Conformal:latitude_of_projection_origin = "48" ;
// global attributes: :title = STARTCLIM Gridmode daily snow depths, 1971-2006 ; :source = STARTCLIM station data ; :description = Date-reconverted dataset from ZAMG snow-depth data ; :history = Created datetime.datetime(2016, 1, 8, 19, 57, 46, 270059) ; :author = Michael Radhuber ; :institution = Johannes Kepler University, Dept. of Economics ; :contact = michael.radhuber@jku.at ; :references = www.econ.jku.at/radhuber ; :projection = Clarke 1866 Lambert Conformal Conic ; :projection_parameters_Linear_Unit = Meters ; :projection_parameters_False_Easting = 400000 ; :projection_parameters_False_Northing = 400000 ; :projectwikiion_parameters_Central_Meridian = 13.33 ; :projection_parameters_Standard_Parallel_1 = 46 ; :projection_parameters_Standard_Parallel_2 = 49 ; :projection_parameters_Latitude_of_Origin = 48 ; :Conventions = CF-1.0 ; :conventionsURL = http://www.unidata.ucar.edu/packages/netcdf/conventions.html ; :spatial_ref = PROJCS["MGI / Austria Lambert", GEOGCS["MGI", DATUM["Militar_Geographische_Institute", SPHEROID["Bessel 1841",6377397.155,299.1528128, AUTHORITY["EPSG","7004"]], TOWGS84[577.326,90.129,463.919,5.137,1.474,5.297,2.4232], AUTHORITY["EPSG","6312"]], PRIMEM["Greenwich",0, AUTHORITY["EPSG","8901"]], UNIT["degree",0.0174532925199433, AUTHORITY["EPSG","9108"]], AUTHORITY["EPSG","4312"]], UNIT["metre",1, AUTHORITY["EPSG","9001"]], PROJECTION["Lambert_Conformal_Conic_2SP"], PARAMETER["standard_parallel_1",49], PARAMETER["standard_parallel_2",46], PARAMETER["latitude_of_origin",47.5], PARAMETER["central_meridian",13.33333333333333], PARAMETER["false_easting",400000], PARAMETER["false_northing",400000], AUTHORITY["EPSG","31287"], AXIS["Y",EAST], AXIS["X",NORTH]] ;
print('returning numpy for a municipality...')
ops = OcgOperations(dataset=rdc, spatial_operation='clip', aggregate=True, snippet=SNIPPET, geom='/media/root/Elements/data/snow/shapefiles/gem2015/Gem_AT_2015.shp',
geom_select_uid=[61428], geom_uid='GEM_NR')
ret = ops.execute()
spatial_operation='intersects'
on all municipalities reproduced the same error. Okay, good, inspection output looks :+1:. Your CF formatting also looks good. I noticed "axis" attributes are missing from the coordinate variables. Ideally, coordinate variables would have:
float64 lat(lat):
lat:axis = "Y";
...
float64 lat(lon):
lat:axis = "X";
...
I don't think this is causing the issue, but I at least wanted to get it out there :smile:. Sorry to nitpick...
I suspect we may have trouble debugging this through the issue tracker. Is your shapefile a reasonable size? If so, it may make sense to mail the shapefile and a snippet of a netCDF to ocgis_support@list.woc.noaa.gov. You can get a netCDF snippet with:
rd = ocgis.RequestDataset(uri, variable)
# First time slice.
ops = ocgis.OcgOperations(dataset=rd, snippet=True, output_format='nc')
# The first 10 time slices would probably be okay too.
ops = ocgis.OcgOperations(dataset=rd, slice=[None, [0, 10], None, None, None], output_format='nc')
ret = ops.execute()
You can also point me towards download URLs if that works better!
Shapefiles: Find them above in the edits of the first post I added a little later on. netCDF snippet: attached ocgis_output.zip
Thanks for the axis metadata, I already added them :+1:
Thanks! I missed the edits. Unfortunately, I am not able to download the shapefile:
I'll try again later. I sent the webmaster a quick message. I was able to get the netCDF data though! I had totally forgotten about issue attachments.
From your experimentation, I think there is a municipality boundary that does not overlap any of the climate data "points" in the target netCDF. If you are okay with bounds interpolation/extrapolation to create cells/polygons, you could try setting interpolate_spatial_bounds=True
(http://ncpp.github.io/ocgis/api.html#interpolate-spatial-bounds).
Regardless, these empty subsets are supposed to raise an EmptySubsetError
so I think you hit upon a bug. I'd like to take a closer look at the shapefile to make sure and also to validate that coordinate system transformations are working as expected. Getting closer!
Sorry about that, try again to download the shapefiles directly here:
Got it! Thanks. Will be in touch.
This ran without errors only I am unable to find the output it should have produced....? Shouldn't it produce another variable or dictionary object?
print('returning numpy...')
ops = OcgOperations(dataset=rdc, spatial_operation='clip', aggregate=True, snippet=SNIPPET, geom='/media/root/Elements/data/snow/shapefiles/gem2015/Gem_AT_2015.shp', interpolate_spatial_bounds=True)
ret = ops.execute()
Yes, the variable ret
should be a dictionary-like SpatialCollection
object. Is ret
empty?
Actually a 'ret' dictionary object is produced (it doesn't show up in the variable explorer, so I didn't see it at once). But I can't interpret its contents. It contains the variable name, the date, and another object field that I don't understand. Shouldn't that third field be some kind of array containing a new municipality ID (from the shapefile) for every data point of snow given lat, lon and time? Why isn't it printing correctly? Or am I getting it wrong and the array is in there, but masked? (Then of course, how do I make use of it??)
print('returning numpy...')
ops = OcgOperations(dataset=rdc, spatial_operation='clip', aggregate=True, snippet=SNIPPET, geom='/media/root/Elements/data/snow/shapefiles/gem2015/Gem_AT_2015.shp', interpolate_spatial_bounds=True)
ret = ops.execute()
returning numpy...
Then if I do print(ret)
:
print(ret)
SpatialCollection([(1, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa270894110>)])), (2, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa290c92c50>)])), (3, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa261579a90>)])), (4, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa2615bef90>)])), (5, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa2615c4490>)])), (6, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa2709c7b10>)])), (7, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa2709a5e50>)])), (8, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa2710b48d0>)])), (9, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa2710b5810>)])), (10, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa2709b7d50>)])), (11, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa261579d90>)])),
...
or prettier:
for x in ret:
print (x)
for y in ret[x]:
print (y,':',ret[x][y])
...
2080
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261bb90d0>)
2081
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261cf7350>)
2082
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261bd3a50>)
2083
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261b6f6d0>)
2084
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261b86110>)
2085
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261b3ec10>)
2086
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261b51990>)
2087
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261a9cd90>)
2088
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261ac5750>)
2089
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261a247d0>)
2090
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261ca9cd0>)
2091
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261a289d0>)
2092
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa2619ea9d0>)
2093
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa2619a0cd0>)
2094
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa2619c8890>)
2095
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa26196a990>)
2096
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261982f50>)
2097
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261924690>)
2098
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa26194c9d0>)
2099
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa2619ad290>)
2100
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa2618dc890>)
2101
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261905090>)
Stumbling on the next issue when trying to output as a file - either nc or csv or shp-csv (always the same error):
#Return NETCDF File
print('returning netcdf...')
ops = OcgOperations(dataset=rdc, spatial_operation='intersects', aggregate=False, snippet=False, geom='/media/root/Elements/data/snow/shapefiles/gem2015/Gem_AT_2015.shp',
geom_uid='GEM_NR', interpolate_spatial_bounds=True, output_format='nc', geom_select_uid=[31848, 61428, 62014])
ret = ops.execute()
Traceback
Traceback (most recent call last):
File "<ipython-input-38-13c5386fd3f9>", line 4, in <module>
ret = ops.execute()
File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/operations.py", line 319, in execute
return interp.execute()
File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/interpreter.py", line 136, in execute
shutil.rmtree(outdir)
File "/usr/anaconda27/lib/python2.7/shutil.py", line 256, in rmtree
onerror(os.rmdir, path, sys.exc_info())
File "/usr/anaconda27/lib/python2.7/shutil.py", line 254, in rmtree
os.rmdir(path)
OSError: [Errno 39] Directory not empty: '/media/root/Elements/data/snow/output/ocgis_output'
Same error if I do not set geom_select_uid
The output directory is obviously empty. Setting env.OVERWRITE
does not change a thing.
Thanks again,
ignore_Errors=True
in shutil.py:210 yields this error:
``` html
Traceback (most recent call last):
File "interpolate_spatial_bounds
attrib. And with that we are back at the beginning...
interpolate_spatial_bounds=True
I was able to successfully create a csv (SNIPPET only) right now. Hi Ben, thanks for putting this on the to do list. One further question: Am am running this routine with another shapefile too, and interestingly, fiona crashes when opening the shapefile with an Indexerror.
I have read about this isssue in other posts of yours
Thanks!
A few things I've discovered before moving on to your other questions:
desired_crs_proj4 = '+proj=lcc +lat_1=49 +lat_2=46 +lat_0=47.5 +lon_0=13.33333333333333 +x_0=400000 +y_0=400000 +ellps=bessel +towgs84=577.326,90.129,463.919,5.137,1.474,5.297,2.4232 +units=m +no_defs'
desired_crs = CoordinateReferenceSystem(proj4=desired_crs_proj4)
rd = RequestDataset(uri, crs=desired_crs)
Moving on...
The returned SpatialCollection
/dict
object is lacking documentation. I am sorry about this. A quick tour:
# Keys in the collection are geometry unique identifiers.
ugeom = sc.ugeom[20] # A Shapely geometry.
# You can also access the properties (shapefile attributes) associated with the
# geometry.
properties = sc.properties[20]
# To access the subsetted data:
fields = sc[20]
# Fields are named after the request dataset that created them. Typically this is the
# variable name unless it is provided when creating the request dataset. In your
# case it is "snow".
field = fields['snow']
# Fields have variables.
variable = field.variables['snow']
variable_value = variable.value # A masked NumPy array.
# Fields also contain dimensional information.
field.temporal.value_datetime
field.temporal.value
Let me know if this is not enough to keep you going. I will tag you as wanting better documentation for the low-level interfaces. I want it implemented too! I'm really not sure what's going on with the printing issue. You are able to access the data though, correct?
If possible, can you update your "ocgis" installation to use the "next" branch? (It is stable and tested regularly). I will test your issues with output formats (i.e. CSV) to see if I can reproduce them.
I have read about this isssue in other posts of yours Toblerity/Fiona#255, and since I havel GDAL 2.0 installed as well, I might try to resettle for an older version of GDAL and fiona. Which versions can you recommend to me where GDAL and FIONA cooperate well?
There are a couple solutions. I hear the fiona
master branch has GDAL 2.0 support: https://github.com/Toblerity/Fiona/issues/259#issuecomment-170603832.
If you are using Anaconda, the IOOS channel has good builds for OCGIS-like dependencies: https://anaconda.org/ioos.
Personally, I have been using Fiona with GDAL 1.11.
Did I cover everything for the moment? Let me know if I missed any of your questions!
Thanks, that will keep me going for the moment. Big thanks for your great support! :+1: /M
A couple quick questions if you don't mind. I am noticing some inconsistencies in the handling of your coordinate system during coordinate system updates. I am working on tracking those down. Everything seems good at the ocgis API level, so I am looking into fiona
and osgeo.osr
.
I'll provide you updates as I learn more...
Some other updates/thoughts:
shutil
errors are worrisome, but I have never seen them before in the context of this software. I'm guessing the overwrite issues disappear if you remove the output directories manually.I have had these issues too in QGIS, that the shapefile and netcdf don't match. But they should. QGIS reads the shapefile just fine, but has always had troubles with reading the NETcdf file.
The problem is not that these two don't overlap in QGIS, it is the netcdf being moved a little down from its original projection in QGIS. I do not know why, as to my understanding, the coordinates are set up correctly. I think the problem is QGIS related, but I need to check once more. The netcdf displayed better in Panoply. I have to find a "Europe" overlay for panoply to see, if it firts the country boundaries well there.
I would definitely welcome a "groups" variable in the netcdf output corresponding to the different subsets created with help of the shapefile - however it is no must since OCGIS so far is able to produce everything I need by itself and I do not need to run over the netcdf files by myself anymore.
Btw, is your fiona able to read in that shapefile?? Attached: ski.zip
shpfile = '...'
c = fiona.open(shpfile, 'r')
next(c)
The problem does definitely seem related to QGIS and its netcdf capabilities. I suspect that qgis has problems with the cf conventions for projections. See above the netcdf I have sent you in panoply with an overlay of eu boundaries (shapefile from eurostat, etrs89)
Btw, is your fiona able to read in that shapefile??
Yes, it reads okay. I am using fiona
v1.6.3 from that IOOS Anaconda channel.
$ gdal-config --version
1.11.3
$ python -c "import fiona; print fiona.__version__"
1.6.3
The problem does definitely seem related to QGIS and its netcdf capabilities. I suspect that qgis has problems with the cf conventions for projections. See above the netcdf Is ent you in panoply with an overlay of eu boundaries (shapefile from eurostat, etrs89)
Cool. Thanks for the image. I agree it must be something with QGIS. I will let you know if I learn anything interesting.
Yes, it reads okay. I am using fiona v1.6.3 from that IOOS Anaconda channel.
I changed to your fiona and gdal versions and it would read in fine now. Before it failed with the known index error. This is once more confirmation of the issues between fiona > 1.5 and gdal 2.0.
Fixed: https://github.com/NCPP/ocgis/commit/e99e05dc969522c976cd2b82aa78eb086e25f18b
@michaelradhuber I'm going to close this. Let me know if anything else comes up!
I am trying to subset a NETCDF4 dataset with daily snow cover data by means of an ESRI Shapefile with municipality polygons for Austria. CRS-projections all set and both match well now.
I am following manual here from this link http://ncpp.github.io/ocgis/examples.html#advanced-subsetting and when running this code
print('returning numpy...') ops = OcgOperations(dataset=rdc, spatial_operation='clip', aggregate=True, snippet=SNIPPET, geom='/media/root/Elements/data/snow/shapefiles/gem2015/Gem_AT_2015.shp') ret = ops.execute()
...this error is raised:
ValueError: zero-size array to reduction operation maximum which has no identity
I thought it had to do with NaN values in the netcdf file, however when I changed those values to '0' the error still persisted. I am rather stuck here at the moment and would highly appreciate any help.
Here is the traceback
File "", line 43, in
ret = ops.execute()
File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/operations.py", line 319, in execute return interp.execute()
File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/interpreter.py", line 125, in execute ret = conv.write()
File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/conv/numpy_.py", line 14, in write for coll in self:
File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/conv/numpy_.py", line 9, in iter for coll in self.colls:
File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/subset.py", line 76, in iter for coll in self._itercollections():
File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/subset.py", line 128, in _itercollections for coll in self._processsubsettables(rds):
File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/subset.py", line 274, in _processsubsettables for coll in self._processgeometries(itr, field, headers, value_keys, alias):
File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/subset.py", line 676, in _processgeometries sfield = sfield.get_spatially_aggregated(new_spatial_uid=subset_ugid)
File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/interface/base/field.py", line 364, in get_spatially_aggregated r_value = variable.value
File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/interface/base/variable.py", line 90, in value self._value = self._format_privatevalue(self._getvalue())
File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/interface/base/variable.py", line 323, in _getvalue self._set_value_fromsource()
File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/interface/base/variable.py", line 328, in _set_value_fromsource self._value = self._field._get_value_fromsource(self._data, self.name)
File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/interface/nc/field.py", line 45, in _get_value_fromsource raw = ds.variables[variable_name].getitem(slc)
File "netCDF4.pyx", line 3042, in netCDF4.Variable.getitem (netCDF4.c:41837)
File "/usr/anaconda27/lib/python2.7/site-packages/netCDF4_utils.py", line 205, in _StartCountStride if ea.max()+1 > elen:
File "/usr/anaconda27/lib/python2.7/site-packages/numpy/core/_methods.py", line 26, in _amax return umr_maximum(a, axis, None, out, keepdims)
ValueError: zero-size array to reduction operation maximum which has no identity
DEBUG:
(Pdb) rdc = RequestDatasetCollection([RequestDataset(os.path.join(DATA_DIR, uri), var) for uri, var in NCS.iteritems()]) (Pdb) print('returning numpy...') returning numpy... (Pdb) ops = OcgOperations(dataset=rdc, spatial_operation='clip', aggregate=True, snippet=SNIPPET, geom='/media/root/Elements/data/snow/shapefiles/gem2015/Gem_AT_2015.shp') (Pdb) ret = ops.execute() *\ ValueError: zero-size array to reduction operation maximum which has no identity (Pdb)
I am not sure where the error is but I very much suspect a bug out here someway. Any answer as to how to prevent it would be highly appreciated too!
Thanks,
Michael