NCPP / ocgis

OpenClimateGIS is a set of geoprocessing and calculation tools for CF-compliant climate datasets.
Other
70 stars 21 forks source link

Possible BUG - ValueError: zero-size array to reduction operation maximum which has no identity #405

Closed michaelradhuber closed 8 years ago

michaelradhuber commented 8 years ago

I am trying to subset a NETCDF4 dataset with daily snow cover data by means of an ESRI Shapefile with municipality polygons for Austria. CRS-projections all set and both match well now.

I am following manual here from this link http://ncpp.github.io/ocgis/examples.html#advanced-subsetting and when running this code

print('returning numpy...') ops = OcgOperations(dataset=rdc, spatial_operation='clip', aggregate=True, snippet=SNIPPET, geom='/media/root/Elements/data/snow/shapefiles/gem2015/Gem_AT_2015.shp') ret = ops.execute()

...this error is raised:

ValueError: zero-size array to reduction operation maximum which has no identity

I thought it had to do with NaN values in the netcdf file, however when I changed those values to '0' the error still persisted. I am rather stuck here at the moment and would highly appreciate any help.

Here is the traceback

File "", line 43, in ret = ops.execute()

File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/operations.py", line 319, in execute return interp.execute()

File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/interpreter.py", line 125, in execute ret = conv.write()

File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/conv/numpy_.py", line 14, in write for coll in self:

File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/conv/numpy_.py", line 9, in iter for coll in self.colls:

File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/subset.py", line 76, in iter for coll in self._itercollections():

File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/subset.py", line 128, in _itercollections for coll in self._processsubsettables(rds):

File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/subset.py", line 274, in _processsubsettables for coll in self._processgeometries(itr, field, headers, value_keys, alias):

File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/subset.py", line 676, in _processgeometries sfield = sfield.get_spatially_aggregated(new_spatial_uid=subset_ugid)

File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/interface/base/field.py", line 364, in get_spatially_aggregated r_value = variable.value

File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/interface/base/variable.py", line 90, in value self._value = self._format_privatevalue(self._getvalue())

File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/interface/base/variable.py", line 323, in _getvalue self._set_value_fromsource()

File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/interface/base/variable.py", line 328, in _set_value_fromsource self._value = self._field._get_value_fromsource(self._data, self.name)

File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/interface/nc/field.py", line 45, in _get_value_fromsource raw = ds.variables[variable_name].getitem(slc)

File "netCDF4.pyx", line 3042, in netCDF4.Variable.getitem (netCDF4.c:41837)

File "/usr/anaconda27/lib/python2.7/site-packages/netCDF4_utils.py", line 205, in _StartCountStride if ea.max()+1 > elen:

File "/usr/anaconda27/lib/python2.7/site-packages/numpy/core/_methods.py", line 26, in _amax return umr_maximum(a, axis, None, out, keepdims)

ValueError: zero-size array to reduction operation maximum which has no identity

DEBUG: (Pdb) rdc = RequestDatasetCollection([RequestDataset(os.path.join(DATA_DIR, uri), var) for uri, var in NCS.iteritems()]) (Pdb) print('returning numpy...') returning numpy... (Pdb) ops = OcgOperations(dataset=rdc, spatial_operation='clip', aggregate=True, snippet=SNIPPET, geom='/media/root/Elements/data/snow/shapefiles/gem2015/Gem_AT_2015.shp') (Pdb) ret = ops.execute() *\ ValueError: zero-size array to reduction operation maximum which has no identity (Pdb)

I am not sure where the error is but I very much suspect a bug out here someway. Any answer as to how to prevent it would be highly appreciated too!

Thanks,

Michael

bekozi commented 8 years ago

Your suspicions could be correct! Sorry you are having trouble.

To start, does your script work for a single geometry? I'm not sure what your shapefile schema looks like, but could you try adding a geometry selection to the operations call?:

ops = OcgOperations(..., geom_select_uid=[<integer identifier>], 
                         geom_uid=<string name of shapefile unique identifier>)

I would also be curious if an intersects operation returns the same error:

ops = OcgOperations(..., spatial_operation='intersects', aggregate=False)

While we're at it, could you send the metadata of one of your target netCDF files?

RequestDataset(uri, var).inspect()

Thanks!

michaelradhuber commented 8 years ago

Hi Ben! Thanks for the fast answer. I will check into 1) and 2) right away, meanwhile here are the metadata (I rewrote the NETCDF myself because the official one was not CF compliant - but I kept the globa attributes in there as they came, so the metadata became a bit longer).

Thanks again as I am completely stuck with this error and just don't see through it at all...

URI = snowdepth1999.nc VARIABLE = snow

=== Temporal =============

Name = time Count = 365 Has Bounds = False Data Type = float64 Start Date = 1999-01-01 00:00:00 End Date = 1999-12-31 00:00:00 Calendar = 365_day Units = days since 1970-12-31 00:00:00 Resolution (Days) = 1

=== Spatial ==============

Spatial Reference = CFLambertConformal Proj4 String = +proj=lcc +lat_1=49 +lat_2=46 +lat_0=48 +lon_0=13.33 +x_0=400000 +y_0=400000 +ellps=WGS84 +units=m +no_defs Extent = (100450.0, 194912.0, 701450.0, 545912.0) Geometry Interface = SpatialGeometryPointDimension Resolution = 1002.26190476 Count = 210951

=== Level ================

No level dimension.

=== Metadata Dump ========

dimensions: time = ISUNLIMITED ; // 365 currently lon = 601 ; lat = 351 ;

variables: float64 time(time) ; time:units = "days since 1970-12-31 00:00:00" ; time:calendar = "365_day" ; float64 lon(lon) ; lon:units = "m" ; lon:long_name = "Longitude coordinates" ; lon:standard_name = "projection_x_coordinate" ; float64 lat(lat) ; lat:units = "m" ; lat:long_name = "Latitute coordinates" ; lat:standard_name = "projection_y_coordinate" ; float64 snow(time, lat, lon) ; snow:units = "cm" ; snow:missing_value = "9.96920996839e+36" ; snow:long_name = "Daily snow depth" ; snow:coordinates = "lat lon" ; snow:grid_mapping = "Lambert_Conformal" ; int32 Lambert_Conformal() ; Lambert_Conformal:grid_mapping_name = "lambert_conformal_conic" ; Lambert_Conformal:linear_unit = "Meters" ; Lambert_Conformal:false_easting = "400000" ; Lambert_Conformal:false_northing = "400000" ; Lambert_Conformal:longitude_of_central_meridian = "13.33" ; Lambert_Conformal:standard_parallel = "[49 46]" ; Lambert_Conformal:latitude_of_projection_origin = "48" ;

// global attributes: :title = STARTCLIM Gridmode daily snow depths, 1971-2006 ; :source = STARTCLIM station data ; :description = Date-reconverted dataset from ZAMG snow-depth data ; :history = Created datetime.datetime(2016, 1, 8, 19, 57, 46, 270059) ; :author = Michael Radhuber ; :institution = Johannes Kepler University, Dept. of Economics ; :contact = michael.radhuber@jku.at ; :references = www.econ.jku.at/radhuber ; :projection = Clarke 1866 Lambert Conformal Conic ; :projection_parameters_Linear_Unit = Meters ; :projection_parameters_False_Easting = 400000 ; :projection_parameters_False_Northing = 400000 ; :projectwikiion_parameters_Central_Meridian = 13.33 ; :projection_parameters_Standard_Parallel_1 = 46 ; :projection_parameters_Standard_Parallel_2 = 49 ; :projection_parameters_Latitude_of_Origin = 48 ; :Conventions = CF-1.0 ; :conventionsURL = http://www.unidata.ucar.edu/packages/netcdf/conventions.html ; :spatial_ref = PROJCS["MGI / Austria Lambert", GEOGCS["MGI", DATUM["Militar_Geographische_Institute", SPHEROID["Bessel 1841",6377397.155,299.1528128, AUTHORITY["EPSG","7004"]], TOWGS84[577.326,90.129,463.919,5.137,1.474,5.297,2.4232], AUTHORITY["EPSG","6312"]], PRIMEM["Greenwich",0, AUTHORITY["EPSG","8901"]], UNIT["degree",0.0174532925199433, AUTHORITY["EPSG","9108"]], AUTHORITY["EPSG","4312"]], UNIT["metre",1, AUTHORITY["EPSG","9001"]], PROJECTION["Lambert_Conformal_Conic_2SP"], PARAMETER["standard_parallel_1",49], PARAMETER["standard_parallel_2",46], PARAMETER["latitude_of_origin",47.5], PARAMETER["central_meridian",13.33333333333333], PARAMETER["false_easting",400000], PARAMETER["false_northing",400000], AUTHORITY["EPSG","31287"], AXIS["Y",EAST], AXIS["X",NORTH]] ;

Edit #1:
Here are the shapefiles (scroll down and click the blue button to DL https://www.data.gv.at/katalog/dataset/c33d36b0-f184-4f2a-89cc-839ca7fcf88a
Edit #2:
Surprise, surprise..., this one worked without errors or warnings: print('returning numpy for a municipality...') ops = OcgOperations(dataset=rdc, spatial_operation='clip', aggregate=True, snippet=SNIPPET, geom='/media/root/Elements/data/snow/shapefiles/gem2015/Gem_AT_2015.shp', geom_select_uid=[61428], geom_uid='GEM_NR') ret = ops.execute()
Edit #3:
Unfortunately the variant with spatial_operation='intersects' on all municipalities reproduced the same error.
bekozi commented 8 years ago

Okay, good, inspection output looks :+1:. Your CF formatting also looks good. I noticed "axis" attributes are missing from the coordinate variables. Ideally, coordinate variables would have:

float64 lat(lat):
 lat:axis = "Y";
 ...

float64 lat(lon):
 lat:axis = "X";
 ...

I don't think this is causing the issue, but I at least wanted to get it out there :smile:. Sorry to nitpick...

I suspect we may have trouble debugging this through the issue tracker. Is your shapefile a reasonable size? If so, it may make sense to mail the shapefile and a snippet of a netCDF to ocgis_support@list.woc.noaa.gov. You can get a netCDF snippet with:

rd = ocgis.RequestDataset(uri, variable)
# First time slice.
ops = ocgis.OcgOperations(dataset=rd, snippet=True, output_format='nc')
# The first 10 time slices would probably be okay too.
ops = ocgis.OcgOperations(dataset=rd, slice=[None, [0, 10], None, None, None], output_format='nc')
ret = ops.execute()

You can also point me towards download URLs if that works better!

michaelradhuber commented 8 years ago

Shapefiles: Find them above in the edits of the first post I added a little later on. netCDF snippet: attached ocgis_output.zip

Thanks for the axis metadata, I already added them :+1:

bekozi commented 8 years ago

Thanks! I missed the edits. Unfortunately, I am not able to download the shapefile:

screenshot from 2016-01-08 14 24 36 I'll try again later. I sent the webmaster a quick message. I was able to get the netCDF data though! I had totally forgotten about issue attachments.

From your experimentation, I think there is a municipality boundary that does not overlap any of the climate data "points" in the target netCDF. If you are okay with bounds interpolation/extrapolation to create cells/polygons, you could try setting interpolate_spatial_bounds=True (http://ncpp.github.io/ocgis/api.html#interpolate-spatial-bounds).

Regardless, these empty subsets are supposed to raise an EmptySubsetError so I think you hit upon a bug. I'd like to take a closer look at the shapefile to make sure and also to validate that coordinate system transformations are working as expected. Getting closer!

michaelradhuber commented 8 years ago

Sorry about that, try again to download the shapefiles directly here: http://ckan.data.ktn.gv.at/storage/f/2015-06-30T08%3A06%3A45.158Z/gem-gr-at-15-18.zip I'll upload them here too.

bekozi commented 8 years ago

Got it! Thanks. Will be in touch.

michaelradhuber commented 8 years ago

This ran without errors only I am unable to find the output it should have produced....? Shouldn't it produce another variable or dictionary object?

print('returning numpy...') ops = OcgOperations(dataset=rdc, spatial_operation='clip', aggregate=True, snippet=SNIPPET, geom='/media/root/Elements/data/snow/shapefiles/gem2015/Gem_AT_2015.shp', interpolate_spatial_bounds=True) ret = ops.execute()

bekozi commented 8 years ago

Yes, the variable ret should be a dictionary-like SpatialCollection object. Is ret empty?

michaelradhuber commented 8 years ago

Actually a 'ret' dictionary object is produced (it doesn't show up in the variable explorer, so I didn't see it at once). But I can't interpret its contents. It contains the variable name, the date, and another object field that I don't understand. Shouldn't that third field be some kind of array containing a new municipality ID (from the shapefile) for every data point of snow given lat, lon and time? Why isn't it printing correctly? Or am I getting it wrong and the array is in there, but masked? (Then of course, how do I make use of it??)

print('returning numpy...') ops = OcgOperations(dataset=rdc, spatial_operation='clip', aggregate=True, snippet=SNIPPET, geom='/media/root/Elements/data/snow/shapefiles/gem2015/Gem_AT_2015.shp', interpolate_spatial_bounds=True) ret = ops.execute() returning numpy...

Then if I do print(ret):

print(ret)
SpatialCollection([(1, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa270894110>)])), (2, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa290c92c50>)])), (3, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa261579a90>)])), (4, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa2615bef90>)])), (5, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa2615c4490>)])), (6, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa2709c7b10>)])), (7, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa2709a5e50>)])), (8, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa2710b48d0>)])), (9, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa2710b5810>)])), (10, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa2709b7d50>)])), (11, OrderedDict([('snow', <ocgis.interface.nc.field.NcField object at 0x7fa261579d90>)])),
...

or prettier:

for x in ret: print (x) for y in ret[x]: print (y,':',ret[x][y])

...
2080
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261bb90d0>)
2081
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261cf7350>)
2082
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261bd3a50>)
2083
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261b6f6d0>)
2084
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261b86110>)
2085
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261b3ec10>)
2086
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261b51990>)
2087
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261a9cd90>)
2088
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261ac5750>)
2089
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261a247d0>)
2090
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261ca9cd0>)
2091
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261a289d0>)
2092
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa2619ea9d0>)
2093
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa2619a0cd0>)
2094
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa2619c8890>)
2095
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa26196a990>)
2096
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261982f50>)
2097
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261924690>)
2098
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa26194c9d0>)
2099
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa2619ad290>)
2100
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa2618dc890>)
2101
('snow', ':', <ocgis.interface.nc.field.NcField object at 0x7fa261905090>)
michaelradhuber commented 8 years ago

Stumbling on the next issue when trying to output as a file - either nc or csv or shp-csv (always the same error):

#Return NETCDF File
print('returning netcdf...')
ops = OcgOperations(dataset=rdc, spatial_operation='intersects', aggregate=False, snippet=False, geom='/media/root/Elements/data/snow/shapefiles/gem2015/Gem_AT_2015.shp', 
                    geom_uid='GEM_NR', interpolate_spatial_bounds=True, output_format='nc', geom_select_uid=[31848, 61428, 62014])
ret = ops.execute()

Traceback

Traceback (most recent call last):

  File "<ipython-input-38-13c5386fd3f9>", line 4, in <module>
    ret = ops.execute()

  File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/operations.py", line 319, in execute
    return interp.execute()

  File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/interpreter.py", line 136, in execute
    shutil.rmtree(outdir)

  File "/usr/anaconda27/lib/python2.7/shutil.py", line 256, in rmtree
    onerror(os.rmdir, path, sys.exc_info())

  File "/usr/anaconda27/lib/python2.7/shutil.py", line 254, in rmtree
    os.rmdir(path)

OSError: [Errno 39] Directory not empty: '/media/root/Elements/data/snow/output/ocgis_output'

Same error if I do not set geom_select_uid The output directory is obviously empty. Setting env.OVERWRITE does not change a thing.

Thanks again,

Edit:
This error seems to appear right before the operations finishes. I can watch the csv file build up successfully, but at the end it crashes with this error and everything is deleted...
Edit 2:
Setting ignore_Errors=True in shutil.py:210 yields this error: ``` html Traceback (most recent call last): File "", line 4, in ret = ops.execute() File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/operations.py", line 319, in execute return interp.execute() File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/api/interpreter.py", line 125, in execute ret = conv.write() File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/conv/base.py", line 226, in write self._write_coll_(f, coll) File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/conv/nc.py", line 145, in _write_coll_ self._write_archetype_(arch, ds, is_file_only) File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/conv/nc.py", line 122, in _write_archetype_ arch.write_to_netcdf_dataset(dataset, file_only=is_file_only) File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/interface/base/field.py", line 559, in write_to_netcdf_dataset self.spatial.grid.write_to_netcdf_dataset(dataset, **kwargs) File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/interface/base/dimension/spatial.py", line 946, in write_to_netcdf_dataset self.row.write_to_netcdf_dataset(dataset, **kwargs) File "/usr/anaconda27/lib/python2.7/site-packages/ocgis/interface/base/dimension/base.py", line 442, in write_to_netcdf_dataset dataset.createDimension(bounds_dimension_name, size=2) File "netCDF4.pyx", line 1627, in netCDF4.Dataset.createDimension (netCDF4.c:24292) File "netCDF4.pyx", line 2195, in netCDF4.Dimension.__init__ (netCDF4.c:31380) TypeError: expected string or Unicode object, NoneType found ```
Edit 3:
OK, so this issue is somehow related to the interpolate_spatial_bounds attrib. And with that we are back at the beginning...
Edit 4:
I thought I had found the error and traced it back to duplicate records in the UGID variable, however as it turned out, that was not it - there are no duplicates anymore now and the error persists...
Edit 5:
With interpolate_spatial_bounds=True I was able to successfully create a csv (SNIPPET only) right now.
michaelradhuber commented 8 years ago

Hi Ben, thanks for putting this on the to do list. One further question: Am am running this routine with another shapefile too, and interestingly, fiona crashes when opening the shapefile with an Indexerror.

I have read about this isssue in other posts of yours https://github.com/Toblerity/Fiona/issues/255, and since I havel GDAL 2.0 installed as well, I might try to resettle for an older version of GDAL and fiona. Which versions can you recommend to me where GDAL and FIONA cooperate well?

Thanks!

bekozi commented 8 years ago

A few things I've discovered before moving on to your other questions:

  1. The CF method for defining coordinate systems does not have a way to set the datum. You can overload the CRS using this method (I extracted the PROJ4 from the spatial_ref global attribute):
desired_crs_proj4 = '+proj=lcc +lat_1=49 +lat_2=46 +lat_0=47.5 +lon_0=13.33333333333333 +x_0=400000 +y_0=400000 +ellps=bessel +towgs84=577.326,90.129,463.919,5.137,1.474,5.297,2.4232 +units=m +no_defs'
desired_crs = CoordinateReferenceSystem(proj4=desired_crs_proj4)
rd = RequestDataset(uri, crs=desired_crs)
  1. There is a very small municipality (OBJECTID=1137) that does not overlap any points. Hence, with bounds interpolation there is an overlap. I fixed the issue that was causing the "zero-size array" error.
  2. I was also going to say that "GEM_NR" is not unique. I agree it would be good to check uniqueness, but typically all geometries are not loaded at the same time. I will add a ticket for checking this when possible. Thanks for the recommendation.

Moving on...

The returned SpatialCollection/dict object is lacking documentation. I am sorry about this. A quick tour:

# Keys in the collection are geometry unique identifiers.
ugeom = sc.ugeom[20] # A Shapely geometry.
# You can also access the properties (shapefile attributes) associated with the 
# geometry.
properties = sc.properties[20]
# To access the subsetted data:
fields = sc[20]
# Fields are named after the request dataset that created them. Typically this is the
# variable name unless it is provided when creating the request dataset. In your
# case it is "snow".
field = fields['snow']
# Fields have variables.
variable = field.variables['snow']
variable_value = variable.value # A masked NumPy array.
# Fields also contain dimensional information.
field.temporal.value_datetime
field.temporal.value

Let me know if this is not enough to keep you going. I will tag you as wanting better documentation for the low-level interfaces. I want it implemented too! I'm really not sure what's going on with the printing issue. You are able to access the data though, correct?

If possible, can you update your "ocgis" installation to use the "next" branch? (It is stable and tested regularly). I will test your issues with output formats (i.e. CSV) to see if I can reproduce them.

I have read about this isssue in other posts of yours Toblerity/Fiona#255, and since I havel GDAL 2.0 installed as well, I might try to resettle for an older version of GDAL and fiona. Which versions can you recommend to me where GDAL and FIONA cooperate well?

There are a couple solutions. I hear the fiona master branch has GDAL 2.0 support: https://github.com/Toblerity/Fiona/issues/259#issuecomment-170603832.

If you are using Anaconda, the IOOS channel has good builds for OCGIS-like dependencies: https://anaconda.org/ioos.

Personally, I have been using Fiona with GDAL 1.11.

Did I cover everything for the moment? Let me know if I missed any of your questions!

michaelradhuber commented 8 years ago

Thanks, that will keep me going for the moment. Big thanks for your great support! :+1: /M

bekozi commented 8 years ago

A couple quick questions if you don't mind. I am noticing some inconsistencies in the handling of your coordinate system during coordinate system updates. I am working on tracking those down. Everything seems good at the ocgis API level, so I am looking into fiona and osgeo.osr.

  1. Do you expect the snow data to completely overlap the municipalities? On a quick QGIS visualization, some municipalities in the north are not covered. It also appears the maximum y-dimension extent on the netCDF does not extend to the northern edge of the municipalities. screenshot from 2016-01-11 09 48 35
  2. Ideally, do the "snow.nc" and municipalities shapefile have the exact same coordinate system?

I'll provide you updates as I learn more...

Some other updates/thoughts:

  1. Subsetting to a netCDF with multiple geometries causes the subset geometries to be unioned (i.e. agg_selection=True, http://ncpp.github.io/ocgis/api.html?highlight=agg_selection#agg-selection). Groups in netCDF files seems like a good way to handle multiple geometries, but definitely not implemented yet.
  2. I was able to write an entire CSV file with no errors. The shutil errors are worrisome, but I have never seen them before in the context of this software. I'm guessing the overwrite issues disappear if you remove the output directories manually.
michaelradhuber commented 8 years ago

I have had these issues too in QGIS, that the shapefile and netcdf don't match. But they should. QGIS reads the shapefile just fine, but has always had troubles with reading the NETcdf file.

The problem is not that these two don't overlap in QGIS, it is the netcdf being moved a little down from its original projection in QGIS. I do not know why, as to my understanding, the coordinates are set up correctly. I think the problem is QGIS related, but I need to check once more. The netcdf displayed better in Panoply. I have to find a "Europe" overlay for panoply to see, if it firts the country boundaries well there.

I would definitely welcome a "groups" variable in the netcdf output corresponding to the different subsets created with help of the shapefile - however it is no must since OCGIS so far is able to produce everything I need by itself and I do not need to run over the netcdf files by myself anymore.

michaelradhuber commented 8 years ago

Btw, is your fiona able to read in that shapefile?? Attached: ski.zip

shpfile = '...'
c = fiona.open(shpfile, 'r')
next(c)
michaelradhuber commented 8 years ago

panoply-netcdf-shapefile-overlay-adm-boundaries-etrs89-source-eurostat The problem does definitely seem related to QGIS and its netcdf capabilities. I suspect that qgis has problems with the cf conventions for projections. See above the netcdf I have sent you in panoply with an overlay of eu boundaries (shapefile from eurostat, etrs89)

bekozi commented 8 years ago

Btw, is your fiona able to read in that shapefile??

Yes, it reads okay. I am using fiona v1.6.3 from that IOOS Anaconda channel.

$ gdal-config --version
1.11.3
$ python -c "import fiona; print fiona.__version__"
1.6.3

The problem does definitely seem related to QGIS and its netcdf capabilities. I suspect that qgis has problems with the cf conventions for projections. See above the netcdf Is ent you in panoply with an overlay of eu boundaries (shapefile from eurostat, etrs89)

Cool. Thanks for the image. I agree it must be something with QGIS. I will let you know if I learn anything interesting.

michaelradhuber commented 8 years ago

Yes, it reads okay. I am using fiona v1.6.3 from that IOOS Anaconda channel.

I changed to your fiona and gdal versions and it would read in fine now. Before it failed with the known index error. This is once more confirmation of the issues between fiona > 1.5 and gdal 2.0.

bekozi commented 8 years ago

Fixed: https://github.com/NCPP/ocgis/commit/e99e05dc969522c976cd2b82aa78eb086e25f18b

@michaelradhuber I'm going to close this. Let me know if anything else comes up!