NCPP / ocgis

OpenClimateGIS is a set of geoprocessing and calculation tools for CF-compliant climate datasets.

Unicode characters #496

Closed aaschwanden closed 5 years ago

aaschwanden commented 5 years ago

The error below may be related to issue #446, but it occurs in a different part of the code.

I'm extracting a subdomain from a netCDF file. I don't think the netCDF metadata uses Unicode, but I know that an attribute in the shapefile used to cut out the subdomain contains Unicode characters.

I can provide a minimum example netCDF and SHP file to reproduce the error.

Extracting glacier Upernavik Isstrøm C with UGID 214
/home/aaschwanden/.local/lib/python2.7/site-packages/ocgis-2.1.1-py2.7.egg/ocgis/variable/attributes.py:60: OcgWarning: UnicodeError encountered when converting the value of attribute with name 'frontal_melt.routing.parameter_a_units' to a string. Sending the value to the netCDF API
  warn(OcgWarning(msg))
/home/aaschwanden/.local/lib/python2.7/site-packages/ocgis-2.1.1-py2.7.egg/ocgis/variable/attributes.py:60: OcgWarning: UnicodeError encountered when converting the value of attribute with name 'frontal_melt.routing.parameter_b_units' to a string. Sending the value to the netCDF API
  warn(OcgWarning(msg))
Traceback (most recent call last):
  File "/home/aaschwanden/base/gris-analysis/basins/extract_glacier.py", line 154, in <module>
    extract_glacier_ugid(gl_name, ugid)
  File "/home/aaschwanden/base/gris-analysis/basins/extract_glacier.py", line 90, in extract_glacier_ugid
    ret = ops.execute()
  File "/home/aaschwanden/.local/lib/python2.7/site-packages/ocgis-2.1.1-py2.7.egg/ocgis/ops/core.py", line 313, in execute
    return interp.execute()
  File "/home/aaschwanden/.local/lib/python2.7/site-packages/ocgis-2.1.1-py2.7.egg/ocgis/ops/interpreter.py", line 134, in execute
    ret = conv.write()
  File "/home/aaschwanden/.local/lib/python2.7/site-packages/ocgis-2.1.1-py2.7.egg/ocgis/conv/base.py", line 229, in write
    _write_source_meta_(path, ops)
  File "/home/aaschwanden/.local/lib/python2.7/site-packages/ocgis-2.1.1-py2.7.egg/ocgis/conv/base.py", line 345, in _write_source_meta_
    metadata = element.driver.get_dump_report()
  File "/home/aaschwanden/.local/lib/python2.7/site-packages/ocgis-2.1.1-py2.7.egg/ocgis/driver/base.py", line 395, in get_dump_report
    lines += get_dump_report_for_group(group_metadata, global_attributes_name=global_attributes_name, indent=indent)
  File "/home/aaschwanden/.local/lib/python2.7/site-packages/ocgis-2.1.1-py2.7.egg/ocgis/driver/base.py", line 1025, in get_dump_report_for_group
    lines.append(attr_template.format(key, key2, format_attribute_for_dump_report(value2)))
  File "/home/aaschwanden/.local/lib/python2.7/site-packages/ocgis-2.1.1-py2.7.egg/ocgis/driver/base.py", line 990, in format_attribute_for_dump_report
    ret = '"{}"'.format(attr_value)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2212' in position 1: ordinal not in range(128)
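For readers on Python 3, the failure mode in the last frame can be reproduced roughly as follows. Under Python 2.7, formatting a unicode value into a byte-string template most likely forces an implicit ASCII encode, which is what the explicit `.encode("ascii")` call below imitates. The attribute value `"s\u22121"` is a made-up example (it is not taken from the reporter's file); it places U+2212 MINUS SIGN at position 1, matching the position in the error message.

```python
# Hypothetical attribute value: "s" followed by U+2212 MINUS SIGN and "1".
# The real value in the reporter's netCDF file is not shown in the thread.
attr_value = "s\u22121"

error = None
try:
    # Python 2's '"{}"'.format(unicode_value) on a byte string implicitly
    # encodes to ASCII; in Python 3 the encode has to be spelled out.
    attr_value.encode("ascii")
except UnicodeEncodeError as exc:
    error = exc
    print("{}: {}".format(type(exc).__name__, exc))
```

The exception object carries the codec name and the offending position, which is usually enough to locate the non-ASCII character in the attribute.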

bekozi commented 5 years ago

Thanks for reporting @aaschwanden. Do you mind pulling branch i496-unicode-error to see if the problem is fixed? Like you noticed, this is in a different part of the code that has caught unicode stuff before.

In the new branch code, the formatter will now just emit a warning and convert the value to the empty string when this error is encountered. If this has negative effects on your metadata, we may have to look into this a bit further!
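The behaviour described above (warn, then fall back to an empty string) could look something like the sketch below. This is not the code from the i496-unicode-error branch; the function name `format_attribute_sketch`, the warning text, and the `encoding` parameter are all assumptions made for illustration, and the sketch targets Python 3 rather than the Python 2.7 shown in the traceback.

```python
import warnings


def format_attribute_sketch(attr_value, encoding="ascii"):
    """Return a quoted attribute value for a text report.

    Hypothetical helper, not the ocgis implementation: if the value
    cannot be represented in `encoding`, warn and use an empty string.
    """
    try:
        text = str(attr_value)
        # Probe whether the target encoding can represent the value.
        text.encode(encoding)
    except UnicodeError:
        warnings.warn(
            "UnicodeError encountered when formatting an attribute value; "
            "using an empty string instead."
        )
        text = ""
    return '"{}"'.format(text)


print(format_attribute_sketch("kg m-2"))    # plain ASCII passes through
print(format_attribute_sketch("s\u22121"))  # falls back and emits a warning
```

The trade-off is the one mentioned in the comment: the report no longer crashes, but the affected attribute value is lost from the dumped metadata.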

aaschwanden commented 5 years ago

@bekozi: This fixes the error for me. Thanks. Whether it has an adverse effect on my metadata I can't tell yet, but it's unlikely.

In my own scripts, I sometimes use the Python module "unidecode" to convert Unicode to ASCII, as I want to avoid filenames with Unicode characters.
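A minimal sketch of that approach, assuming the third-party `unidecode` package is installed. The helper name `ascii_filename` and the space-to-underscore step are illustrative choices, not something from the scripts mentioned here. If `unidecode` is unavailable, the sketch falls back to a cruder standard-library method that simply drops characters with no ASCII decomposition (for example "ø"), so the two paths do not give identical results.

```python
import unicodedata

try:
    # Third-party package; transliterates Unicode text to plain ASCII.
    from unidecode import unidecode
except ImportError:
    unidecode = None


def ascii_filename(name):
    """Return an ASCII-only, space-free version of `name` (hypothetical helper)."""
    if unidecode is not None:
        text = unidecode(name)
    else:
        # Fallback: decompose accented characters, then drop anything
        # that still is not ASCII. Cruder than unidecode.
        decomposed = unicodedata.normalize("NFKD", name)
        text = decomposed.encode("ascii", "ignore").decode("ascii")
    return text.replace(" ", "_")


print(ascii_filename("Upernavik Isstrøm C"))
```

Transliterating only the filename leaves the original glacier name intact in the data itself, which keeps the attribute values faithful to the shapefile.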

Thanks a bunch.

bekozi commented 5 years ago

Excellent! This fix is now in master. Thanks for the tip about unidecode. Using it could make sense if these unicode issues keep popping up.