ioos / compliance-checker

Python tool to check your datasets against compliance standards
http://ioos.github.io/compliance-checker/
Apache License 2.0

UnicodeEncodeError: 'ascii' with cf check #108

Closed lbesnard closed 8 years ago

lbesnard commented 9 years ago

G'Day, I'm taking over from @mhidas for a couple of weeks.

When running the ncchecker with -t=cf on this specific file:

curl -l http://data.aodn.org.au/IMOS/opendap/eMII/checker_test/SOOP/CO2/IMOS_SOOP-CO2_GST_20130113T002246Z_VNAA_FV01.nc -o IMOS_SOOP-BA_AE_20130623T065936Z_VLHJ_FV02_Southern-Surveyor-EK60-38-120_END-20130625T002210Z_C-20140815T061308Z.nc

I get the following output:


Running Compliance Checker on the dataset from: IMOS_SOOP-BA_AE_20130623T065936Z_VLHJ_FV02_Southern-Surveyor-EK60-38-120_END-20130625T002210Z_C-20140815T061308Z.nc
Traceback (most recent call last):
  File "/opt/compliance-checker/cchecker.py", line 27, in <module>
    sys.exit(main())
  File "/opt/compliance-checker/cchecker.py", line 21, in main
    args.criteria)
  File "/opt/compliance-checker/compliance_checker/runner.py", line 43, in run_checker
    score_groups = cs.run(ds, *checker_names)
  File "/opt/compliance-checker/compliance_checker/suite.py", line 84, in run
    dsp                = checker.load_datapair(ds)
  File "/opt/compliance-checker/compliance_checker/base.py", line 77, in load_datapair
    data_object = NetCDFDogma('ds', self.beliefs(), ds, namespaces=namespaces)
  File "/usr/local/lib/python2.7/dist-packages/wicken/dogma.py", line 136, in __call__
    obj = super(MetaReligion, clsType).__call__(religion, beliefs, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/wicken/netcdf_dogma.py", line 45, in __init__
    root = parse_nc_dataset_as_etree(dataObject)
  File "/usr/local/lib/python2.7/dist-packages/petulantbear/netcdf_etree.py", line 443, in parse_nc_dataset_as_etree
    dataset2ncml_buffer(dataset,output)
  File "/usr/local/lib/python2.7/dist-packages/petulantbear/netcdf2ncml.py", line 241, in dataset2ncml_buffer
    parse_var(output, var, indent)
  File "/usr/local/lib/python2.7/dist-packages/petulantbear/netcdf2ncml.py", line 179, in parse_var
    parse_att(output,(attname,var.getncattr(attname)), new_indent)
  File "/usr/local/lib/python2.7/dist-packages/petulantbear/netcdf2ncml.py", line 124, in parse_att
    attvalue=sanatize(att[1])
UnicodeEncodeError: 'ascii' codec can't encode character u'\ufffd' in position 0: ordinal not in range(128)
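
For readers hitting the same message: a minimal sketch of what this error usually means, assuming the attribute value is a unicode string beginning with U+FFFD (the replacement character) that gets converted with the default ASCII codec. The petulant-bear source is not shown in this thread, so `ATT_VALUE` and the helper names below are hypothetical and only illustrate the failure mode, not the library's actual code.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Hypothetical attribute value: U+FFFD followed by plain ASCII text.
ATT_VALUE = u"\ufffdC"


def encode_ascii_strict(value):
    """Encode with the ASCII codec, as an implicit str() conversion does on Python 2."""
    return value.encode("ascii")


def encode_utf8(value):
    """Encode with UTF-8, which can represent every Unicode code point."""
    return value.encode("utf-8")


def try_strict(value):
    """Return the UnicodeEncodeError raised by the strict encode, or None."""
    try:
        encode_ascii_strict(value)
    except UnicodeEncodeError as exc:
        return exc
    return None


err = try_strict(ATT_VALUE)
print(err)                          # same kind of message as in the traceback
print(repr(encode_utf8(ATT_VALUE)))  # UTF-8 bytes, no exception
```

The point is only that the character is valid Unicode but outside the ASCII range, so any code path that falls back to the ASCII codec fails on it, while an explicit UTF-8 encode does not.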

I'm using the latest commit of the master branch https://github.com/aodn/compliance-checker/commit/99b30ddb398e24991a5ba79dae8bc65771acdf71

The checker works fine with other files (which also work fine with ncdump).

thanks

daf commented 9 years ago

Hi @lbesnard, it looks like this may be fixed already in a dependent library, ioos/petulant-bear#4, which I've just noticed has not had a release with that fix in it. That'll happen soon I suppose.

In the meantime, try updating your petulant-bear library to use the latest master and see if that gets you further.

Activate your environment and pip install --upgrade git+git://github.com/ioos/petulant-bear.git - this command may need tweaking.

lbesnard commented 9 years ago

Thanks @daf. @danfruehauf, do you think you could update this on the testing env?

lukecampbell commented 9 years ago

I'll be publishing a new version of petulant bear next week.

daf commented 9 years ago

@lukecampbell poke!

lukecampbell commented 9 years ago

https://github.com/ioos/petulant-bear/releases/tag/v0.1.3

mhidas commented 9 years ago

Thanks @lukecampbell, that seems to fix the encoding problems we've been seeing.

Should petulant-bear v0.1.3 be added to the requirements.txt file, or will one of the other packages listed there depend on this new version?

daf commented 9 years ago

@mhidas, Wicken depends on it and will pull in the latest version on a fresh install. Still, Wicken should be adjusted to require the new version and re-released, and then this requirements.txt bumped to match. That would ensure the fix propagates properly. Leaving this open until that happens.
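
Until the pins are bumped, an existing environment can be checked by hand. This is a rough sketch, assuming the distribution is installed under the name `petulant-bear` and that Python 3.8+ is available for `importlib.metadata` (the traceback above is from Python 2.7, where `pkg_resources` would be the closest equivalent). The helper names are made up for illustration.

```python
from importlib import metadata


def parse_version(text):
    """Turn '0.1.3' or 'v0.1.3' into (0, 1, 3). Handles plain dotted numbers only."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def meets_minimum(installed, minimum="0.1.3"):
    """True if the installed version is at least the release mentioned in this thread."""
    return parse_version(installed) >= parse_version(minimum)


def installed_version(dist_name="petulant-bear"):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("petulant-bear is not installed in this environment")
else:
    print(found, "ok" if meets_minimum(found) else "older than 0.1.3")
```

A check like this only reports what is installed; the lasting fix is still the version bump in Wicken and in requirements.txt described above.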

lukecampbell commented 8 years ago

Fixed with the unicode changes