Unidata / siphon

Siphon - A collection of Python utilities for retrieving atmospheric and oceanic data from remote sources, focusing on being able to retrieve data from Unidata data technologies, such as the THREDDS data server.
https://unidata.github.io/siphon
BSD 3-Clause "New" or "Revised" License

Fixes for TDS 5.0 #112

Open dopplershift opened 7 years ago

dopplershift commented 7 years ago
WARNING:root:No parser found for element TimeUnit
WARNING:root:No parser found for element AltitudeUnits
WARNING:root:attribute type uint not understood.
WARNING:root:attribute type uint not understood.
WARNING:root:attribute type uint not understood.
WARNING:root:attribute type uint not understood.
WARNING:root:attribute type uint not understood.
WARNING:root:attribute type uint not understood.
WARNING:root:attribute type uint not understood.
WARNING:root:attribute type uint not understood.
WARNING:root:attribute type uint not understood.
dopplershift commented 1 year ago

Need to prioritize this given that TDS 5 is now the stable release and Unidata production servers are now using it.

For instance, today running this:

from siphon.catalog import TDSCatalog
cat = TDSCatalog('https://thredds.ucar.edu/thredds/catalog/grib/NCEP/RAP/CONUS_13km/catalog.xml')
ds = cat.latest
ncss = ds.subset()

gives:

Cannot convert "[337]" to int. Keeping type as str.
Cannot convert "[451]" to int. Keeping type as str.
Cannot convert "[]" to int. Keeping type as str.
Cannot convert "[22]" to int. Keeping type as str.
Cannot convert "[1]" to int. Keeping type as str.
Cannot convert "[1]" to int. Keeping type as str.
Cannot convert "[1]" to int. Keeping type as str.
Cannot convert "[6]" to int. Keeping type as str.
Cannot convert "[21]" to int. Keeping type as str.
Cannot convert "[1]" to int. Keeping type as str.
Cannot convert "[1]" to int. Keeping type as str.
Cannot convert "[1]" to int. Keeping type as str.
Cannot convert "[37]" to int. Keeping type as str.
Cannot convert "[2]" to int. Keeping type as str.

(cc @haileyajohnson)

zoj613 commented 1 year ago

I'm glad I'm not the only one getting this. It is causing my unit tests to fail because the resulting lat/lon coordinates are strings, and strings can't be compared to int/float values. Is there a way around this unexpected behaviour?

dopplershift commented 1 year ago

I'm sure it's a minor parsing problem with dataset.xml. In your unit tests, are you examining the parsed dataset.xml?

I'm unlikely to get to this in the short term. Depending on your urgency, you may want to consider submitting a pull request fixing it.

JimiC commented 1 year ago

The problem occurs because the shape attribute is represented as a stringified array (e.g. '[451]'). The root cause starts in https://github.com/Unidata/siphon/blob/198cbb4327fed92386d6df49ff9a2843ed5ee5a2/src/siphon/ncss_dataset.py#L340

Example: see the axis elements in https://thredds.ucar.edu/thredds/ncss/grid/grib/NCEP/GFS/Global_0p25deg/Best/dataset.xml

P.S. A nice improvement at this line would be to parse the type attribute of axis to determine the value type, like typed_vals = self._types.handle_typed_values(axis['shape'], 'shape', axis['type']). You may also want to use a fallback in case axis['type'] does not exist.

Then on https://github.com/Unidata/siphon/blob/198cbb4327fed92386d6df49ff9a2843ed5ee5a2/src/siphon/ncss_dataset.py#L78 and https://github.com/Unidata/siphon/blob/198cbb4327fed92386d6df49ff9a2843ed5ee5a2/src/siphon/ncss_dataset.py#L83 the re.split result cannot be converted, producing the warning message.

If it is replaced with re.split("[ ,]", re.sub(r"^\[|\]$", "", val)), the problem is solved.
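
As a minimal sketch of that replacement in isolation: the helper names `split_values` and `to_ints` below are hypothetical and are not part of siphon, and dropping empty tokens is an added assumption so that a value like "[]" yields an empty list instead of failing on int("").

```python
import re


def split_values(val):
    """Split a space- or comma-separated string into tokens.

    A leading '[' and a trailing ']' are stripped first, so a
    stringified array such as '[451]' is treated like '451'.
    """
    stripped = re.sub(r"^\[|\]$", "", val)
    # Assumption: empty tokens (from '[]' or from ', ' separators) are dropped.
    return [tok for tok in re.split("[ ,]", stripped) if tok]


def to_ints(val):
    """Convert every token of val to int (hypothetical helper)."""
    return [int(tok) for tok in split_values(val)]


if __name__ == "__main__":
    for raw in ("[337]", "[451]", "[]", "1 2,3"):
        print(repr(raw), "->", to_ints(raw))
```

With this approach, plain values without brackets are split exactly as before, so the change should only affect the bracketed form seen in the warnings.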

Because I'm not familiar with the code-base, I won't attempt a PR since the maintainers may have a better idea how to tackle this issue.

I'm just pointing out where it fails.