libdynd / dynd-python

Python exposure of dynd
http://libdynd.org

Better error reporting on bad datashapes #387

Open mrocklin opened 9 years ago

mrocklin commented 9 years ago

I have the following datashape

ds2 = """ var * {
  currently: {
    apparentTemperature: float64,
    dewPoint: float64,
    humidity: float64,
    icon: string,
    precipIntensity: int64,
    precipProbability: int64,
    pressure: float64,
    summary: string,
    temperature: float64,
    time: int64,
    visibility: int64,
    windBearing: int64,
    windSpeed: float64
    },
  daily: {
    data: 1 * {
      apparentTemperatureMax: float64,
      apparentTemperatureMaxTime: int64,
      apparentTemperatureMin: float64,
      apparentTemperatureMinTime: int64,
      cloudCover: int64,
      dewPoint: float64,
      humidity: float64,
      icon: string,
      moonPhase: float64,
      precipIntensity: float64,
      precipIntensityMax: float64,
      precipIntensityMaxTime: int64,
      precipProbability: float64,
      precipType: string,
      pressure: float64,
      summary: string,
      sunriseTime: int64,
      sunsetTime: int64,
      temperatureMax: float64,
      temperatureMaxTime: int64,
      temperatureMin: float64,
      temperatureMinTime: int64,
      time: int64,
      visibility: float64,
      windBearing: int64,
      windSpeed: float64
      }
    },
  flags: {isd_stations: 5 * string, sources: 1 * string, units: string},
  hourly: {
    data: 24 * {
      apparentTemperature: float64,
      cloudCover: ?int64,
      dewPoint: float64,
      humidity: float64,
      icon: string,
      precipIntensity: float64,
      precipProbability: float64,
      precipType: ?string,
      pressure: float64,
      summary: string,
      temperature: float64,
      time: int64,
      visibility: float64,
      windBearing: int64,
      windSpeed: float64
      },
    icon: string,
    summary: string
    },
  latitude: float64,
  longitude: float64,
  offset: int64,
  timezone: string
  }"""

I load data with this datashape and get an error:

from dynd import nd

with open('weather.json') as f:
    text = f.read()

x = nd.parse_json(ds2, text.replace('isd-stations', 'isd_stations'))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-25-4966077b3c8f> in <module>()
      4     text = f.read()
      5 
----> 6 x = nd.parse_json(ds2, text.replace('isd-stations', 'isd_stations'))

dynd/nd/array.pyx in dynd.nd.array.parse_json (/home/travis/build/libdynd/dynd-python/build/temp.linux-x86_64-3.4/array.cxx:8711)()

ValueError: parse error converting string "0.08" to int64

OK, so one of my int types needs to be a float type. Which one? The error message doesn't name the field.

mrocklin commented 9 years ago

Updated comment above

izaid commented 9 years ago

Okay, so I think there are a few things going on here.

1) I think Datashape and DyND are discovering slightly different types. That may just be a mismatch in expectations between them, or it may be a bug in one or both. I found the following substitutions were needed: cloudCover: int64 -> float64, precipIntensityMaxTime: int64 -> ?int64, and precipType: string -> ?string. There may still be others.

2) Even with these changes the dataset still won't load, because DyND does not currently support ?string, though it (hopefully) will soon.
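
Until the error message names the field, one way to hunt for candidates by hand is to walk the decoded JSON and list every leaf that is a float or a null, then compare those paths against the fields declared as int64 or as non-optional in the datashape. A minimal sketch using only the standard library; `find_float_or_null_leaves` is a hypothetical helper (not part of DyND or Datashape), and the sample JSON is made up for illustration rather than taken from weather.json:

```python
import json


def find_float_or_null_leaves(obj, path=""):
    """Yield (path, value) for every float or null leaf in decoded JSON."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            child = f"{path}.{key}" if path else key
            yield from find_float_or_null_leaves(value, child)
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            yield from find_float_or_null_leaves(value, f"{path}[{i}]")
    elif obj is None or isinstance(obj, float):
        # json.loads decodes "0.08" as float and "1" as int,
        # so integer-valued fields are skipped here.
        yield path, obj


# Invented sample shaped like the "daily" record above.
sample = '{"daily": {"data": [{"cloudCover": 0.08, "precipType": null, "time": 1}]}}'
hits = list(find_float_or_null_leaves(json.loads(sample)))

for path, value in hits:
    print(path, repr(value))
```

This reports every float, including fields that are already float64 in the datashape, so the output still has to be checked against the declared types by eye.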

mwiebe commented 9 years ago

In the JSON parser, the parsing code often catches errors like this and wraps them in a new exception that includes the parsing context (line and column numbers). That wrapping is missing from the particular spot that parses this value.
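
To illustrate the kind of wrapping meant here, a rough Python sketch follows. It is not DyND's actual parser code; `line_col` and `parse_int_at` are hypothetical names, and the input text is invented. The idea is just to catch the low-level conversion error and re-raise it with the position of the offending token:

```python
def line_col(text, offset):
    """Convert a 0-based character offset into 1-based (line, column)."""
    line = text.count("\n", 0, offset) + 1
    line_start = text.rfind("\n", 0, offset) + 1  # 0 when on the first line
    return line, offset - line_start + 1


def parse_int_at(text, start, end):
    """Parse text[start:end] as an integer, adding position info on failure."""
    token = text[start:end]
    try:
        return int(token)
    except ValueError as exc:
        line, col = line_col(text, start)
        raise ValueError(
            f'parse error converting string "{token}" to int64 '
            f"at line {line}, column {col}"
        ) from exc


# Invented input: a float where the declared type is an integer.
text = '{\n  "cloudCover": 0.08\n}'
start = text.index("0.08")

try:
    parse_int_at(text, start, start + 4)
except ValueError as err:
    print(err)
```

With the position attached, the message points straight at the value that failed, which answers the "which one?" question in the original report.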