altair-viz / vega_datasets

A Python package for online & offline access to vega datasets
MIT License

Trouble reading a few datasets #3

Status: Closed (eitanlees closed this issue 6 years ago)

eitanlees commented 6 years ago

I got some errors when trying to read the 'miserables', 'us-10m', and 'world-110m' datasets.

For 'miserables' it read: ValueError: arrays must all be same length

and for 'us-10m' and 'world-110m' it read: ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.

jakevdp commented 6 years ago

Thanks!

These datasets happen to be ones that are not well-represented by a dataframe (miserables is a node network, and us-10m/world-110m are a bunch of geo shapefiles).

It would be useful to have better error messages here, as well as info on how they might be loaded (e.g. import json; json.loads(data.miserables.raw()))
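As a rough illustration of the json.loads suggestion above, here is a minimal sketch of parsing a node-network file by hand. The bytes literal is a tiny made-up stand-in for what data.miserables.raw() would return, and the "nodes"/"links" layout is an assumption about the file, not something confirmed in this thread.

```python
import json

# Made-up stand-in for the bytes returned by data.miserables.raw().
# The "nodes"/"links" layout is assumed here for illustration only.
raw = (
    b'{"nodes": [{"name": "A", "group": 1},'
    b' {"name": "B", "group": 1},'
    b' {"name": "C", "group": 2}],'
    b' "links": [{"source": 1, "target": 0, "value": 1}]}'
)

# json.loads accepts bytes as well as str (Python 3.6+).
graph = json.loads(raw)

nodes = graph["nodes"]
links = graph["links"]

# The two top-level arrays have different lengths, so they cannot be
# lined up as columns of a single table.
print(len(nodes), len(links))
```

Because the top-level arrays differ in length, a plain dictionary is a more natural container for this kind of data than a dataframe.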

jakevdp commented 6 years ago

Question for you: do you think it would make more sense from the user perspective to raise a more informative ValueError for these, or to return a different data object than the user might be expecting? (e.g. data.miserables() could return a dictionary of the parsed JSON).
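The two options in the question above could be sketched roughly as follows. Everything here is hypothetical: NON_TABULAR, to_table, load_strict, and load_flexible are illustrative names, not part of vega_datasets, and to_table merely stands in for building a dataframe.

```python
import json

# Hypothetical set of dataset names that do not fit a table.
NON_TABULAR = {"miserables", "us-10m", "world-110m"}


def to_table(records):
    """Stand-in for building a dataframe from a list of records."""
    return list(records)


def load_strict(name, raw):
    """Option 1: refuse non-tabular datasets with an informative error."""
    if name in NON_TABULAR:
        raise ValueError(
            f"Dataset {name!r} is not tabular; "
            "parse it yourself with json.loads(raw)."
        )
    return to_table(json.loads(raw))


def load_flexible(name, raw):
    """Option 2: return parsed JSON as-is for non-tabular datasets."""
    parsed = json.loads(raw)
    if name in NON_TABULAR:
        return parsed
    return to_table(parsed)


network = b'{"nodes": [], "links": []}'

try:
    load_strict("miserables", network)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)

flexible_result = load_flexible("miserables", network)
table_result = load_flexible("cars", b'[{"a": 1}]')
```

With the first option the caller finds out about the mismatch through an exception; with the second, every listed dataset loads, but the return type depends on the dataset.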

eitanlees commented 6 years ago

hmmm, good question.

I see the value of both options, but personally I would go with "return a different data object". My thinking is the data function should work with all of the available datasets listed.

jakevdp commented 6 years ago

OK, done in d878a0a4eaa6d46ba6d9a58110c280de1755fdaf