scottyhq opened 4 years ago
We inherit the `_ipython_display_` method from intake's `CatalogEntry`. My guess is that there are some bits of metadata on the STAC object that are not parsable by IPython's JSON parser. We should be able to override this behavior (or fix this upstream with sat-stac).
So, the error is due to `datetime` objects in the metadata. Adding this to the above code results in a more informative error:

```python
import json
json.dumps(entry.metadata)
```
```pytb
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-d97249792b93> in <module>
     13 import json
     14 #json.dumps(entry)
---> 15 json.dumps(entry.metadata)

/srv/conda/envs/notebook/lib/python3.7/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    229         cls is None and indent is None and separators is None and
    230         default is None and not sort_keys and not kw):
--> 231         return _default_encoder.encode(obj)
    232     if cls is None:
    233         cls = JSONEncoder

/srv/conda/envs/notebook/lib/python3.7/json/encoder.py in encode(self, o)
    197         # exceptions aren't as detailed.  The list call should be roughly
    198         # equivalent to the PySequence_Fast that ''.join() would do.
--> 199         chunks = self.iterencode(o, _one_shot=True)
    200         if not isinstance(chunks, (list, tuple)):
    201             chunks = list(chunks)

/srv/conda/envs/notebook/lib/python3.7/json/encoder.py in iterencode(self, o, _one_shot)
    255                 self.key_separator, self.item_separator, self.sort_keys,
    256                 self.skipkeys, _one_shot)
--> 257         return _iterencode(o, 0)
    258
    259 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,

/srv/conda/envs/notebook/lib/python3.7/json/encoder.py in default(self, o)
    177
    178         """
--> 179         raise TypeError(f'Object of type {o.__class__.__name__} '
    180                         f'is not JSON serializable')
    181

TypeError: Object of type datetime is not JSON serializable
```
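For reference, the failure is easy to reproduce without intake at all, since the stdlib `json` encoder simply has no default handling for `datetime` objects (minimal sketch with a stand-in dict, not the actual catalog entry):

```python
import datetime
import json

# Stand-in for entry.metadata: any dict containing a datetime triggers it.
metadata = {"datetime": datetime.datetime(2017, 8, 31, 17, 24, 57)}

try:
    json.dumps(metadata)
except TypeError as err:
    print(err)  # Object of type datetime is not JSON serializable
```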
The metadata looks like this:
```python
{'datetime': datetime.datetime(2017, 8, 31, 17, 24, 57, 555491, tzinfo=tzlocal()),
 'provider': 'Planet',
 'license': 'CC-BY-SA',
 'eo:cloud_cover': 2,
 'eo:gsd': 3.7,
 'eo:sun_azimuth': 145.5,
 'eo:sun_elevation': 64.9,
 'eo:view_angle': 0.2,
 'pl:epsg_code': 32615,
 'pl:ground_control': True,
 'pl:instrument': 'PS2',
 'pl:provider': 'planetscope',
 'bbox': [-95.73737276800716,
  29.561332400220497,
  -95.05332428370095,
  30.157560439570304],
 'geometry': {'type': 'Polygon',
  'coordinates': [[[-95.73737276800716, 30.14525788823348],
    [-95.06532619920118, 30.157560439570304],
    [-95.05332428370095, 29.57334931237589],
    [-95.7214758280382, 29.561332400220497],
    [-95.73737276800716, 30.14525788823348]]]},
 'date': datetime.date(2017, 8, 31),
 'catalog_dir': ''}
```
The following can fix the issue, but I'm confused as to where this should go in the codebase:
```python
import datetime

def convert_datetime(o):
    # Render datetime/date objects as strings; pass everything else through.
    if isinstance(o, (datetime.datetime, datetime.date)):
        return str(o)
    return o

md = entry.metadata
clean = {k: convert_datetime(v) for k, v in md.items()}
```
Maybe @ian-r-rose has a suggestion based on this intake pull request https://github.com/intake/intake/pull/327
At what point is JSON encoding required? I'm pretty sure that YAML has no problem with this.
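That checks out on the YAML side: PyYAML's safe dumper ships with a native representer for `datetime` objects, so no custom encoder is needed there (sketch with an illustrative dict, assuming PyYAML is installed):

```python
import datetime
import yaml  # PyYAML

# safe_dump has built-in representers for datetime.datetime and datetime.date,
# so this succeeds where json.dumps raises TypeError.
doc = yaml.safe_dump({"datetime": datetime.datetime(2017, 8, 31, 17, 24, 57)})
print(doc)
```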
Interesting, I hadn't considered that there would be datetime objects in the metadata.
@martindurant when I added the custom `__repr__` I used JSON for the mimetype, knowing that it is widely supported by a variety of frontends (JupyterLab, nteract, etc.).
Is there some fuller accounting of the non-JSON-able types that might pop up in the metadata? If not, the basic workaround that @scottyhq points to seems reasonable to me, if a bit fragile. We could provide a default serializer function such as `str()` to the JSON serialization to try to cover all possible objects that might be in the metadata.
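That default-serializer idea can be sketched in one line: `json.dumps` only consults `default=` for objects it can't encode natively, so `str` acts as a catch-all fallback (illustrative dict, not the real catalog entry):

```python
import datetime
import json

metadata = {
    "datetime": datetime.datetime(2017, 8, 31, 17, 24, 57),
    "eo:cloud_cover": 2,
}

# default=str is called only for non-JSON-serializable values,
# so ordinary ints/strings pass through untouched.
print(json.dumps(metadata, default=str))
```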
msgpack and yaml are the serialisers of reference in intake; but no, there is no list of expected metadata contents, and some drivers might choose to store more complex things if they are not to be serialised at all.
Thanks for the input. As far as I can tell we are using satstac to read JSON metadata in which datetimes start as strings (`"datetime": "2019-10-31T19:02:13.439292+00:00"`) but get converted to `datetime` objects - @matthewhanson can confirm: https://github.com/pangeo-data/intake-stac/blob/d2c3f01b2e9931da7b1d87aaa67c7e0c107c5fc7/intake_stac/catalog.py#L207
So another short-term solution is just to not convert the strings for intake metadata.
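Leaving the values as ISO 8601 strings would sidestep the encoder entirely, and the round trip is lossless either way: `datetime.isoformat()` reproduces the original STAC string form exactly (a sketch of the round trip, not intake-stac's actual code path):

```python
import datetime

# A STAC datetime arrives as an ISO 8601 string...
raw = "2019-10-31T19:02:13.439292+00:00"

# ...satstac-style parsing turns it into a timezone-aware datetime object...
parsed = datetime.datetime.fromisoformat(raw)  # Python 3.7+

# ...and isoformat() recovers the JSON-safe string unchanged.
assert parsed.isoformat() == raw
```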
Running into an error when outputting an `intake.catalog.local.LocalCatalogEntry` in a Jupyter notebook. `print(entry)` works, but `display(entry)` raises:

```
ValueError: Can't clean for JSON
```

Pinging @jhamman and @martindurant for help sorting this one out. I think it's likely a simple fix.