Closed morganics closed 4 years ago
If you look at this file: https://github.com/nteract/scrapbook/blob/master/scrapbook/encoders.py — arrow is commented out for some reason.
Yes, today you have to convert to/from JSON, which has lots of problems.
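To make the JSON pain concrete, here is a small sketch (the variable names are illustrative) of what happens when a DataFrame with a datetime column is round-tripped through JSON text, roughly the way a JSON-based scrap is stored: dtypes and the index are not preserved.

```python
import json
import pandas as pd

# A frame with a datetime column and an integer column
df = pd.DataFrame({"ts": pd.to_datetime(["2021-01-01", "2021-01-02"]),
                   "n": [1, 2]})

# Round-trip through JSON text, roughly what a JSON-based scrap stores
payload = json.loads(df.to_json())
restored = pd.DataFrame(payload)

# dtypes are not preserved: the datetime column comes back as
# epoch-millisecond integers, and the index keys become strings
```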
This PR: https://github.com/nteract/scrapbook/pull/37 adds pandas and arrow dataframe support, but I had put it on hold for other work. I've since wrapped up those other tasks, so this is the next PR/feature I will be working on in the near future to get released.
Just a quick note here - couldn't this be relatively quickly solved by using an encoder such as this:
import pandas as pd

class DataFrameEncoder(object):
    def encode(self, scrap):
        # scrap.data is any type, usually specific to the encoder name
        scrap = scrap._replace(data=scrap.data.to_dict())
        return scrap

    def decode(self, scrap):
        # scrap.data is one of [None, list, dict, *six.integer_types, *six.string_types]
        scrap = scrap._replace(data=pd.DataFrame.from_dict(scrap.data))
        return scrap

# encoder_registry is scrapbook's encoder registry
encoder_registry.register('pandas', DataFrameEncoder())
This allows me to do sb.glue('mydf', df, 'pandas')
Maybe not robust against edge cases or particularly elegant, but it could be a start?
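For anyone who wants to try the pattern outside a notebook, here is a self-contained sketch of the round trip. The Scrap namedtuple below is a hypothetical stand-in for scrapbook's real scrap record, just so the example runs on its own:

```python
from collections import namedtuple
import pandas as pd

# Hypothetical stand-in for scrapbook's scrap record (simplified here
# so the example is self-contained)
Scrap = namedtuple("Scrap", ["name", "data", "encoder"])

class DataFrameEncoder(object):
    def encode(self, scrap):
        # scrap.data is a DataFrame; store it as a plain dict
        return scrap._replace(data=scrap.data.to_dict())

    def decode(self, scrap):
        # scrap.data is the dict produced by encode
        return scrap._replace(data=pd.DataFrame.from_dict(scrap.data))

enc = DataFrameEncoder()
df = pd.DataFrame({"a": [1, 2], "b": [3.0, 4.0]})
scrap = Scrap("mydf", df, "pandas")
round_tripped = enc.decode(enc.encode(scrap))
```

Simple numeric frames survive this to_dict/from_dict round trip; richer dtypes (datetimes, categoricals) are where it starts to fall apart.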
This functionality was merged to master last week, just not released yet -- master uses pyarrow to encode the dataframe, which is a little better than the to_dict/from_dict round trip.
Implemented in this PR: https://github.com/nteract/scrapbook/pull/62 (closing, as the issue should be resolved)
Surprised that (despite the documentation) support for dataframes doesn't seem to be available. According to the docs you can use the 'arrow' format, but in the code there are a couple of exceptions stating that arrow support is not currently available. I've fallen back to the JSON datatype to save, which is obviously not good for larger artifacts.