ranaroussi / pystore

Fast data store for Pandas time-series data
Apache License 2.0
562 stars, 101 forks

Intake integration #18

Open martindurant opened 5 years ago

martindurant commented 5 years ago

I am pleased to see another use of dask and fastparquet, one that meets the specific needs of your users.

I am also the main contributor to Intake, a one-stop shop for finding datasets in catalogs and loading them without the user having to know anything about the specifics of the data format or service.

Pystore already has some capability for hosting a named set-of-datasets, with metadata, and so would fit very nicely in the Intake ecosystem. Indeed, pandas (or dask) dataframes are one of the built-in container types supported by Intake. Would you be interested in writing an Intake driver interface for pystore? That way, these data could take their place among all the other datasets of various types from various services that may be available in an analyst's session.
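To make the request concrete, here is a rough, dependency-free sketch of the shape such a driver might take. It is an illustration under assumptions, not a working Intake plugin: the class name `PyStoreItemSource`, the driver name `pystore_item`, and the `opener` parameter are hypothetical, and the schema is returned as a plain dict. A real driver would subclass Intake's `DataSource` base class and return Intake's own schema object; the hook names (`_get_schema`, `_get_partition`, `read`, `to_dask`, `_close`) mirror the ones that base class expects. The pystore calls follow the usage shown in the pystore README.

```python
def _open_with_pystore(path, store, collection, item):
    """Default opener: resolve path/store/collection/item to a PyStore item."""
    import pystore  # imported lazily so this sketch loads without PyStore installed

    pystore.set_path(path)
    return pystore.store(store).collection(collection).item(item)


class PyStoreItemSource:
    """Hypothetical adapter exposing one PyStore item via Intake-style hooks."""

    name = "pystore_item"     # hypothetical driver name a catalog entry would use
    version = "0.0.1"
    container = "dataframe"   # Intake's built-in dataframe container type
    partition_access = False  # the whole item is treated as a single partition

    def __init__(self, path, store, collection, item, metadata=None, opener=None):
        self.path = path
        self.store = store
        self.collection = collection
        self.item = item
        self.metadata = dict(metadata or {})
        # `opener` is an assumption added for testability; it defaults to PyStore.
        self._opener = opener or _open_with_pystore
        self._item = None

    def _get_item(self):
        # Open the item on first use only, so building a catalog stays cheap.
        if self._item is None:
            self._item = self._opener(
                self.path, self.store, self.collection, self.item
            )
        return self._item

    def _get_schema(self):
        # Surface the item's stored metadata alongside the catalog metadata.
        extra = dict(getattr(self._get_item(), "metadata", None) or {})
        self.metadata.update(extra)
        return {"npartitions": 1, "extra_metadata": extra}

    def _get_partition(self, i):
        # Single partition: load the whole item as a pandas DataFrame.
        return self._get_item().to_pandas()

    def read(self):
        return self._get_partition(0)

    def to_dask(self):
        # PyStore items expose their lazy dask dataframe as `.data`.
        return self._get_item().data

    def _close(self):
        self._item = None
```

With something along these lines registered as a driver, a catalog entry would only need to name the store, collection, and item, and the user would call `read()` or `to_dask()` without knowing that pystore sits underneath.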

One place where you are ahead of us is snapshotting, a topic we have discussed, but not yet designed or implemented. Comments at https://github.com/intake/intake/issues/382 would be highly appreciated.