sfu-db / dataprep

Open-source low-code data preparation library in Python. Collect, clean and visualize your data in Python with a few lines of code.
http://dataprep.ai
MIT License
2.04k stars 205 forks

Interest in Intake? #293

Open martindurant opened 4 years ago

martindurant commented 4 years ago

(Dask core member here - I found your project because you requested to be included in "powered by dask")

The Intake project provides a data cataloguing and loading layer over many data formats and services. It also contains a rudimentary GUI for browsing those catalogues, and interactively plotting the contents of the contained data sources.

I thought you might be interested in seeing whether there is a possibility for integration of your "connectors" into an Intake catalogue, and of your data exploration tools into the Intake GUI.

jnwang commented 4 years ago

Hi @martindurant,

Thanks for your great suggestion! We have been looking into it over the last few days and will get back to you once we have a good answer. In the meantime, if you could elaborate on the technical details of the integration, that would be much appreciated.

martindurant commented 4 years ago

I see two main aspects:

dovahcrow commented 4 years ago

Hi @martindurant, thanks for the suggestions. Actually, I was thinking of a bi-directional integration with Intake while reading the documentation. Basically, there would be a shim to let DataPrep.Connector read data from Intake and also let Intake read data from Connector.

martindurant commented 4 years ago

I think that's what I meant by referring to using your interactive features with an Intake dataset :) I'm not sure whether the shim would need to be in connector, since Intake will provide you with pandas/dask dataframes already.

martindurant commented 4 years ago

We have an intake community meeting on the first Thursday of each month, if anyone here would like to drop by https://github.com/intake/intake/issues/472

dovahcrow commented 4 years ago

> I think that's what I meant by referring to using your interactive features with an Intake dataset :) I'm not sure whether the shim would need to be in connector, since Intake will provide you with pandas/dask dataframes already.

Currently, DataPrep.Connector only supports sending RESTful API requests to a URL endpoint, so I think there should be a shim that lets Connector use Intake as a data source. On the other hand, I think we can also provide an Intake plugin to load data from DataPrep.Connector.
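
To make the two directions concrete, here is a rough sketch of the shim idea. It does not use the real Intake or DataPrep APIs: `ReadableSource`, `ConnectorLike`, `SourceAsConnector`, `ConnectorAsSource` and `InMemorySource` are all hypothetical names, and rows are plain dicts rather than pandas/dask dataframes, so treat it as an illustration under those assumptions and not as a proposed implementation.

```python
from typing import Any, Dict, Iterable, List, Protocol

Row = Dict[str, Any]


class ReadableSource(Protocol):
    """Hypothetical stand-in for a catalogue data source that can be read whole."""

    def read(self) -> List[Row]: ...


class ConnectorLike(Protocol):
    """Hypothetical stand-in for a connector that queries a named table."""

    def query(self, table: str, **params: Any) -> List[Row]: ...


class SourceAsConnector:
    """Direction 1: connector-style code reads from catalogue-style sources."""

    def __init__(self, sources: Dict[str, ReadableSource]) -> None:
        self._sources = sources

    def query(self, table: str, **params: Any) -> List[Row]:
        rows = self._sources[table].read()
        # No remote endpoint is involved, so equality filters are applied locally.
        return [r for r in rows if all(r.get(k) == v for k, v in params.items())]


class ConnectorAsSource:
    """Direction 2: one fixed connector query is exposed as a readable source."""

    def __init__(self, connector: ConnectorLike, table: str, **params: Any) -> None:
        self._connector = connector
        self._table = table
        self._params = params

    def read(self) -> List[Row]:
        return self._connector.query(self._table, **self._params)


class InMemorySource:
    """Toy source used only to exercise the two adapters."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self._rows = list(rows)

    def read(self) -> List[Row]:
        return list(self._rows)


papers = InMemorySource([{"title": "A", "year": 2019}, {"title": "B", "year": 2020}])
connector = SourceAsConnector({"papers": papers})

recent = connector.query("papers", year=2020)
round_trip = ConnectorAsSource(connector, "papers", year=2019).read()
```

The point of the sketch is only that each side needs a thin adapter around the other side's interface; which package each adapter would live in is still the open question above.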

dovahcrow commented 4 years ago

> We have an intake community meeting on the first Thursday of each month, if anyone here would like to drop by intake/intake#472

Thanks for the invitation! I will personally join the meeting, and other team members may join as well.