simonw / datasette

An open source multi-tool for exploring and publishing data
https://datasette.io
Apache License 2.0

Integration with JupyterLab #370

Open psychemedia opened 5 years ago

psychemedia commented 5 years ago

I just watched a demo video for the JupyterLab Chart Editor which wraps the plotly chart editor app in a JupyterLab panel and lets you open a plotly chart JSON file in that editor. Essentially, it pops an HTML app into a panel in JupyterLab, and I think registers the app as a file viewer for a particular file type. (I'm not completely taken by it, tbh, because it means you can do irreproducible things to the chart definition file, but that's another issue).

JupyterLab extensions can also open files from a dialogue as the iframe/html previewer shows: https://github.com/timkpaine/jupyterlab_iframe.

This made me wonder about what datasette integration with JupyterLab might do.

For example, by right-clicking on a CSV file (for which there is already a CSV table view) in the file browser, offer a View / Run as datasette file viewer option that will:

(? Create a new SQLite db for each CSV file and launch each datasette view on a new port? Or have a JupyterLab (session?) SQLite db that stores all datasette viewed CSVs and runs on a single port?)

As a freebie, the datasette API would allow you to run efficient SQL queries against the file, e.g. using pandas.read_sql() queries in a notebook in the same space.
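To make the "single session db" option concrete, here is a minimal sketch using only the Python standard library. It assumes each viewed CSV becomes one table in a shared SQLite database; the helper name `load_csv_into_db`, the table names, and the sample rows are all made up for illustration, and this is not how datasette or csvs-to-sqlite actually implement the import.

```python
import csv
import io
import sqlite3


def quote_identifier(name):
    """Quote a table or column name so spaces and quotes are safe in SQL."""
    return '"' + name.replace('"', '""') + '"'


def load_csv_into_db(conn, table_name, csv_text):
    """Hypothetical helper: load CSV text into one table of a shared SQLite db."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    table = quote_identifier(table_name)
    columns = ", ".join(quote_identifier(col) for col in header)
    placeholders = ", ".join("?" for _ in header)
    # Re-viewing the same CSV replaces the previous copy of its table.
    conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE {table} ({columns})")
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", data)
    conn.commit()


# An in-memory database stands in for the per-session SQLite file.
session_db = sqlite3.connect(":memory:")
load_csv_into_db(session_db, "items", "name,count\nalpha,10\nbeta,3\n")
load_csv_into_db(session_db, "owners", "name,owner\nalpha,ann\nbeta,bob\n")

# Columns are untyped here, so values are stored as text and cast in the query.
busy_items = session_db.execute(
    "SELECT name FROM items WHERE CAST(count AS INTEGER) > 5"
).fetchall()
print(busy_items)
```

In a notebook, the same connection object could be passed to pandas.read_sql() to get the query result back as a dataframe.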

Related:

simonw commented 5 years ago

I've been thinking a bit about ways of using Jupyter Notebook more effectively with Datasette (things like a publish_dataframes(df1, df2, df3) function which publishes some Pandas dataframes and returns you a URL to a new hosted Datasette instance) but you're right, JupyterLab is potentially a much more interesting fit.

psychemedia commented 5 years ago

In terms of integration with pandas, I was pondering two different ways datasette/csvs_to_sqlite integration may work:

The pandas.publish_* idea could be quite interesting though... Would it be useful/fruitful to think about publish_ as a complement to pandas.to_?

psychemedia commented 5 years ago

Another route would be something like creating a datasette IPython magic for notebooks to take a dataframe and easily render it as a datasette. You'd need to run the app in the background rather than block execution in the notebook. Related to that, or to publishing a dataframe in a notebook cell for use in other cells in a non-blocking way, there may be cribs in something like nbmultitask: https://github.com/micahscopes/nbmultitask
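As a rough sketch of the "run it in the background" part, one option is to start the `datasette serve` command as a child process so the notebook cell returns immediately. This assumes the `datasette` CLI is installed and on the PATH; the function names below are hypothetical, and a real magic would also need to wait for the server to come up and clean up on kernel shutdown.

```python
import socket
import subprocess


def find_free_port():
    """Ask the OS for an unused local TCP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


def datasette_command(db_path, port):
    """Build the command line for serving one SQLite file on a given port."""
    return ["datasette", "serve", db_path, "--port", str(port)]


def serve_in_background(db_path):
    """Hypothetical helper: launch datasette without blocking the caller.

    Returns the process handle and the URL the server should be reachable at.
    """
    port = find_free_port()
    process = subprocess.Popen(
        datasette_command(db_path, port),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return process, f"http://127.0.0.1:{port}/"
```

In a notebook this might be used as `process, url = serve_in_background("session.db")`, with `process.terminate()` called once the view is no longer needed. There is a small race between picking the port and datasette binding it, which seems acceptable for a sketch.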

MichaelTiemannOSC commented 2 years ago

Just watched this video which demonstrates the integration of any webapp into JupyterLab: https://youtu.be/FH1dKKmvFtc

Maybe this is the answer?