n-riesco / ijavascript

IJavascript is a javascript kernel for the Jupyter notebook

DataFrame Exploring with iJavaScript? #248

Open darkbluestudios opened 3 years ago

darkbluestudios commented 3 years ago

Sorry if this is a basic question; I just haven't found much that speaks to it.

Within ipython / standard jupyter, there are jupyter widgets for visualizing a dataframe (like out of the box through a table visualization, or like with lux for working with altair / vega / d3 visualizations)

I was curious - are there any thoughts on how others have done data exploration within iJavaScript?

darkbluestudios commented 3 years ago

It feels like I'm just not searching for the right term or I'm missing something fundamental.

I understand this is more of a discussion topic than an issue. If there is a better place to ask, I'd be curious to learn more.


I'm a little confused about how others render tables, or whether it's possible to make them interactive, like qgrid.

I made a simple function (through mapping / reducing) that converts the data to an HTML table that I show through $$.html, but if I want to change the sorting or render different columns, I have to change the code, which sends it to the display to re-render.

This is great for some kinds of analysis because it's preserved for when I run the notebook again, but are there other options others have used?
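
For anyone following along, the map/reduce approach described above might look roughly like the sketch below. The helper names (escapeHtml, toHtmlTable) and the sample rows are made up for illustration and are not from the original notebook; only $$.html is IJavascript's own API.

```javascript
// Escape characters that have special meaning in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: build an HTML table from an array of row objects.
// `columns` selects and orders the columns; `sortBy` is an optional column name.
function toHtmlTable(rows, columns, sortBy) {
  var sorted = sortBy
    ? rows.slice().sort(function (a, b) {
        return (a[sortBy] > b[sortBy]) - (a[sortBy] < b[sortBy]);
      })
    : rows;
  var header = columns
    .map(function (c) { return "<th>" + escapeHtml(c) + "</th>"; })
    .join("");
  var body = sorted
    .map(function (row) {
      var cells = columns
        .map(function (c) { return "<td>" + escapeHtml(row[c]) + "</td>"; })
        .join("");
      return "<tr>" + cells + "</tr>";
    })
    .join("");
  return "<table><thead><tr>" + header + "</tr></thead><tbody>" + body + "</tbody></table>";
}

// Illustrative sample data (an assumption, not from the thread).
var sampleRows = [
  { id: 2, first_name: "Julia" },
  { id: 1, first_name: "Louise" },
];
var tableHtml = toHtmlTable(sampleRows, ["id", "first_name"], "id");

// `$$` only exists inside an IJavascript session, so guard the call.
if (typeof $$ !== "undefined") {
  $$.html(tableHtml);
}
```

Changing the sort column or the column list still means re-running the cell, which is the limitation being asked about.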

darkbluestudios commented 3 years ago

I see there are great options within React, like @nteract/data-explorer, but it seems to be available only within Python.

n-riesco commented 3 years ago

@darkbluestudios This is the right place to ask questions.

The answer to your question (especially, if it involves widgets or non-standard MIME types) depends on which frontend you're using.

Widgets also require comm-* messages, which IJavascript hasn't implemented yet (#100).

A solution using MIME types that currently works with nteract is ijavascript-plotly. E.g.:

var Plotly = require("ijavascript-plotly");

Plotly([{y: [10, 30, 20]}], {title: "Plotly from IJavascript"});

Screenshot from 2021-08-09 10-23-05

var values = [
      ['Salaries', 'Office', 'Merchandise', 'Legal', '<b>TOTAL</b>'],
      [1200000, 20000, 80000, 2000, 12120000],
      [1300000, 20000, 70000, 2000, 130902000],
      [1300000, 20000, 120000, 2000, 131222000],
      [1400000, 20000, 90000, 2000, 14102000]]

var data = [{
  type: 'table',
  header: {
    values: [["<b>EXPENSES</b>"], ["<b>Q1</b>"],
                 ["<b>Q2</b>"], ["<b>Q3</b>"], ["<b>Q4</b>"]],
    align: "center",
    line: {width: 1, color: 'black'},
    fill: {color: "grey"},
    font: {family: "Arial", size: 12, color: "white"}
  },
  cells: {
    values: values,
    align: "center",
    line: {color: "black", width: 1},
    font: {family: "Arial", size: 11, color: ["black"]}
  }
}]

Plotly(data);

Screenshot from 2021-08-09 10-24-47

Under the hood, ijavascript-plotly uses IJavascript's API for custom outputs.
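
As a rough sketch of what such a custom output could look like: the snippet below sends a Plotly figure through $$.mime directly. It assumes the frontend renders the MIME type application/vnd.plotly.v1+json (the type nteract's Plotly transform is understood to accept); whether ijavascript-plotly does exactly this internally is not confirmed here, and plotlyBundle is a made-up helper name.

```javascript
// Hypothetical helper: wrap Plotly traces and a layout in a MIME bundle.
// The MIME type is an assumption about what the frontend can render.
function plotlyBundle(data, layout) {
  return {
    "application/vnd.plotly.v1+json": { data: data, layout: layout || {} },
  };
}

var bundle = plotlyBundle(
  [{ y: [10, 30, 20] }],
  { title: "Plotly from IJavascript" }
);

// `$$` only exists inside an IJavascript session, so guard the call.
if (typeof $$ !== "undefined") {
  $$.mime(bundle);
}
```

A frontend that does not know this MIME type will not draw the chart, which is why the answer depends on the frontend in use.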

I've just tried https://github.com/nteract/data-explorer and it looks like there is a bug and/or I'm not using the spec correctly:

var table = {
  "profile": "tabular-data-resource",
  "name": "resource-name",
  "data": [
    {
      "id": 1,
      "first_name": "Louise"
    },
    {
      "id": 2,
      "first_name": "Julia"
    }
  ],
  "schema": {
    "fields": [
      {
        "name": "id",
        "type": "integer"
      },
      {
        "name": "first_name",
        "type": "string"
      }
    ],
    "primaryKey": "id"
  }
};

$$.mime({
    "application/vnd.dataresource+json": table,
});

Screenshot from 2021-08-09 10-30-45

paulroth3d commented 3 years ago

Thank you @n-riesco for the thoughts here. I think I will need to think a bit on this.

(Unfortunately, even with some successes, I couldn't get nteract to work, and I'm still having challenges getting modules to work, even for something like d3. Running something like import('d3').then((module) => {global.d3 = module.d3}); per the feedback in #239 complains that no callback was specified, nor can I get esm working as in #210, so I feel like I am just going to give up there.)

I've been playing a bit more with vega-lite, and hadn't considered Plotly. I'll further check it out. Seems to work just fine.

If anyone else might be interested, I put an example on how to use Vega-Lite here

Full Example Notebook here

DanfoJS also seems to be an interesting choice. It works with dataframes and also renders.


So there are a few libraries that I've found for working with DataFrames (like d3 or zebras), but not all of them provide datatable rendering.

I found recently this post on pandas equivalents within javascript with a few interesting ideas.

DanfoJS seems to be quite interesting in how well it works, and also does render out tables.

Other notable mentions, although they don't seem to be as popular, are dataframe-js and the now defunct Pandas-JS library.

n-riesco commented 3 years ago

It sounds like what you're trying to do now is rendering on the server side.

paulroth3d commented 3 years ago

Thank you @n-riesco!

I was hoping to ask your thoughts - instead of rendering on the server, are there other options?

I would like to make something more interactive, say to support pagination / sorting, etc. but it seems like I can:

For d3, I mostly use only the data-frame-type functionality, but for anyone else interested, only d3@6.7 is currently supported within iJavaScript (d3@7 requires modules, and that won't work yet - see the note above).


Here is a simple example I put together to pass the data to the front end (Option A):

$$.html(`
<body>
<ul id='root'></ul>
<button id='btn' >Action</button>
<script>
movies=${JSON.stringify(movies.slice(0, 4))};
getRoot=()=>document.querySelector('#root');
addMovie=(movie)=>{
    const movieEl = document.createElement('li');
    getRoot().append(movieEl);
    movieEl.innerText = movie.Title;
    return movieEl;
}
removeMovies=()=>Array.from(getRoot().children).forEach(el => el.remove());
document.querySelector('#btn').onclick=()=>movies.forEach(addMovie);
</script>
</body>
`)
n-riesco commented 3 years ago

I managed to get an example working with the data explorer:

var table = {
  "schema": {
    "fields": [
      { "name": "index", "type": "integer" },
      { "name": "sepal_length", "type": "number" },
      { "name": "sepal_width", "type": "number" },
      { "name": "petal_length", "type": "number" },
      { "name": "petal_width", "type": "number" },
      { "name": "species", "type": "string" }
    ],
    "primaryKey": ["index"],
  },
  "data": [
    {
      "index": 0,
      "sepal_length": 5.1,
      "sepal_width": 3.5,
      "petal_length": 1.4,
      "petal_width": 0.2,
      "species": "setosa"
    },
    {
      "index": 1,
      "sepal_length": 4.9,
      "sepal_width": 3,
      "petal_length": 1.4,
      "petal_width": 0.2,
      "species": "setosa"
    },
    // ...
  ],
};

$$.mime({
    "application/vnd.dataresource+json": table,
});

Screenshot from 2021-08-14 20-42-15


https://data-explorer.nteract.io/

type Data = {
      schema: { fields: [{ name: string, type: string }...], primaryKey: Array<?string> },
      data: [{ key: value },...]
    }
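
Building on the shape above, a small helper can produce the { schema, data } object from an ordinary array of row objects. This is only a sketch: toDataResource, inferType and inferFieldType are made-up names, the type inference is deliberately simplistic (integer, number, boolean, string), and the result is only rendered if the frontend supports application/vnd.dataresource+json.

```javascript
// Map a single JavaScript value to a (simplified) field type.
function inferType(value) {
  if (typeof value === "number") {
    return Number.isInteger(value) ? "integer" : "number";
  }
  if (typeof value === "boolean") return "boolean";
  return "string";
}

// Pick one type for a whole column; mixed integer/number becomes "number".
function inferFieldType(values) {
  var types = values.map(inferType);
  if (types.every(function (t) { return t === types[0]; })) return types[0];
  if (types.every(function (t) { return t === "integer" || t === "number"; })) {
    return "number";
  }
  return "string";
}

// Hypothetical helper: add an "index" column and derive the schema.
function toDataResource(rows) {
  var data = rows.map(function (row, i) {
    return Object.assign({ index: i }, row);
  });
  var names = data.length > 0 ? Object.keys(data[0]) : [];
  var fields = names.map(function (name) {
    var column = data.map(function (row) { return row[name]; });
    return { name: name, type: inferFieldType(column) };
  });
  return { schema: { fields: fields, primaryKey: ["index"] }, data: data };
}

var resource = toDataResource([
  { sepal_length: 5.1, sepal_width: 3.5, species: "setosa" },
  { sepal_length: 4.9, sepal_width: 3, species: "setosa" },
]);

// `$$` only exists inside an IJavascript session, so guard the call.
if (typeof $$ !== "undefined") {
  $$.mime({ "application/vnd.dataresource+json": resource });
}
```

Column names are taken from the first row, so rows with differing keys would need extra handling.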
n-riesco commented 3 years ago

Re options A and B:

paulroth3d commented 3 years ago

That's amazing @n-riesco - could I ask for a bit more detail on how you got the nteract data-explorer up?

(I also include an example for option A below if that might help the idea come through)

I wrote down the steps and an example notebook here: https://gist.github.com/paulroth3d/1180450ca3e3c7867a0bbdf77dcdf405

but I'm curious what worked for you. (Was there something else needed to register the MIME type application/vnd.dataresource+json that I missed in the docs? Any help would be appreciated.)

--

Regarding Option A

It would have problems with the content security policy if I were referencing an external library (although there are options around that too), but why do you think the CSP would be a problem if Jupyter itself is rendering out the JavaScript?

I put together an example notebook, the simplest I could think of, and it works fine for me in JupyterLab, although not within GitHub gists.

For Option B

Yeah, I was afraid of that. I think I noticed that in another thread. I think the comm messages would be required when going down that route.

frank-zsy commented 2 years ago

@n-riesco thanks for your work. I am quite new to Jupyter notebooks and am trying to render some charts in a notebook.

I also find that I cannot render data-explorer in the latest Docker image of poad/docker-jupyter:nodejs, which uses IJavaScript. Is there anything we need to do to enable rendering of the component?