animint / animint2

Animated interactive grammar of graphics
https://animint.github.io/animint2/

htmlwidgets with animint #1

Closed: faizan-khan-iit closed this issue 5 years ago

faizan-khan-iit commented 7 years ago

One of the goals of animint2 is to get it to work with htmlwidgets and crosstalk. With that in mind I tried to build a minimal dummy package that saves data from R to a json file and then plots it using JS in a webpage, to mirror animint's functionality - plotteR.

Now I have been trying to get it to work with htmlwidgets but I seem to be doing something wrong. I have made the R and JS bindings but I can't get the plot to render in the browser. Since I am not getting any error messages, it is a bit hard to spot the error.

The package plotteR works as follows:

  1. Calling plotter(mtcars, "scatter", list(x="disp", y="hp")) will save a json file named "new_demo_file" in the inst directory.
  2. The plotter.js file has two functions: plotter to plot the data, and loader to load the json file.
  3. When we manually open the a.html page, we see the plotted data.

Now to get it to work with htmlwidgets, I have added the R binding so that the name of the saved file is passed on to the widget, which in turn uses the loader function to bind the data to the element. The problem is, I am not getting any output of any kind. I do not know of a debugger for htmlwidgets, which makes it all the harder.
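One way to look for a silent failure like this is to log from inside the JavaScript binding and watch the browser's developer console. The sketch below assumes a widget named plotter and a payload field called file; both are hypothetical stand-ins for whatever the R binding actually passes. HTMLWidgets.widget() and the factory/renderValue/resize shape are the standard htmlwidgets binding interface; the rest is illustrative.

```javascript
// Sketch of an htmlwidgets JavaScript binding with logging at each step.
// "plotter" is an assumed widget name; it has to match the `name`
// argument given to htmlwidgets::createWidget() on the R side.
var plotterBinding = {
  name: "plotter",
  type: "output",
  factory: function (el, width, height) {
    console.log("plotter: factory called", width, height);
    return {
      // x is the payload that createWidget() serialized to JSON.
      // The `file` field is hypothetical.
      renderValue: function (x) {
        console.log("plotter: renderValue received", x);
        if (!x || !x.file) {
          console.warn("plotter: payload has no `file` field");
          return;
        }
        // Placeholder for the real loading and drawing code.
        el.textContent = "would load " + x.file;
      },
      resize: function (width, height) {
        console.log("plotter: resize", width, height);
      }
    };
  }
};

// HTMLWidgets is a global provided by htmlwidgets.js in the rendered page;
// the guard lets the file also be loaded outside a widget page.
if (typeof HTMLWidgets !== "undefined") {
  HTMLWidgets.widget(plotterBinding);
}
```

If neither log line shows up in the console, the binding was probably never registered or never matched; in that case it is worth checking that the binding name agrees with the name passed to createWidget() and with the file names under inst/htmlwidgets.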

faizan-khan-iit commented 7 years ago

@cpsievert Could you please help me a bit here? I just can't seem to get it to work.

And is it the right approach to just pass on the saved file name? Or would some other approach be better in the case of animint?

cpsievert commented 7 years ago

One of the philosophies of htmlwidgets is that you shouldn't have to explicitly write to disk. It essentially assumes you can store all your data as JSON (via jsonlite::toJSON()) at print time. I forgot how much this is at odds with the animint philosophy :/

That being said, you might be able to hack something together using this approach...try returning the htmlwidget object, as print.htmlwidget/knit_print.htmlwidget does a lot of the hard work for you...

The downside to this hack is that I doubt htmlwidget functions like saveWidget() would ever work, but we could provide our own alternative.

faizan-khan-iit commented 7 years ago

Seems like this could take some effort. Some thoughts:

  1. If we use JSON data directly, do we need to keep all the data in primary memory the whole time? This conflicts with the animint feature where we only load subsets when needed. If we do implement this, will it hurt performance on the big data set examples that we already have?
  2. Is there any way we can save the interactive plots like we do with animint now (in case saveWidget() does not work)? I want to ask this because some of the TODO goals in animint might require this (1, 2). If not, should we continue with this and implement other features later (if possible)?

I will try and see if your approach works. Do you know of some sort of debugger that could help me while implementing this? Would be a great help.

faizan-khan-iit commented 7 years ago

@cpsievert I was a bit busy the past two weeks and I could not work on this. Could you explain a bit about your solution? Do you mean something like:

  htmlwidgets:::print.htmlwidget(createWidget( ... ))

This is not working as of yet. I was trying to get some insight into the print.htmlwidget function, but it seems there are no docs for this. Should I pass the whole data instead of the file name?

cpsievert commented 7 years ago

> Should I pass the whole data instead of the file name?

Passing the whole data would make the implementation less complicated and more "htmlwidgets-like", but I'm not sure how much work it is and how it would affect performance...

faizan-khan-iit commented 7 years ago

@cpsievert Have you ever tried visNetwork? I was just playing around with it the other day and it seemed a bit slow. I was plotting some network data, dim about 10000 * 2. It took around a minute to plot the network. That might also be because of vis.js. I can't really say.

I think we will need some significant modifications to the current code base if we want to use the widget with this approach. We could write some functions to fetch only the data subsets that the widget needs to display at any time, kind of like we do now in the renderer. That should help with the performance. But again, I don't really know if it is possible.

@tdhock Do you have an idea about the performance issues in case we pass all the data directly to the widget like I suggested?

tdhock commented 7 years ago

I'm not sure ...

  1. There are definitely use cases where we don't want to download all the data at the beginning of the plot rendering (that is why we developed the chunking system)

  2. maybe there is a way to pass just the meta-data (plot.json) to the htmlwidget?

cpsievert commented 7 years ago

> maybe there is a way to pass just the meta-data (plot.json) to the htmlwidget?

@faizan-khan-iit I'd suggest trying this first ^^^

faizan-khan-iit commented 7 years ago

@cpsievert @tdhock Ok. From what I understood, I take the meta-data about the saved files and pass it to the widget. Then try to load the rest of the data (which is saved on the disk) from inside the widget. Did I get this right?
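The idea of passing only meta-data and loading the rest from inside the widget could look roughly like the sketch below. The meta object, its chunks field, and the file names are hypothetical and are not animint's actual plot.json layout; fetchText is an injected function so the same code can run against a real server or a stub.

```javascript
// Hypothetical meta-data, as the R side might pass it to the widget.
var meta = {
  chunks: { "4": "chunk_cyl_4.tsv", "6": "chunk_cyl_6.tsv" }
};

function has(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Parse tab-separated text with a header row into an array of row objects.
// Values are kept as strings.
function parseTsv(text) {
  var lines = text.trim().split(/\r?\n/);
  var header = lines[0].split("\t");
  return lines.slice(1).map(function (line) {
    var cells = line.split("\t");
    var row = {};
    header.forEach(function (name, i) { row[name] = cells[i]; });
    return row;
  });
}

// fetchText(path) must return a Promise of the file's text. In a browser it
// could be: function (p) { return fetch(p).then(function (r) { return r.text(); }); }
function makeChunkLoader(meta, fetchText) {
  var cache = {}; // chunk key -> Promise of parsed rows
  return function getChunk(key) {
    if (!has(meta.chunks, key)) {
      return Promise.reject(new Error("unknown chunk: " + key));
    }
    if (!has(cache, key)) {
      // Only the first request for a chunk hits the network.
      cache[key] = fetchText(meta.chunks[key]).then(parseTsv);
    }
    return cache[key];
  };
}
```

With this shape the widget payload stays small, and a chunk is downloaded only when a selection first needs it, which is the behavior the chunking system is meant to preserve.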

faizan-khan-iit commented 7 years ago

@cpsievert @tdhock Some updates:

  1. I tried passing all the data directly and it works as expected. No problems here.
  2. When I try to pass just the meta data along with the path to the tsv file which contains the data (it is stored in a local directory), it gives me this error. I guess it's because htmlwidgets runs on a local server(?). So the solution should be storing the data somewhere that's accessible to the local server. I will see if there is a way to do that with htmlwidgets.

faizan-khan-iit commented 7 years ago

@cpsievert @tdhock I found a kind of workaround for the above problem. What I do is use the saveWidget function from htmlwidgets to save the plot definition in the same place as the data. Kind of like this:

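# Save the data inside the directory ("data.tsv" is a placeholder file name)
write.table(tsv_data, file.path(path_to_directory, "data.tsv"),
            sep = "\t", row.names = FALSE, quote = FALSE)

# Create the htmlwidget and save it to the same directory
# ("index.html" is a placeholder file name)
widget <- htmlwidgets::createWidget( ... )
htmlwidgets::saveWidget(widget, file.path(path_to_directory, "index.html"))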

So the widget gets saved to the directory where the data was previously saved and it seems to work. However there are some points worth noting:

  1. We can't use the widget without saving because the files are on the local disk and cannot be read by the server. (I am still searching for a way to do that)
  2. I am not sure this sort of approach will work with shiny.
  3. I think we can do most of the stuff we do with animint with this approach without much problem.

@cpsievert Do you see any downsides to this approach? For example, which features from htmlwidgets and related packages will not work if we try this?

tdhock commented 7 years ago

about the local server errors, what browser are you using? chrome blocks local requests unless you give it a special flag at startup. firefox allows local requests. also you may be able to use the servr package in a background process (?) to serve the files on a local file server (even when using chrome). also you may want to file an issue with the htmlwidgets devs -- it shouldn't be this difficult.
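To tell the two situations apart from inside the widget, a small check like the one below can report whether a data request would go over file:// or over http(s)://. This is only a diagnostic sketch: describeChunkRequest is a hypothetical helper, the paths and port are illustrative, and it relies on the standard URL constructor available in browsers and Node.

```javascript
// Resolve a data path against the page URL and report whether the
// resulting request would be a local file:// request.
function describeChunkRequest(pageHref, chunkPath) {
  var url = new URL(chunkPath, pageHref);
  return {
    href: url.href,
    local: url.protocol === "file:"
  };
}

// In a browser this could be called as
//   describeChunkRequest(window.location.href, "data.tsv")
// and the result logged before the request is made.
```

When local comes back true, the request depends on the browser's policy for local files; when the page is served by a local file server, the same relative path resolves to an http:// URL instead.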

cpsievert commented 7 years ago

I have a feeling (2) is gonna lead down a deep rabbit-hole and, without a considerable addition to htmlwidgets (which has been rather idle recently), will require a big hack

Do we have any idea how much effort (1) would take?

faizan-khan-iit commented 7 years ago

@tdhock I tried both chrome and firefox and neither worked when I was using the local server. When I do save the widget like I mentioned in the previous comment, firefox works fine but chrome still throws the error.

I will try uploading the files to the server. Using servr seems like a nice idea.

@cpsievert I think (2) is just what we did before, except we are using htmlwidgets to save the html and js files. There should not be any performance trade-offs and it should work with some basic rewiring of the code, not counting shiny support.

As for (1), I can currently list these points:

  1. As we will pass the whole data, our chunking system in the renderer will be redundant. So that will need to be done away with. That will require significant changes in both the renderer and the compiler.
  2. The renderer might need some other tweaks, but it should not be a coding marathon.
  3. The compiler is where we might have more work. Especially our whole saveLayer, storeLayer workflow.

So (1) will require much more effort than (2) (that is, if we use saveWidget or find a way to serve files in the second approach). Also I think some performance reduction is unavoidable when using (1). To justify this, it should have some major pros:

  1. crosstalk support
  2. Linking views between other R packages (like Carson mentioned)
  3. Shiny support is easy once we get the basics working

faizan-khan-iit commented 7 years ago

I just dropped a mail to @timelyportfolio. Hope he can provide some insight into this.

tdhock commented 7 years ago

Faizan for me (1) -- passing the whole data set to htmlwidgets and then forcing the user to wait until it downloads -- is not even an option. there are data sets that are too big to download all at once (without significant user waiting), so IMO we definitely need to keep the chunking system.

faizan-khan-iit commented 7 years ago

@tdhock In (1), I think we can pass the same data that we saved as tsv files to the widget. We could edit the functions so that we pass a big object containing all the data we will need. So the data will still be in chunks, but all the chunks will be in memory all at once, from the start. Will that still be too costly?

Edit: I misunderstood your point. I think you were saying that we will have to download all data subsets at the start, even if we don't use them. That is certainly true, and will require much more time as compared to the current implementation.

tdhock commented 7 years ago

right. to me the chunking system is a very important feature. removing it is a step in the wrong direction. if we can't get the chunking system to work with htmlwidgets, I say we just don't use htmlwidgets.

faizan-khan-iit commented 7 years ago

In that case I think our best option while using htmlwidgets is to upload the files using servr. I'll see what I can manage here.