JonyEpsilon / gorilla-repl

A rich REPL for Clojure in the notebook style.
http://gorilla-repl.org
MIT License

Export worksheet to HTML #7

Open JonyEpsilon opened 10 years ago

shriphani commented 10 years ago

Hi, I've been hacking on this the past few days. Here are some things I want to run past you:

  1. Are you comfortable with the resulting HTML being exported to another dir with the contents of resources/css, resources/js and resources/jslib copied there?
  2. Remove the file-save form from emitted html - ok or not ok?

JonyEpsilon commented 10 years ago

Hi Shriphani. Cool, thanks for looking at it!

I had in my mind, I think, a different way of doing it than you're proposing (I've had a peek at your fork). I was thinking of writing something that would render static html given the document. This would give nice clean HTML that would be easy to embed etc and wouldn't require access to the (many) dependencies and js files. I don't really like the way, if we use innerHTML to just grab a snapshot of the DOM, that we end up with all of the CodeMirror editor junk etc embedded in the exported HTML.

I hadn't really gotten as far as figuring out how what I've described would work though! What do you think?

shriphani commented 10 years ago

Your approach makes a lot of sense. Check the most recent version of my fork: https://github.com/shriphani/gorilla-repl/tree/master/src/gorilla_repl . It uses an HTML template with all the js removed and the relevant CSS inlined. I would have put that HTML in a different file, but for some reason that file can't be seen even if it is placed in resources/public, so this hack is what I resorted to. Anyway, the resulting HTML looks like the report and none of the js stuff works. Enlive (the parsing library) is currently adding some random stuff to the HTML that it processed, so I'll take a look at that later today.

JonyEpsilon commented 10 years ago

Doesn't this still grab the HTML from the DOM though - meaning it will come with all of the junk that's in there (like codemirror's many divs)? I think we need a separate rendering stage to do this properly. I'm not sure whether that rendering stage should happen in the browser, or on the clojure side.

shriphani commented 10 years ago

Ah, so you suggest a toHTML function (like the toClojure function). Makes sense. Let me see how that would work.

JonyEpsilon commented 10 years ago

Yeah, it could work with a toHTML on the client side to generate clean HTML directly from the model.

But I think there might be something in favour of doing it server-side instead. Here we could load the file, parse it and then emit streamlined HTML. The advantage of this approach is that then it could be called from the command line. I'm thinking a common use case might be wanting to export a bunch of worksheets to the web, so this might be the better way.
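
A minimal sketch of the "generate clean HTML directly from the model" idea, which applies to either the client-side or the server-side variant discussed above. The segment shape (`type`, `content`, `output`) and the function names `escapeHtml`, `segmentToHtml` and `worksheetToHtml` are hypothetical, chosen for illustration; they are not Gorilla's actual model or API.

```javascript
// Hypothetical worksheet model: an array of segments, each either
//   { type: "free", content: "<already-rendered HTML>" } or
//   { type: "code", content: "<source text>", output: "<HTML or empty>" }.
// This shape is an assumption for illustration, not Gorilla's real model.

// Escape text so that source code can be placed safely inside HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render one segment to a small HTML fragment with no editor markup.
function segmentToHtml(segment) {
  if (segment.type === "free") {
    return '<div class="segment free">' + segment.content + "</div>";
  }
  const code = '<pre class="code">' + escapeHtml(segment.content) + "</pre>";
  const output = segment.output
    ? '<div class="output">' + segment.output + "</div>"
    : "";
  return '<div class="segment code">' + code + output + "</div>";
}

// Render the whole worksheet as a standalone HTML document string.
function worksheetToHtml(segments, title) {
  const body = segments.map(segmentToHtml).join("\n");
  return (
    '<!DOCTYPE html>\n<html><head><meta charset="utf-8"><title>' +
    escapeHtml(title) +
    "</title></head>\n<body>\n" +
    body +
    "\n</body></html>"
  );
}
```

Because the HTML is built from the model rather than read back from the DOM, nothing from the editor widgets can leak into the export.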

JonyEpsilon commented 10 years ago

Actually, thinking about this some more we need to consider how this will interact with the new rendering setup (see #60). What doesn't help is that the new rendering setup is still just an idea in my head and not written down anywhere!!!

JonyEpsilon commented 10 years ago

Note to self: an online viewer for gists, like http://bl.ocks.org would be useful too!

shriphani commented 10 years ago

How about the default iPython viewer: http://nbviewer.ipython.org/

On Thu, Mar 20, 2014 at 8:54 AM, Jony Hudson notifications@github.com wrote:

Note to self: an online viewer for gists, like http://bl.ocks.org would be useful too!

JonyEpsilon commented 10 years ago

The iPython viewer is really nice, and I'd see the gorilla viewer looking a lot like that, although I suspect there wouldn't be much of their code we could re-use.

shriphani commented 10 years ago

Hi,

I began paying more attention to this again. I put together a clojure parser (using instaparse) for a worksheet file. I think this thing can now hook up to the newer rendering pipeline. Does this look like something acceptable?

https://github.com/shriphani/gorilla-repl/blob/fc0e0ca3a8fb57465d5d22a74e2129c87970e7ca/src/gorilla_repl/worksheet_reader.clj

JonyEpsilon commented 10 years ago

This looks great!

I've been working on an online viewer - which I should get released today hopefully - but I still think the static HTML export would be useful. I guess the next step is to write a bit of code which generates a minimal HTML document from the parsed worksheet. It will probably need a piece of javascript which triggers the renderer once for each output.

shriphani commented 10 years ago

Yes I would also like to have export functionality available since some of my repos are not online and I am not always comfortable uploading results, data and documentation to an external site.

Also, I don't think js is needed to produce the final document (I will of course link to a js lib for syntax highlighting). The dict that the renderer puts in the worksheet seems to contain a "content" key that points to the HTML I am supposed to put in, so I will just use that.

shriphani commented 10 years ago

Also, can I take a look at the CSS you are using - I would like the exported HTML to achieve some visual consistency with the viewer.

JonyEpsilon commented 10 years ago

I think you probably will need to run the renderer js - the "content" keys in the rendered output are just HTML snippets, and need to be pieced together by the renderer to make the full HTML output (take a look at the output for a list of things and you'll see what I mean). I structured the renderer code so that this should be very easy though.

The CSS is the standard worksheet css in the css/worksheet.css file. The online viewer also has a little bit of extra css for its own UI in css/viewer.css.
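
To illustrate why the "content" snippets need to be pieced together, here is a minimal sketch of a recursive renderer. The data shape (an `html` leaf with a `content` string, and a `list-like` node with `open`, `close`, `separator` and `items`) is an assumption made for illustration, and `renderOutput` is a hypothetical name; the real renderer code may differ.

```javascript
// Assumed shape of rendered output (hypothetical, for illustration only):
//   leaf: { type: "html", content: "<span>1</span>" }
//   list: { type: "list-like", open: "[", close: "]",
//           separator: " ", items: [ ...nested nodes... ] }

// Recursively assemble the full HTML for one output value.
function renderOutput(node) {
  switch (node.type) {
    case "html":
      // A leaf already carries a complete HTML snippet.
      return node.content;
    case "list-like":
      // A collection has no single snippet: its HTML is built from
      // the delimiters plus the rendered items.
      return (
        node.open +
        node.items.map(renderOutput).join(node.separator) +
        node.close
      );
    default:
      throw new Error("Unknown output type: " + node.type);
  }
}
```

Under this assumed shape, a vector has no complete snippet of its own, so an exporter that reads only one top-level "content" value would miss the nested items.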

shriphani commented 10 years ago

Hi,

So, I think I have HTML export working well. I handle the HTML generation from within the parser itself (using hiccup), so no js is needed to render. Using the current version and this worksheet: https://gist.github.com/shriphani/10216947, I get this HTML output: http://shriphani.com/foo3.html . Essentially, (worksheet-reader/read-worksheet ) will give you an HTML string which you can spit to a file.

Do you have any opinions on what keystroke I should bind this to?

Thanks.

JonyEpsilon commented 10 years ago

Nice work, this looks neat! I think you're still going to need to deal with rendering the output though. If you try your code on a worksheet that has more complicated output (even just [1 2 3 4], or a plot) then you'll find it doesn't work properly, as the :content HTML fragment that you're rendering as the output isn't complete in those cases. I think you will need to call the js renderer on the output data structure from within the generated HTML.

Hopefully that makes sense, but if not I can try and put together some example code.

shriphani commented 10 years ago

I see what you mean. I've fixed that: here's the sample worksheet: https://gist.github.com/10284408 and here's the exported html: http://shriphani.com/foo4.html

Right now, a lot of the js needed is just inlined (for the purposes of a standalone html file). If you can put the js etc in a CDN somewhere, I can make a lightweight version as well that just sources the js from there.

Also, do you have any opinion on what keystroke I should bind this to?

JonyEpsilon commented 10 years ago

That looks great :-) I think the only thing missing is LaTeX support. It should be really easy to add, though. Should just be the bit of code following:

https://github.com/JonyEpsilon/gorilla-repl/blob/develop/resources/public/js/mathJaxViewer.js#L21

and then a single call to typeset everything at once.

I'm not sure about the CDN, as it means thinking carefully about versioning (we'd need to host all versions of the renderer to make sure that old worksheets still display). I'm not sure this is something that I really want to maintain! I propose just going for the standalone solution at the minute. We could always minify the js if need be, although the renderer is only a few lines anyway.

I've no strong preference on keystroke really. I do wonder whether it would be good to have it as a command line thing as well though, so we can write scripts which export a set of worksheets?

shriphani commented 10 years ago

Ok, I've got latex working and I've added an option to export the worksheet: here's the latex output: http://shriphani.com/foo5.html

Also, I had to move markdown rendering to client-side. Let me know if anything else is missing; otherwise I can send you a pull request.

JonyEpsilon commented 10 years ago

This is looking really good. Probably the best thing is to send a PR through and I'll merge it in and do a bit of testing. Thanks for all of your work on this, it's a great feature :-)

dl1ely commented 8 years ago

Sorry, but what is the status of this?

rcarmo commented 7 years ago

I like this a lot, but I was wondering... Since Jupyter notebooks are essentially JSON files with base64-encoded images, why not export to .ipynb and take advantage of the built-in rendering abilities of GitHub and many other sites?
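
A rough sketch of what an .ipynb export could look like, assuming a hypothetical segment model (`type`, `markdown`, `code`, and an optional `pngBase64` image). The top-level keys and cell fields follow the Jupyter notebook format version 4; the function name `toNotebook` and the segment shape are illustrative assumptions only.

```javascript
// Hypothetical input: an array of segments, each either
//   { type: "free", markdown: "some *markdown* text" } or
//   { type: "code", code: "(+ 1 2)", pngBase64: "<optional base64 PNG>" }.
// This shape is an assumption for illustration, not Gorilla's real model.

// Build a notebook object laid out as in Jupyter's nbformat version 4.
function toNotebook(segments) {
  const cells = segments.map((segment) => {
    if (segment.type === "free") {
      return { cell_type: "markdown", metadata: {}, source: segment.markdown };
    }
    // Images are stored as base64 strings keyed by MIME type.
    const outputs = segment.pngBase64
      ? [
          {
            output_type: "display_data",
            data: { "image/png": segment.pngBase64 },
            metadata: {}
          }
        ]
      : [];
    return {
      cell_type: "code",
      execution_count: null,
      metadata: {},
      outputs: outputs,
      source: segment.code
    };
  });
  return { cells: cells, metadata: {}, nbformat: 4, nbformat_minor: 4 };
}

// Serialise to the text that would be written to a .ipynb file.
function toIpynbString(segments) {
  return JSON.stringify(toNotebook(segments), null, 1);
}
```

One open question with this route is that outputs drawn by Gorilla's js renderer would first have to be turned into something a notebook viewer understands, such as an image or plain HTML.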