npm / marky-markdown

npm's markdown parser
https://www.npmjs.com/package/@npmcorp/marky-markdown
405 stars 71 forks source link

Browser side syntax highlighting #210

Open jorilallo opened 8 years ago

jorilallo commented 8 years ago

Per @ashleygwilliams's tweet, adding a request for this. Based on 8.0 release notes, this might not be possible with current highlighter. Maybe consider having a plugin system that one could potentially use highlight.js instead?

Full disclaimer: I haven't used marky-markdown yet but your feature maps pretty well for my needs and I would love to drop marked but personally I need both server and client side rendering with syntax highlighting.

zeke commented 8 years ago

Back in late 2014 when the new npmjs.com was just coming together, we started with browser-side syntax highlighting usinghightlight.js. We ended up abandoning it because the performance was not great, and when we tested it against the 80,000 or so package readmes at the time, many of the readmes would send highlight.js into an infinite loop and crash the browser!

So we decided to the apply the syntax highlighting classes on the server instead. We started with pygments, but @bcoe was really keen to have an all-JS solution to the problem so we set about the task of getting atom's hightlights to work.

To answer your question, I don't think marky-markdown will easily work in a browser environment any time soon. It has several dependencies that will not work in a browser, namely the compiled oniguruma regex parser. But nothing is impossible..

ashleygwilliams commented 8 years ago

hey @zeke we actually just shipped marky working in the browser. we did this my eliminating the highlights package-- https://github.com/npm/marky-markdown/pull/203, not sure if you saw yet or not.

zeke commented 8 years ago

Will wonders never cease. That's so awesome!

billiegoose commented 8 years ago

This should be doable. The relevant portions are all in render.js AFAIK. The centerpiece of the file is here:

  var parser = MD(mdOptions)
    .use(lazyHeaders)
    .use(emoji, {shortcuts: {}})
    .use(codeWrap)
    .use(expandTabs, {tabWidth: 4})
    .use(githubHeadings, options)
  return parser.render(html)
}

The highlighting function isn't even directly part of it; it specified as an option in mdOptions.highlight which is passed to markdown-it. Currently its hardcoded to highlighter.highlightSync but that could be refactored out. However, that's a bigger change than I'm comfortable making, given the code dynamically loads grammar files and garbage collects them automatically, and creates closures from the highlighter based on language names - all of which makes me think it's been tuned for performance to handle specific conditions at npmjs. I'm afraid I'd inadvertantly cause a performance regression.

billiegoose commented 8 years ago

I know I just said "this should be doable" but I got to thinking just now; from a browser-based point of view, it might not be desirable to do this. Doing the syntax highlighting during the markdown-rendering phase necessarily means delaying rendering the resulting HTML in the DOM. Most of the time this might not matter, but for a large file with lots of code snippets, I'd much rather have it rendered progressively: first with unhighlighted code, so I can at least start reading it, then highlighting the each code blocks. Atom (the text editor) does something like this: if I load a huge file there is a noticeable delay between when the text first appears and when it is highlighted. And I think that is preferable to having to wait for all of the highlighting before getting the result.

Having a plugin system for highlighting libraries is just going to result in tons of wrapper modules around libraries that probably already work fine with HTML input. @jorilallo mentioned highlight.js - it is already designed to run in the DOM on <pre><code> tags. source

revin commented 8 years ago

However, that's a bigger change than I'm comfortable making, given the code ... - all of which makes me think it's been tuned for performance to handle specific conditions at npmjs. I'm afraid I'd inadvertantly cause a performance regression.

Judging by, e.g., this comment, it looks like the grammars are memory intensive on the server side.

@wmhilton what about doing the highlighting, but also allowing for the languages to be specified at runtime?

In the end, as you say, it could be better to simply recommend running highlight.js on the DOM after the page is rendered. I'd be perfectly happy with that.

revin commented 8 years ago

@wmhilton btw, thanks for all the great input recently, it's fantastic to have the help! ✨🐬✨