WebReflection / basicHTML

A NodeJS based, standard oriented, HTML implementation.
ISC License

Serendipitous benchmark #2

Closed ghost closed 7 years ago

ghost commented 7 years ago

I happened to start looking for a leaner DOM implementation for Node just around the time of your first commit. Here's the benchmark I put together.

WebReflection commented 7 years ago

Awesome! However, while it's encouraging to know basicHTML wins 2 out of 4 and is competitive in the remaining 2, I haven't even thought about performance yet, and forEach is used all over the place instead of the usually faster for loops.
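To make the forEach versus for loop point concrete, here is a minimal timing sketch that uses only Node built-ins. The workload, the `items` array, and the helper names are all hypothetical and not taken from basicHTML; the measured gap depends heavily on the engine version and may be small on recent V8 releases.

```javascript
// Minimal timing sketch using only Node built-ins (no Benchmark.js).
const { performance } = require("perf_hooks");

// Hypothetical workload: sum the lengths of many short strings.
const items = Array.from({ length: 100000 }, (_, i) => "node" + i);

function sumWithForEach(list) {
  let total = 0;
  list.forEach((item) => { total += item.length; });
  return total;
}

function sumWithFor(list) {
  let total = 0;
  for (let i = 0; i < list.length; i++) total += list[i].length;
  return total;
}

// Run fn `runs` times over `items` and return the elapsed milliseconds.
function time(fn, runs) {
  const start = performance.now();
  for (let i = 0; i < runs; i++) fn(items);
  return performance.now() - start;
}

// Warm up both versions so the JIT has a chance to optimize them first.
time(sumWithForEach, 20);
time(sumWithFor, 20);

const forEachMs = time(sumWithForEach, 100);
const forMs = time(sumWithFor, 100);
console.log(`forEach: ${forEachMs.toFixed(1)} ms, for: ${forMs.toFixed(1)} ms`);
```

Both functions compute the same total, so any difference in the printed timings comes from the iteration style rather than from the work being done.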

I have different goals right now so I'd say it's a bit early to benchmark here :wink:

Would you like me to link this benchmark in the README too?

ghost commented 7 years ago

As you wish really — it'll help me understand the tradeoffs in choosing a DOM-like API on the server, so if it can be any help to others I don't see why not. 😄

(just rewrote it — should be more precise; nice start for basicHTML!)

WebReflection commented 7 years ago

I've discovered today that NativeScript is not compatible with parse5, which is a dependency of this module.

This is a pity, 'cause parse5 seems to be fast enough and very battle-tested. But if NativeScript won't solve this issue (in the past they changed their Angular implementation instead of fixing it), I might need to change parser as well, hence my questions.

Thanks for any eventual help, I'm asking you 'cause you're already exploring these projects. I'll also have a look at their parsers, hoping these are not all based on parse5.

Best Regards

ghost commented 7 years ago

I already had in mind to suggest switching to htmlparser2 or a variant thereof (although parse5 does seem to be the best one around). The htmlparser2 tree adapter is a parse5 built-in!

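```js
// parse5 ships an htmlparser2-compatible tree adapter
const {parse, treeAdapters: {htmlparser2: treeAdapter}} = require("parse5")
const html = '<!DOCTYPE html><html><head></head><body>Hi there!</body></html>'
// the resulting document is built from htmlparser2-style nodes
const document = parse(html, {treeAdapter})
```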

While trying to get parse5 to work with undom, I discovered that it relies on tree adapters and allows custom ones. I ended up patching the default tree adapter to make it trivially support undom. Judging by the existing adapter implementations, even if htmlparser2 doesn't work with NativeScript, implementing a purpose-built tree adapter shouldn't be too much work.
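The tree-adapter idea can be illustrated without parse5 itself: the parser never creates nodes directly, it only calls a few construction hooks, and the adapter decides what objects come out. The sketch below is not parse5's adapter interface, which is considerably larger. `parseTiny`, `objectAdapter`, and `arrayAdapter` are hypothetical names, and the toy parser only handles well-formed `<tag>…</tag>` pairs and text, with no attributes.

```javascript
// Hypothetical mini parser: it delegates every node operation to `adapter`.
function parseTiny(source, adapter) {
  const root = adapter.createFragment();
  const stack = [root];
  // Alternatives: closing tag, opening tag, or a run of text.
  const tokens = /<\/([a-z][a-z0-9-]*)>|<([a-z][a-z0-9-]*)>|([^<]+)/gi;
  let match;
  while ((match = tokens.exec(source)) !== null) {
    const [, closeName, openName, text] = match;
    const parent = stack[stack.length - 1];
    if (openName) {
      const element = adapter.createElement(openName);
      adapter.appendChild(parent, element);
      stack.push(element);
    } else if (closeName) {
      if (stack.length > 1) stack.pop(); // never pop the root
    } else {
      adapter.appendChild(parent, adapter.createText(text));
    }
  }
  return root;
}

// Adapter A: a plain object tree.
const objectAdapter = {
  createFragment: () => ({ type: "fragment", children: [] }),
  createElement: (name) => ({ type: "element", name, children: [] }),
  createText: (data) => ({ type: "text", data }),
  appendChild: (parent, child) => { parent.children.push(child); },
};

// Adapter B: nested arrays of the form [name, ...children], text as strings.
const arrayAdapter = {
  createFragment: () => ["#fragment"],
  createElement: (name) => [name],
  createText: (data) => data,
  appendChild: (parent, child) => { parent.push(child); },
};

const html = "<body><p>Hi there!</p></body>";
const asObjects = parseTiny(html, objectAdapter);
const asArrays = parseTiny(html, arrayAdapter);
console.log(JSON.stringify(asArrays));
// prints ["#fragment",["body",["p","Hi there!"]]]
```

The same parser produces two different tree shapes purely by swapping the adapter, which is the property that makes a purpose-built adapter for a custom DOM a contained piece of work.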

WebReflection commented 7 years ago

uhm ... I actually just need something compatible with XML too ... a simple content with eventually capability ... I think I'll end up writing a stupid one myself or NativeScript, which was the initial reason I even created this repository, won't be happy :cry:

thanks for the hint though, I'll have a better look soon

ghost commented 7 years ago

htmlparser2 is indeed XML compatible. Haven't used it directly for that purpose, but you could try to look into using it without parse5. You're welcome of course!

WebReflection commented 7 years ago

I've just published v0.4.0 which drops parse5 in favor of htmlparser2.

I'd like to know if you could counter-verify the benchmark and tell me if this was an improvement, a regression, or an "it's OK anyway" change.

NativeScript seems to be fine with htmlparser2 though, so I hope it won't be a regression

ghost commented 7 years ago

https://github.com/flagello/node-dom-benchmarks/commit/f55fd1501596d67b9140a6a71d5397ec4646522e

Better in parse-cached and serialize-cached — others slightly better or more or less the same. Should be great given NativeScript support!

WebReflection commented 7 years ago

I'm not sure I understand that benchmark .... but thanks!

ghost commented 7 years ago

I'm all ears!

The tests are pretty barebones for the time being, and I've refactored them twice with Benchmark.js because Benchmark.Suite does not have setup and teardown, which I thought would lead to a more accurate and modular approach.
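As a rough illustration of why per-case setup and teardown help, here is a minimal harness sketch using only Node built-ins. It is not the Benchmark.js API and not the code from the linked benchmark repository; `runCase`, the case shape, and the stand-in "serialize" workload are all hypothetical. The point is that only the calls to `run` are timed, while preparing and cleaning up the fixture stays outside the measurement.

```javascript
const { performance } = require("perf_hooks");

// Hypothetical harness: each case brings its own setup and teardown,
// and only the calls to `run` fall inside the timed region.
function runCase({ name, setup, run, teardown }, iterations = 1000) {
  const context = setup ? setup() : undefined;
  try {
    const start = performance.now();
    for (let i = 0; i < iterations; i++) run(context);
    const elapsedMs = performance.now() - start;
    return { name, iterations, elapsedMs };
  } finally {
    // Teardown runs even if `run` throws, so fixtures never leak between cases.
    if (teardown) teardown(context);
  }
}

// Records the order in which the hooks fire, for demonstration only.
const log = [];

const cases = [
  {
    name: "serialize (stand-in, not a real DOM)",
    setup: () => { log.push("setup"); return { tag: "p", text: "Hi there!" }; },
    run: (node) => `<${node.tag}>${node.text}</${node.tag}>`,
    teardown: () => { log.push("teardown"); },
  },
];

const results = cases.map((benchCase) => runCase(benchCase, 5000));
console.log(results, log);
```

Keeping the fixture in `setup` also makes it easy to add one case per DOM library without the cases sharing state.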

I'd be very interested in improvements and suggestions. endom and html-element do not use a parser for the innerHTML setter (parse and parse-cached are skewed), and I'm considering extending endom with one, if it makes sense for the specific use-case I'm exploring.

WebReflection commented 7 years ago

> endom and html-element do not use a parser for the innerHTML setter

yeah, I've checked them both and html-element looked like some sort of a joke compared to the others, especially because of these comments: https://github.com/1N50MN14/html-element/blob/master/index.js#L215-L222

> ... parsing is hard and will need added deps!

I'd like to PR a "you don't say" there :laughing:

Anyway, it looks like htmlparser2 is an even better choice than parse5 for what I need here, so thanks for the hint :+1:

ghost commented 7 years ago

Haha, I guess he was in a rush. 😛

You're welcome! 👍