servo / html5ever

High-performance browser-grade HTML5 parser
Other
2.1k stars 215 forks source link

Remove redundant FIXME comment #508

Closed organic1337 closed 1 year ago

organic1337 commented 1 year ago

Background

While searching for potential points in the code to which I could contribute I came across this FIXME comment which seemed like an easy-win:

        // FIXME: linear time search, do we care?

(line 511 in html5ever/src/tokenizer/mod.rs)

Fix attempt

I Tries using a HashSet for the attributes names so the search's complexity would be O(1) instead of O(n), as can be seen in this commit

and once I ran some benchmarks on multiple websites (by modifying the noop-tokenizer example) I realized I only made things worst, the Parsing process took a tiny bit longer than before. This is probably because of the overhead within maintaining a HashSet of the attributes names, and because most elements do not have enough attributes for the HashSet to improve performance.

Conclusion

I think it's a good Idea to remove this comment so no one else will try to optimize this section for no reason.

jdm commented 1 year ago

Thanks for the analysis!