sphinx-doc / sphinx

The Sphinx documentation generator
https://www.sphinx-doc.org/

HTML search: sanitise 'searchindex.js' contents #13099

Closed jayaddison closed 2 weeks ago

jayaddison commented 2 weeks ago

Feature or Bugfix

Purpose

Detail

Relates

jayaddison commented 2 weeks ago

NB / disclaimer: 015ecb901be21379c659301d3576d7c95146dd6d breaks the same-searchindex-format contract, so this changeset becomes incompatible with existing searchindex.js files from that point.

jayaddison commented 2 weeks ago

I have to be honest: I'm not sure that I can achieve this realistically within reasonable searchindex.js size, page load performance, and code complexity bounds.

I'd be reluctant to leave an attempt at what I consider safety improvements unfinished, so I will probably work on this more during the week, but I have doubts about whether it can be completed in a satisfying way.

jayaddison commented 2 weeks ago

The two most recent performance traces I've run, using Firefox and Chromium, put the traced duration for setIndex at ~76ms and ~72ms respectively. That's for the Sphinx self-built documentation, but with a recent copy of the Python documentation searchindex.js file artificially substituted in to provide a benchmark workload.
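
For readers who want a rough number without a full browser trace, a single call can be timed with `performance.now()`. This is only a minimal sketch: `timeCall` is a hypothetical helper, and `JSON.parse` on a tiny payload stands in for the real setIndex call and the Python documentation index, neither of which is reproduced here.

```javascript
// Hypothetical timing wrapper; `fn` stands in for a call such as setIndex.
function timeCall(fn, ...args) {
  const start = performance.now();
  const result = fn(...args);
  const elapsedMs = performance.now() - start;
  return { result, elapsedMs };
}

// Stand-in workload (an assumption): parsing a very small JSON payload.
const { result, elapsedMs } = timeCall(JSON.parse, '{"docnames": ["index"]}');
console.log(`parsed ${result.docnames.length} docname(s) in ${elapsedMs.toFixed(3)} ms`);
```

A single wall-clock sample like this is noisier than a profiler trace, so it is best treated as a sanity check rather than a benchmark.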

I don't have any further changes planned on this branch currently.

jayaddison commented 2 weeks ago

Although the performance overhead introduced by these changes may not perceptibly affect how quickly search results appear for Sphinx HTML project builds, I'm going to close this because I think that overhead would, in aggregate, add a nontrivial resource cost. If the change eliminated a category of security problems, then perhaps that would be worthwhile -- but in this case, we don't have a known problem to resolve.

I will hold the related issue #13098 open, because I think in future there may be methods (most likely Records and Tuples) to achieve search index immutability with minimal runtime performance cost (or perhaps even benefits, if resulting compiler/bytecode optimizations become possible thanks to the immutability).
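
Until something like Records and Tuples is available, one way to approximate an immutable index in current JavaScript is a recursive `Object.freeze`. The sketch below is not the code from this branch: `deepFreeze` is a hypothetical helper, the miniature index is made up, and the approach assumes acyclic, JSON-like data. The per-object traversal is also the kind of load-time work that contributes to the overhead discussed above.

```javascript
// Hypothetical helper: recursively freeze JSON-like data so that later code
// cannot add, remove, or reassign properties. Assumes no cyclic references.
function deepFreeze(value) {
  if (value === null || typeof value !== "object") {
    return value; // primitives are already immutable
  }
  for (const key of Object.keys(value)) {
    deepFreeze(value[key]);
  }
  return Object.freeze(value);
}

// Made-up miniature index; a real searchindex.js carries more fields.
const index = deepFreeze({
  docnames: ["index", "usage"],
  terms: { sphinx: [0, 1], search: 1 },
});

try {
  index.docnames.push("injected"); // push on a frozen array throws TypeError
} catch (err) {
  console.log(err instanceof TypeError);
}
```

Note that plain assignments to a frozen object fail silently in sloppy mode and only throw in strict mode, so freezing blocks mutation but does not always report the attempt.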

jayaddison commented 2 weeks ago

(If for some reason the code from this branch does prove useful in future, feel free to continue on from it. In particular, if doing so, I would also recommend considering removing the prototype from Array objects.)
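
As an illustration of the Array prototype suggestion, and not code taken from this branch, the sketch below detaches an array from `Array.prototype` with `Object.setPrototypeOf`. The name `stripArrayPrototype` and the sample filenames are assumptions made for the example.

```javascript
// Hypothetical helper: detach an array from Array.prototype so that property
// lookups cannot reach inherited (and possibly tampered-with) members.
function stripArrayPrototype(arr) {
  return Object.setPrototypeOf(arr, null);
}

const filenames = stripArrayPrototype(["index.rst", "usage.rst"]);

// Own properties keep working: indexing, length, and Array.isArray.
console.log(filenames[0], filenames.length, Array.isArray(filenames));

// Inherited methods are gone, so they must be borrowed explicitly.
console.log(typeof filenames.map); // "undefined"
const upper = Array.prototype.map.call(filenames, (name) => name.toUpperCase());
console.log(upper);
```

The trade-off is that `for...of` and method calls such as `filenames.map(...)` stop working on the stripped array, so consuming code would need to use index loops or borrowed methods. If the array is also going to be frozen, the prototype has to be changed first, because the prototype of a non-extensible object cannot be replaced.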