Open jayaddison opened 2 weeks ago
... Describe alternatives you've considered ...
- The
JSON.rawJSON
function was also considered; this resolves the concerns withJSON.parse
. However: it appears not to be widely-deployed yet (it is not available in Firefox or Safari, for example. It became available in Chrome 114 and Edge 114, both released ~Jun 2023).
Self-correction: JSON.rawJSON
appears to be limited to primitive types; it cannot construct arrays or objects (in fact, this is highlighted by some bold text in the introduction on that page, too. I missed both of those previously).
... Describe alternatives you've considered ...
- The
JSON.rawJSON
function was also considered; this resolves the concerns withJSON.parse
. However: it appears not to be widely-deployed yet (it is not available in Firefox or Safari, for example. It became available in Chrome 114 and Edge 114, both released ~Jun 2023).Self-correction:
JSON.rawJSON
appears to be limited to primitive types; it cannot construct arrays or objects (in fact, this is highlighted by some bold text in the introduction on that page, too. I missed both of those previously).
There is a tc39
(ECMAscript technical committee) proposal for a JSON.parseImmutable(...)
function that appears relevant, though: https://github.com/tc39/proposal-json-parseimmutable
Remove prototypes from objects (with the exception of
Array
objects).
I'm having second thoughts about this exception case. Perhaps we should in fact remove the prototype from Array
objects too (this would require some other small changes). I'll wait to gather feedback about that.
There is a
tc39
(ECMAscript technical committee) proposal for aJSON.parseImmutable(...)
function that appears relevant, though: https://github.com/tc39/proposal-json-parseimmutable
A challenge with this approach would be that a record-and-tuple based representation would no longer be a subset of JSON, so we'd require a parser to adapt our loading of a searchindex from file. I've raised a somewhat-related question/proposal at tc39/proposal-record-tuple#391 (tl;dr - would a single #
character be enough to annote a many-object datastructure as immutable? if so, we could simply adjust our PREFIX
value to Search.setIndex(#
).
FWIW I'm skeptical about this. As you noted, any other JavaScript loaded on the page could still mess with the index (including overwriting it altogether with whatever), no matter how we "freeze" it. I don't really see the use case here?
Reducing the likelihood with which the search index content could be modified or provide access to other parts of the application, either accidentally or maliciously, is essentially what I'm requesting.
User input here necessarily follows code paths that access the datastructure -- and I believe that's safe as-currently-implemented -- but I'd have greater confidence in that if I knew (again, to a reasonable degree; I understand it's not guaranteed) that the search index contents were immutable and absent of prototype references.
Is your feature request related to a problem? Please describe. When Sphinx projects are built as HTML, they generally include a search index file loaded by the included JavaScript search functionality.
Search is by-nature a feature that accepts user-controlled input. I have been considering this recently during #13096, and also as part of a related (not-yet-published) security report I've filed that fortunately I've subsequently concluded contains no actual vulnerability.
I think it could be worthwhile to make the contents of the HTML search index immutable and to remove references to the JavaScript
Object
prototype from them, with the objective of reducing the possibility that user-controlled input could affect the index and therefore the integrity/quality of results.Describe the solution you'd like
searchindex.js
file:Array
objects (?)).Describe alternatives you've considered
searchindex.js
file from JavaScript script to a JSON file.JSON.parse
emits objects that include JavaScript object prototype references and that are mutable (non-frozen), so in isolation this appears not to be a complete solutionJSON.rawJSON
function was also considered; this resolves the concerns withJSON.parse
. However: it appears not to be widely-deployed yet (it is not available in Firefox or Safari, for example. It became available in Chrome 114 and Edge 114, both released ~Jun 2023).searchindex.js
file itself.Additional context I plan to offer a pull request to implement this feature alongside this feature request.
Also, I'm aware that other JavaScript on the page could potentially bypass a mechanism like this; and contrastingly, perhaps there are valid use-cases for modifying the search index at runtime. Despite those, I feel that it could be worth removing the prototypes and making the index read-only, but this could be debatable.
Edit: add a question-mark asking whether allowing
Array
prototype references to be retained is OK or not.