Open dontcallmedom opened 5 years ago
I have hopefully adjusted the code to the coding style.
I may be missing something for the replace
function - while it would work fine to generate the mark
elements, I don't see how to use it succinctly to escape the HTML characters that would be outside of the matched parts of the string.
I have hopefully adjusted the code to the coding style.
Thanks!
I may be missing something for the
replace
function - while it would work fine to generate themark
elements, I don't see how to use it succinctly to escape the HTML characters that would be outside of the matched parts of the string.
That made me realize that as currently written, this is backwards incompatible: HTML in options works right now, whereas if this is merged, it would stop working and possibly break numerous use cases. Ideally we want something that keeps HTML that was there from the beginning, but doesn't convert special characters to HTML when highlighting them. I.e. something that's XSS-safe, but doesn't restrict what you can do in the initial options. Does that make sense?
re backwards-compatibility, I don't think it can be preserved:
label
- when rendered in the list options, it is rendered as HTML, but when rendered in the <input>
field, it is rendered as plain text (since it goes through the value
attribute).
<mark>
- the XSS risks exists even when no highlighting happens - if the list of label options contains a malicious <script>
(e.g. coming from a third party ajax service, or from an http query parameters), that script will be loaded as soon as the list of labels is displayedI guess we could make the new behavior opt-in - but given the XSS risks and the existing ambiguity, it seems to me making it the default would be better. That being said, I could imagine making it possible to opt-out of the new behavior (e.g. through a new option) - this would imply adapting the way the labels gets displayed in the <input>
to get only the textContent
equivalent.
right now, the API is ambiguous about the type of content in
label
- when rendered in the list options, it is rendered as HTML, but when rendered in the<input>
field, it is rendered as plain text (since it goes through thevalue
attribute).
I believe that's fine, e.g. many use cases are about displaying graphics next to the suggestions.
it's not really a matter of highlighting the content of the label with
<mark>
- the XSS risks exist even when no highlighting happens - if the list of label options contains a malicious<script>
(e.g. coming from a third party ajax service, or from an http query parameters), that script will be loaded as soon as the list of labels is displayed
Good point, that should be prevented (though I'm still not sure if it should be the default at this point). However, if there's HTML in the initial options, included in the page's markup, that should be preserved, no?
I noticed that the current behavior is definitely buggy: If you include <b>
in data-list
, it will be rendered (correct). If you include <b>
, it will still be rendered! That's definitely a bug which needs to be fixed in both cases (regardless of someone opts in to allowing HTML or not).
I guess we could make the new behavior opt-in - but given the XSS risks and the existing ambiguity, it seems to me making it the default would be better. That being said, I could imagine making it possible to opt-out of the new behavior (e.g. through a new option) - this would imply adapting the way the labels gets displayed in the
<input>
to get only thetextContent
equivalent.
I wonder if it would be an easier transition to make it opt-in in this version, and announce that it's going to be default behavior in the next one, and people should use the option now if they need it.
I have looked a bit more into this, and I think there are the following bugs in the way HTML values or labels are handled:
<b>
gets rendered as a <b>
tag)<strong>
will be matched when searching str
); when matched, the markup gets messed up. Conversely, if the list of suggestions includes in-word markup, the word won't be matched (e.g. <strong>J</strong>avascript
, then typing java
won't list Javascript as a suggestion)Here is how I think HTML in the list of suggestions could be properly supported with reduced XSS risks:
data-list
)list: [ node1, node2]
) - the assumption being that it would then be the responsibility of the developer to produce these fragments from strings if that's the behavior they want (e.g. with DOMParser
or innerHTML
) - that way, developers that are aware of the risk of innerHTML
will have greater awareness of possible XSS risks if the strings are unsafeThis would imply adopting the code to handle differently text and DOM fragments, and converting DOM fragments to text via textContent
when setting input.value
and when matching user input. I think (but am not sure) the highlight of matching characters in DOM fragments could still be achieved with the Range
API.
What do you think? (I'm leaving aside the backwards-compatibility question for now - I feel it will be easier to consider it once the end goal is clearer)
fix #16932
Generate manual DOM subtree for each marked instance of the queried string instead of generating an unsafe HTML string