Closed: otolabqu closed this issue 5 years ago
Thanks for the test case.
I'm unaware of any URL parser that doesn't treat this as an HTTPS URL with a malformed port.
Have you seen a different behavior?
This does not seem like a risk at all. Only javascript: schemes execute.
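As a rough illustration of the point about schemes, the sketch below asks the JDK's java.net.URI for the scheme of the two hrefs from this report. Note the assumptions: java.net.URI follows RFC 2396 rather than the WHATWG URL spec that browsers implement, and the class and method names (UrlSchemeCheck, schemeOf) are made up for this example, so treat it as an approximation of browser behavior, not a statement of how the sanitizer works internally.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of the sanitizer.
final class UrlSchemeCheck {

    /** Returns the lowercased scheme of url, or null if it has none or does not parse. */
    static String schemeOf(String url) {
        try {
            String scheme = new URI(url).getScheme();
            return scheme == null ? null : scheme.toLowerCase(Locale.ROOT);
        } catch (URISyntaxException e) {
            // java.net.URI rejected the input; report "no scheme".
            return null;
        }
    }

    public static void main(String[] args) {
        // The scheme ends at the first ':'; the later "javascript:" text
        // sits in the authority part of the URL, not in the scheme.
        System.out.println(schemeOf("https://javascript:void(0)"));
        // Here "javascript" really is the scheme.
        System.out.println(schemeOf("javascript:void(0)"));
    }
}
```

Only the second input reports javascript as its scheme, which matches the sanitizer keeping the first href under a policy that allows https and dropping the second.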
While sanitizing HTML, I noticed some javascript-like code coming through when it is part of an href attribute prefixed with https://, such as <a href='https://javascript:void(0)' target='_new' >click</a>
Not sure if this is a risk. Chrome does not execute it as JS when clicked on. Still, I thought this would be removed by the sanitizer.
If we remove the https:// part, the sanitizer removes it, as in <a href='javascript:void(0)' target='_new' >click</a>
A Java test to make this happen follows.
    import org.junit.Test;
    import org.owasp.html.HtmlPolicyBuilder;
    import org.owasp.html.PolicyFactory;
    import org.owasp.html.Sanitizers;

    public class SanitizerUnitTest {

        @Test
        public void sanitizeJavascriptHref() {
            String linkWithJs = "<a href='https://javascript:void(0)' target='_new' >click ";
            String sanitized = Sanitizers.LINKS.sanitize(linkWithJs);
            System.out.println(sanitized);
        }

        @Test
        public void sanitizeJavascriptHref2() {
            PolicyFactory policy = new HtmlPolicyBuilder()
                    .allowElements("a")
                    .allowUrlProtocols("https")
                    .allowAttributes("href").onElements("a")
                    .requireRelNofollowOnLinks()
                    .toFactory();
            String linkWithJs = "<a href=\"https://javascript:void%280%29\" rel=\"nofollow\">click";
            String safeHTML = policy.sanitize(linkWithJs);
            System.out.println(safeHTML);
        }
    }
The printout of this test is
|click <a href="https://javascript:void%280%29" rel="nofollow">click |
-- Jim Manico Manicode Security https://www.manicode.com
Thanks for replying. No, I haven't seen a different behavior.
Closing as non-actionable.
Per https://url.spec.whatwg.org/#concept-basic-url-parser there are two cases where an output's scheme can be "javascript": the input itself specifies the scheme javascript, or the base URL's scheme is javascript and the input specifies no scheme.
Re the first case, by inspection of the spec, the state machine never transitions back to either scheme start state or scheme state once it has left those states, so this only happens when the input consists of zero or more ASCII whitespace characters followed by "javascript:", matched case-insensitively.
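The first case can be sketched as a small scanner. This is a simplified reading of the scheme start state and scheme state in the WHATWG URL spec, not the spec algorithm itself and not the sanitizer's code; the names SchemeScanner and scanScheme are invented for the example, and base URL handling (the second case) is deliberately left out.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical, simplified sketch of the WHATWG "scheme start" and "scheme" states.
final class SchemeScanner {

    /** Returns the lowercased scheme if the input itself specifies one, else empty. */
    static Optional<String> scanScheme(String input) {
        // trim() drops leading/trailing chars <= U+0020 (C0 controls and space);
        // tabs and newlines are then removed from anywhere in the input.
        String s = input.trim().replaceAll("[\\t\\n\\r]", "");
        StringBuilder buffer = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean alpha = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
            boolean digit = c >= '0' && c <= '9';
            if (buffer.length() == 0) {
                // Scheme start state: the first character must be an ASCII letter.
                if (!alpha) {
                    return Optional.empty();
                }
                buffer.append(c);
            } else if (alpha || digit || c == '+' || c == '-' || c == '.') {
                // Scheme state: keep accumulating scheme characters.
                buffer.append(c);
            } else if (c == ':') {
                // The first ':' ends the scheme; scanning never re-enters these states.
                return Optional.of(buffer.toString().toLowerCase(Locale.ROOT));
            } else {
                return Optional.empty();
            }
        }
        return Optional.empty();
    }
}
```

Because the scan returns at the first ':', scanScheme("https://javascript:void(0)") yields https, and only an input that starts with "javascript:" after optional whitespace yields javascript.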
In the second case, I believe this can only happen when a document does something odd like <base href="javascript:...">, and possibly not even then. This library filters out "javascript:" URLs even if a policy is foolish enough to allow <base href>.
When a document is created as a result of a javascript: URL, browsers reuse the origin from the document that loaded it, so its default base URL does not have scheme javascript.
https://html.spec.whatwg.org/multipage/origin.html lists this case:
"The Document was created as part of the processing for javascript: URLs": "The origin of the active document of the browsing context being navigated when the navigate algorithm was invoked."
Embedders would be wise not to do <base href="javascript:...">.
While sanitizing HTML, I noticed some javascript-like code coming through when it is part of an href attribute prefixed with https://, such as <a href='https://javascript:void(0)' target='_new' >click</a>
Not sure if this is a risk. Chrome does not execute it as JS when clicked on. Still, I thought this would be removed by the sanitizer.
If we remove the https:// part, the sanitizer removes it, as in
<a href='javascript:void(0)' target='_new' >click</a>