Closed GoogleCodeExporter closed 9 years ago
It would also be best to preserve whitespace as well, since whitespace can
affect
html layouts (we are allowing "pre" formatted text).
Original comment by dob...@gmail.com
on 17 Apr 2009 at 4:53
Here's a patch to allow preserving whitespace and comments, using directives in
the
policy.xml:
{{{
<directive name="formatOutput" value="false"/>
<directive name="omitComments" value="false"/>
<directive name="preserveSpace" value="true"/>
}}}
The patch applies to the antisamy-1.3 svn folder.
Original comment by dob...@gmail.com
on 20 Apr 2009 at 7:25
Attachments:
The only problem with allow comments is that it is possible to execute code
within
them. See http://ha.ckers.org/xss.html.
Yes, the whitespace is my most personally annoying issue. I can't wait to look
at
your patch to see if we can integrate!
Original comment by arshan.d...@gmail.com
on 11 Jun 2009 at 1:21
Original comment by arshan.d...@gmail.com
on 11 Jun 2009 at 1:21
In my case I need the comments, I can possibly strip them out later using a
Regex. I
didn't quite find the case where comments were used for an attack in the link
you
posted, was it just the one where they are injecting a "-->" to close the
comment and
stick in a script tag?
Original comment by dob...@gmail.com
on 11 Jun 2009 at 4:05
Sorry, it wasn't in that link. See issue #5. That wraps a link tag inside of a
conditional comment.
Original comment by arshan.d...@gmail.com
on 3 Aug 2009 at 2:35
Ah I see, well is there a way to remove conditional comments without ditching
the
regular comments?
Original comment by dob...@gmail.com
on 3 Aug 2009 at 5:14
Not easily, unfortunately - conditional comments with nested code are reported
as one
big comment by NekoHTML. You could uncomment the comment-removing code, but
you're
possibly exposing yourself. Could you move AntiSamy downstream of the
templating system?
Original comment by arshan.d...@gmail.com
on 3 Aug 2009 at 5:18
Hi arshan,
I don't want to run AntiSamy every time I process a template, it's too
expensive - I
just use it when they change the template to remove anything unwanted from the
template itself.
I can easily add a bit of regex to strip comments downstream of the templating
system.
Or I can try and make a version of the patch that strips conditional comments,
it
seems pretty doable.
Original comment by dob...@gmail.com
on 3 Aug 2009 at 5:57
I don't want to run AntiSamy every time I render a template, it'd be too
expensive.
I'll adjust the patch to remove any IE conditional directives found inside the
comment
if comments are preserved.
Original comment by dob...@gmail.com
on 3 Aug 2009 at 6:06
Here's a new patch to strip the conditional directives without stripping
comments or
whitespace.
Original comment by dob...@gmail.com
on 3 Aug 2009 at 6:31
Attachments:
I'm integrating the patch. Can you generate test cases?
Original comment by arshan.d...@gmail.com
on 3 Aug 2009 at 7:15
Hi there,
I thought I had included test cases in the patch, for example:
/**
* Issue #41 - support preservation of comments and whitespace, with the caveat
that
* IE conditional directives must be stripped for security purposes.
*/
public void testCommentsAndWhitespace() throws Exception {
assertEquals( "text", as.scan("text <!-- comment --
>",policy).getCleanHTML());
policy.setDirective(Policy.OMIT_COMMENTS, "false");
policy.setDirective(Policy.PRESERVE_SPACE, "true");
policy.setDirective(Policy.FORMAT_OUTPUT, "false");
assertEquals( "<div>text <!-- comment --></div>", as.scan("<div>text <!--
comment --></div>",policy).getCleanHTML());
assertEquals( "<div>text <!-- comment --></div>", as.scan("<div>text <!--[if
IE]> comment <[endif]--></div>",policy).getCleanHTML());
// nested if's. Not handled very well, but as long as there's no tag
injection we're satisfied
assertEquals( "<div>text <!-- <!-- comment --><[endif]--></div>",
as.scan("<div>text <!--[if IE]> <!--[if gte 6]> comment <[endif]--><[endif]--
></div>",policy).getCleanHTML());
// comment inside an if. Not handled very well, but as long as there's no
tag injection we're satisfied
assertEquals( "<div>text <!-- <!-- IE specific --> comment <[endif]--
></div>", as.scan("<div>text <!--[if IE]> <!-- IE specific --> comment
<[endif]--
></div>",policy).getCleanHTML());
assertEquals( "<div>text <!-- \ncomment --></div>", as.scan("<div>text <!-- [
if lte 6 ]>\ncomment <[ endif\n]--></div>",policy).getCleanHTML());
assertEquals( "<div>text comment </div>", as.scan("<div>text <![if !IE]>
comment <![endif]></div>",policy).getCleanHTML());
assertEquals( "<div>text comment </div>", as.scan("<div>text <![ if !IE]>
comment <![endif]></div>",policy).getCleanHTML());
}
Original comment by dob...@gmail.com
on 3 Aug 2009 at 7:48
However, it looks like I may have accidentally included a couple of other fixes
in that
patch, like the CDATA fix and a fix for the z-index test case. Here's a
cleaned-up
version with just the whitespace + comments related code.
Original comment by dob...@gmail.com
on 3 Aug 2009 at 7:52
Attachments:
Your defense is interesting but I'm not confident injection isn't possible.
Consider
the following test case which shows how an attacker can still input conditional
comments through fragmentation:
String attack = "<![if lte 8]alert(1)<![endif]>";
String spacer = "<![if IE]>";
StringBuffer sb = new StringBuffer();
sb.append("<div>text<!--");
for(int i=0;i<attack.length();i++) {
sb.append(attack.charAt(i));
sb.append(spacer);
}
sb.append("-->");
String s = sb.toString();
CleanResults cr = as.scan(s,policy);
System.out.println( cr.getCleanHTML() );
assertTrue( cr.getCleanHTML().indexOf("if") == -11 );
This takes advantage of the "remove the bad stuff inside the string" approach,
and
fragments an attack using what you'll rip out as a spacer. However there is no
tag in
this approach (tags seem to get reliably encoded). So aside from tags, does IE
honor
any other kind of active content inside conditional comments? If it sees
JavaScript,
will it execute? I have to investigate this a little.
Original comment by arshan.d...@gmail.com
on 3 Aug 2009 at 8:26
I think your patch is fine, actually. After some testing, I can't get IE to
recognize
any active content inside the conditional comment. On top of that, I can't get
HTML
control characters to come out decoded/raw in the resulting getCleanHTML().
Original comment by arshan.d...@gmail.com
on 4 Aug 2009 at 1:23
Hi Arshan,
Good catch. After some experimentation I realized that replaceAll() doesn't
make
sense here - the IE parser only seems to accept the condition if it is right at
the
start of the comment with no preceding characters or whitespace.
For example, the following alerts 1 and 5 only:
<html>
<body>
<div>text<!--[if IE]><script>alert(1)</script><![endif]--></div>
<div>text<!--[if IE] ><script>alert(2)</script><![endif]--></div>
<div>text<!--[ if IE]><script>alert(3)</script><![endif]--></div>
<div>text<!-- [if IE]><script>alert(4)</script><![endif]--></div>
<div>text<!--[if IE]><script>alert(5)</script>< ![endif]--></div>
</body>
</html>
For this reason, I think the following algorithm would work, and would not be
subject
to the fragmentation attack:
if ( node instanceof Comment ) {
final String omitComments =
policy.getDirective(Policy.OMIT_COMMENTS);
// Strip conditional directives (for IE only) regardless of the
OMIT_COMMENTS setting
if(omitComments == null || "true".equals(omitComments) ||
((Comment)node).getData().startsWith("[if ")) {
node.getParentNode().removeChild(node);
}
return;
}
If you like I can create and submit a new patch with this algorithm, or you can
paste
that into the appropriate place at the top of recursiveValidateTag(Node node).
Here's the new test case:
public void testCommentsAndWhitespace() throws Exception {
assertEquals( "text", as.scan("text <!-- comment --
>",policy).getCleanHTML());
policy.setDirective(Policy.OMIT_COMMENTS, "false");
policy.setDirective(Policy.PRESERVE_SPACE, "true");
policy.setDirective(Policy.FORMAT_OUTPUT, "false");
assertEquals( "<div>text <!-- comment --></div>", as.scan("<div>text <!--
comment --></div>",policy).getCleanHTML());
assertEquals( "<div>text </div>", as.scan("<div>text <!--[if IE]> comment
<![endif]--></div>",policy).getCleanHTML());
// nested if's. Not handled very well, but as long as there's no tag
injection we're satisfied
assertEquals( "<div>text </div>", as.scan("<div>text <!--[if IE]> <!--[if gte
6]> comment <![endif]--><![endif]--></div>",policy).getCleanHTML());
// comment inside an if. Not handled very well, but as long as there's no
tag injection we're satisfied
assertEquals( "<div>text comment </div>", as.scan("<div>text <!--[if IE]>
<!-- IE specific --> comment <![endif]--></div>",policy).getCleanHTML());
assertEquals( "<div>text comment </div>", as.scan("<div>text <![if !IE]>
comment <![endif]></div>",policy).getCleanHTML());
}
Original comment by dob...@gmail.com
on 4 Aug 2009 at 3:07
Hi Arshan,
Actually I did confirm your fragmentation attack, if your attack string in the
code
above is changed to "[if IE]<script>alert(1)</script><![endif]" then it outputs
HTML
containing a conditional comment with a script tag in it:
<!--[if IE]<script>alert(1)</script><![endif]-->
This isn't "clean" enough ... however, the new algorithm works better.
Original comment by dob...@gmail.com
on 4 Aug 2009 at 3:11
I have IE8 and can confirm that only 1 & 5 fire. I wonder if other browsers have
quirks modes that will allow them to execute these. This might work today, but
what
about tomorrow?
Original comment by arshan.d...@gmail.com
on 4 Aug 2009 at 3:14
It seems like an implementation that allows the "[if" to occur at any point in
the
comment would actually be more work for both the programmers and the computers.
We
could adjust the test to allow whitespace before "[if" but honestly I would be
very
surprised if the IE team changed their system to allow whitespace there.
Additionally processing conditions ANYWHERE in the comments seems even more
absurd -
what if someone wrote a comment to explain the use of the conditional comment
and
accidentally tested the condition instead?
There are all sorts of odd things the IE team can do in the future, and
antisamy can
continue to default to omitting comments anyway, and have a little warning in
the
docs for those who choose to leave comments in that "some unknown attacks
involving
IE conditional comments may be possible or become possible with this enabled".
Original comment by dob...@gmail.com
on 4 Aug 2009 at 3:35
Is the above mentioned patch for allowing comments available? It doesn't appear
to
be in the 1.3 distribution I have. Is there a later release?
Original comment by FrankPap...@gmail.com
on 9 Nov 2009 at 4:36
Forgive me if I'm missing something obvious here...
I cannot get comments to be output, even with the preserveComments directive.
Upon further investigation, the BaseMarkupSerializer seems to ignore the
comment (basically it writes it to the preRoot, but because it's the fragment
that is being serialized, it never writes the preRoot's contents). Can someone
provide an example of HTML where the comments are being preserved (assuming the
directive has been set)?
Original comment by tad...@gmail.com
on 29 Feb 2012 at 4:52
Original issue reported on code.google.com by
dob...@gmail.com
on 17 Apr 2009 at 4:46