Open GoogleCodeExporter opened 8 years ago
If I add code to put the "allowedEmptyTags" in the policy file instead of in
Constants.java so that it is configurable, would you make it part of the
project?
I have to compile my own version anyway. If you are interested in the change,
I'll do it right and make it configurable in the policy file. If not, I'll
just hard-code the changes I need in my own Constants.java. I don't want to go
through the effort if it isn't something you are interested in. Please let me
know.
Thanks,
Jacob Coulter
Jacob.Coulter@gmail.com
Original comment by jacob.co...@gmail.com
on 13 Jul 2010 at 5:46
I went ahead and built my own version with the Constants.java above and it
works as expected. That is the simplest fix for anyone else with the same
problem.
I am still willing to code it right and have the allowable empty tags be in the
policy file if someone will give me an indication that such a change will be
accepted. I don't want to do it if it is just a waste of my time.
thanks,
Jacob.Coulter@gmail.com
Original comment by jacob.co...@gmail.com
on 19 Jul 2010 at 4:13
Original comment by arshan.d...@gmail.com
on 3 Aug 2010 at 12:09
Here is a patch that takes the configuration out of the Constants.java and
places it into the policy file. As I understand it, the philosophy behind the
Policy file is that anything not explicitly stated is disallowed. In keeping
with that philosophy, all policy files will need to include the new
<allowed-empty-tags> section like this:
<allowed-empty-tags>
<literal-list>
<literal value="br"/>
<literal value="hr"/>
<literal value="a"/>
<literal value="img"/>
<literal value="link"/>
<literal value="iframe"/>
<literal value="script"/>
<literal value="object"/>
<literal value="applet"/>
<literal value="frame"/>
<literal value="base"/>
<literal value="param"/>
<literal value="meta"/>
<literal value="input"/>
<literal value="textarea"/>
<literal value="embed"/>
<literal value="basefont"/>
<literal value="col"/>
</literal-list>
</allowed-empty-tags>
I'm available to discuss any issues or concerns.
Thanks,
Jacob Coulter & Mark Oberhaus
Jacob.Coulter@gmail.com
Original comment by jacob.co...@gmail.com
on 11 Jan 2011 at 8:02
Attachments:
I think we'll integrate this and leave the Constants in place in case the
user's policy XML has not been updated to contain the new section.
Original comment by arshan.d...@gmail.com
on 3 Feb 2011 at 8:07
I've struggled with this bug for the better part of the last few hours.
The inspirational problem case is this: a self-enclosed tag like <i/> makes the
rest of the page italicized.
The way I see it, this should be at least made into <i></i>, or better yet
removed. As of now, in both versions, tags in the whitelist like (<br> and
<hr/>) should stay as they are found. Tags that are not in the whitelist and
found to be truly empty should be removed.
There is a slight bug in the NekoHTML SAX parser that means sometimes we can't
tell the difference between <empty/> and <empty></empty>. I tried several ways
of getting around it and have given up. I don't think this will be a big
problem in practice, but I've been wrong before and am always happy to accept
patches.
The behavioral inexactitude is difficult for me to accept, but I don't have
time to dig into it any more. Check out AntiSamyTest.java (search for "issue
#80") if you want to see how I decide if I'm happy (enough) with the behavior
or sync up to head and play with it as it stands.
By the way, this all applies when the useXhtml directive is set to FALSE. If
you want XHTML, you'll get the problematic self-enclosed tags by design.
Thanks for your patch Jacob/Mark.
Original comment by arshan.d...@gmail.com
on 4 Feb 2011 at 6:10
By the way, the fix has been checked into HEAD and will be included in our next
minor release.
Original comment by arshan.d...@gmail.com
on 4 Feb 2011 at 7:36
I'm not sure what happens - environmental weirdness, maybe - but my revision of
your patch got lost and never made it into HEAD. I'll have to re-create.
Original comment by arshan.d...@gmail.com
on 7 Jun 2011 at 6:19
I think the code might have been lost for your AntiSamyTest.java example of
what you want for this issue.
Also, it's interesting that you are running into a NekoHTML issue. The version
that in this code is blowing up Intellij when you try to parse any schema with
this on your classpath. It is an out of date xml parser.
Original comment by jacob.co...@gmail.com
on 12 Sep 2011 at 4:41
Arshan,
Do you plan to release a minor version with this change?
Thanks,
Giriraj.
Original comment by girira...@gmail.com
on 4 Nov 2011 at 11:31
For the impatient folks (like me ;)) here's a workaround that does not require
you to compile your own version of the Constants.java. So using the following
code inside a static initializer of one of your classes that use antisamy the
"td" gets added to the list of allowed empty tags:
if (org.owasp.validator.html.scan.Constants.allowedEmptyTags != null) {
List<String> allowedEmptyTagsList = new ArrayList<String>(Arrays.asList(org.owasp.validator.html.scan.Constants.allowedEmptyTags));
if (!allowedEmptyTagsList.contains("td")) {
allowedEmptyTagsList.add("td");
org.owasp.validator.html.scan.Constants.allowedEmptyTags = allowedEmptyTagsList.toArray(new String[0]);
}
}
Best Regards,
Christian
Original comment by m...@Christian-Schneider.net
on 10 Nov 2011 at 8:51
I'm glad it works for you, but that's not a fix I could live with.
I'd like to get in line requesting a minor release. I'd really like to get off
of my forked version.
Original comment by jacob.co...@gmail.com
on 10 Nov 2011 at 7:22
Original issue reported on code.google.com by
jacob.co...@gmail.com
on 2 Jul 2010 at 7:05Attachments: