rjatkins / owaspantisamy

Automatically exported from code.google.com/p/owaspantisamy

Slashdot policy does not filter javascript from link anchor's href attribute #4

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Create a Policy instance from the antisamy-slashdot-1.1.1.xml:
    Policy policy = Policy.getInstance("antisamy-slashdot-1.1.1.xml");

2. Scan the String "<a href=\"javascript:alert('xss')\">link</a>":
    String html = "<a href=\"javascript:alert('xss')\">link</a>";
    CleanResults cr = (new AntiSamy()).scan(html, policy);

3. The result of cr.getCleanHTML() will contain the original HTML, i.e.,
there will still be JavaScript commands in the anchor's href attribute.

What is the expected output? What do you see instead?
I was expecting to see "link"; instead I see the input string,
"<a href=\"javascript:alert('xss')\">link</a>".

What version of the product are you using? On what operating system?
I am using version 1.1.1 on Windows XP with JDK 1.5.0_12.

Please provide any additional information below.
I found a possible solution by expanding

    <regexp name="onsiteURL"
value="([\p{L}\p{N}\\/\.\?=&amp;;\#-~]+|\#(\w)+)"/>

in section <common-regexps> to exclude the colon:

    <regexp name="onsiteURL"
value="([\p{L}\p{N}\\/\.\?=&amp;;\#-~&amp;&amp;[^:]]+|\#(\w)+)"/>

But I am by no means a security expert, so this possible solution needs to
be looked into carefully!

Original issue reported on code.google.com by marc.la...@accelsis.biz on 25 Apr 2008 at 7:52
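
The behaviour described in the report can be reproduced with java.util.regex alone. The sketch below is only an illustration: the class name OnsiteUrlRangeDemo and the helper accepts are hypothetical, and it assumes the policy regex is applied to the whole attribute value (full-match semantics). It compares the shipped onsiteURL pattern (after XML unescaping of &amp;) with the reporter's colon-excluding variant.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of AntiSamy.
class OnsiteUrlRangeDemo {
    // onsiteURL from antisamy-slashdot-1.1.1.xml, with &amp; unescaped to &.
    static final Pattern ORIGINAL =
        Pattern.compile("([\\p{L}\\p{N}\\\\/\\.\\?=&;\\#-~]+|\\#(\\w)+)");

    // Reporter's variant: the same class intersected with "anything but a colon".
    static final Pattern NO_COLON =
        Pattern.compile("([\\p{L}\\p{N}\\\\/\\.\\?=&;\\#-~&&[^:]]+|\\#(\\w)+)");

    // Assumption: the value must match the pattern in full.
    static boolean accepts(Pattern pattern, String value) {
        return pattern.matcher(value).matches();
    }

    public static void main(String[] args) {
        String href = "javascript:alert('xss')";
        // The original class accepts the value, so the href survives the scan.
        System.out.println("original: " + accepts(ORIGINAL, href));
        // The intersected class rejects it because of the colon.
        System.out.println("no colon: " + accepts(NO_COLON, href));
    }
}
```

The intersection syntax (&&) is specific to Java's regex dialect, so this variant would not carry over unchanged to other regex engines.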


GoogleCodeExporter commented 9 years ago
I guess this is how to fix it:
([\p{L}\p{N}\\/\.\?=&;\#-~]+|\#(\w)+) — allows injection
([\p{L}\p{N}\\/\.\?=\#&;-~]+|\#(\w)+) — secure one

Original comment by designbi...@gmail.com on 30 Apr 2008 at 10:38
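
One way to see why the two patterns in the comment above behave differently is to list which punctuation characters each character class accepts. The following is a rough sketch; the class ClassMembership and the method acceptedPunctuation are hypothetical names. In the first class "\#-~" is parsed as a range from '#' (U+0023) to '~' (U+007E), which contains ':' (U+003A). In the reordered class the range becomes ";-~", which starts at U+003B, one code point past the colon.

```java
import java.util.regex.Pattern;

// Hypothetical helper for inspecting a character class; not part of AntiSamy.
class ClassMembership {
    // Character class of the pattern marked "allows injection".
    static final String ALLOWS_INJECTION = "[\\p{L}\\p{N}\\\\/\\.\\?=&;\\#-~]";
    // Character class of the reordered pattern marked "secure one".
    static final String REORDERED = "[\\p{L}\\p{N}\\\\/\\.\\?=\\#&;-~]";

    // Returns the printable ASCII punctuation accepted by the given class.
    static String acceptedPunctuation(String charClass) {
        Pattern pattern = Pattern.compile(charClass);
        StringBuilder accepted = new StringBuilder();
        for (char c = 0x21; c <= 0x7E; c++) {
            if (Character.isLetterOrDigit(c)) {
                continue; // letters and digits are covered by \p{L} and \p{N}
            }
            if (pattern.matcher(String.valueOf(c)).matches()) {
                accepted.append(c);
            }
        }
        return accepted.toString();
    }

    public static void main(String[] args) {
        System.out.println("allows injection: " + acceptedPunctuation(ALLOWS_INJECTION));
        System.out.println("reordered:        " + acceptedPunctuation(REORDERED));
    }
}
```

Note that the reordered class still contains a range (";-~"), so it continues to accept characters such as '<' and '>' that were probably never intended, and it no longer accepts a literal '-'.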

GoogleCodeExporter commented 9 years ago
As a security non-expert I'm not sure, but I think in some cases the expression
could be extended with the \p{Zs} (space separator) category:

([\p{L}\p{N}\p{Zs}\\/\.\?=\#&;-~]+|\#(\w)+)

And offsiteURL pattern may look like this one:
(\s)*((ht|f)tp(s?)://|mailto:)[\p{L}\p{N}]+[~\p{L}\p{N}\p{Zs}-_\.@#$%&;:,\?=/\+!]*(\s)*

to be able to handle URLs like "http://www.google.ru/search?q=водка и пиво"

Original comment by designbi...@gmail.com on 30 Apr 2008 at 11:49
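
The proposed offsiteURL pattern can be checked against the example URL with a few lines of Java. This is a minimal sketch under the same full-match assumption as the scanner is believed to use; OffsiteUrlDemo and accepts are hypothetical names, and the Cyrillic query is written with Unicode escapes only so that the source stays ASCII.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of AntiSamy.
class OffsiteUrlDemo {
    // offsiteURL as proposed in the comment, including \p{Zs} for spaces.
    static final Pattern OFFSITE = Pattern.compile(
        "(\\s)*((ht|f)tp(s?)://|mailto:)[\\p{L}\\p{N}]+"
        + "[~\\p{L}\\p{N}\\p{Zs}-_\\.@#$%&;:,\\?=/\\+!]*(\\s)*");

    // Assumption: the value must match the pattern in full.
    static boolean accepts(String url) {
        return OFFSITE.matcher(url).matches();
    }

    public static void main(String[] args) {
        // The search URL from the comment, with the Cyrillic words escaped.
        String search = "http://www.google.ru/search?q="
            + "\u0432\u043e\u0434\u043a\u0430 \u0438 \u043f\u0438\u0432\u043e";
        System.out.println("search URL: " + accepts(search));
        System.out.println("mailto:     " + accepts("mailto:user@example.com"));
        // No http/https/ftp/ftps/mailto prefix, so this one is rejected.
        System.out.println("javascript: " + accepts("javascript:alert('xss')"));
    }
}
```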

GoogleCodeExporter commented 9 years ago
I am trying to replicate this now, and I am seeing the same extremely strange
results. The regular expression does not explicitly allow colons, which are
obviously very dangerous because they let users supply data:, javascript:, and
many other protocols that you won't want them to have. This was not caught by my
(admittedly limited) regression testing.

I have applied the fix suggested by designbistro, even though I have no idea why
the regular expression treats those two strings differently.

Original comment by arshan.d...@gmail.com on 25 May 2008 at 12:46

GoogleCodeExporter commented 9 years ago
I have now fixed this the right way. The unescaped "-" character was acting as a
range indicator rather than as a literal character. This has been fixed in the
policy files and will ship in the next release (1.2) within the next month. As a
short-term fix, you can simply put a \ character in front of the "-" in the
policy files.

Original comment by arshan.d...@gmail.com on 2 Jun 2008 at 7:28
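
The effect of escaping the hyphen can be sketched as follows. The pattern here is only a guess at the shape of the corrected onsiteURL (the original pattern with "\-" in place of "-"), not a copy of what ships in the 1.2 policy files, and EscapedHyphenDemo and accepts are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of AntiSamy.
class EscapedHyphenDemo {
    // Assumed fix: "\-" makes the hyphen a literal, so '#', '-' and '~'
    // are three separate members of the class instead of a range '#'..'~'.
    static final Pattern ESCAPED =
        Pattern.compile("([\\p{L}\\p{N}\\\\/\\.\\?=&;\\#\\-~]+|\\#(\\w)+)");

    // Assumption: the value must match the pattern in full.
    static boolean accepts(String value) {
        return ESCAPED.matcher(value).matches();
    }

    public static void main(String[] args) {
        // Ordinary onsite links, including a literal hyphen and tilde.
        System.out.println(accepts("/my-page.html?id=1#top"));
        System.out.println(accepts("/~user/index.html"));
        // The colon (and the parentheses and quotes) are no longer in the class.
        System.out.println(accepts("javascript:alert('xss')"));
    }
}
```

Unlike the reordering workaround, this keeps '-' usable in onsite URLs and removes the accidental range entirely.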

GoogleCodeExporter commented 9 years ago

Original comment by arshan.d...@gmail.com on 2 Jun 2008 at 7:30