google-code-export / prado3

Automatically exported from code.google.com/p/prado3
Other
4 stars 3 forks source link

TJavaScript::encode() provides no simple way to pass strings in an XSS-safe way #391

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
With the current implementation of TJavaScript::encode() there's no simple and 
safe way to pass string-type parameters to client objects without risking a 
serious script injection vulnerability. Said method (which is also used to 
render client-side registration code of active controls) unconditionally checks 
every string type parameter whether they begin with a "javascript:" literal, 
and if they do, then emits them not as quoted JavaScript strings, but as 
JavaScript code literals, which then get executed by the browser. 

This is the intended way of working, however, this also makes it impossible to 
pass non-restricted and/or possibly user-supplied strings as separate and 
distinct fields or data entities (for ex. as a member of an array) safely to 
client objects - as obviously any user-supplied data/string beginning with 
"javascript:" could cause unintentional disruption of the generated client-side 
code and/or be used by malicious users to inject arbitrary JavaScript snippets 
into the pages served up by Prado.

I therefore propose an extension of the TJavaScript class with a safeString() 
method for "escaping" strings that are always, under all conditions to be 
passed quoted to the client-side, and shouldn't be ever checked for the 
"javascript:" prefix, and then interpreted as JavaScript code snippets.

I've attached a patched TJavaScript.php (which also contains the fix for issue 
390) to solve the issue. 

Using a class with a __toString() magic method has the nice side-effect that 
strings "escaped" using the TJavaScript::safeString() method will keep working 
also with old code, because if treated/used as string they will behave as if 
they would be indeed regular string parameters.

Original issue reported on code.google.com by google...@pcforum.hu on 8 Mar 2012 at 12:59

Attachments:

GoogleCodeExporter commented 9 years ago
Sorry, I was wrong. There actually ARE standard Prado controls that are indeed 
vulnerable to said XSS attack. Validators and some other controls are prone to 
it.

Please find the attached PoC exploit that shows how easy it is for the user to 
inject arbitrary code into a Prado-generated page, if for ex. the developer 
displays a validator message somewhere that begins with some user-supplied text 
parameter.

Now, obviously a user won't attack himself with the exploit shown in the PoC 
page (which serves up the injected code only to the same user that supplied 
that to the page in the first place), however, it's not unrealistic to assume 
that in a more complex real-world web application or web page error messages 
displayed by validators might contain pieces of informations that are supplied 
by different user(s) than the user(s) the validator error is shown to. 

For ex. on a forum a user might chose a nickname of "javascript: <codehere>" 
when registering, and disable reception of private messages, so when another 
user tries to send a message to him, the system might want to display an error 
message like "<target user> does not want to receive messages" through a 
validator, but when it tries to do so, it will actually inject the unescaped 
<codehere> code into page served up to the victim, whose browser will execute 
it - just like in the demonstration.

The obvious fix for the issue demonstrated in the PoC is to commit my patched 
TJavaScript.php file and then change TBaseValidator.php around line 159 from

        $options['ErrorMessage'] = $this->getErrorMessage();

to

        $options['ErrorMessage'] = TJavaScript::safeString($this->getErrorMessage());

in which case the error message will be always safely escaped.

TRequiredFieldValidator's InitialValue, TValidationSummary's HeaderText and 
TSlider's Value's properties might be also prone to the same vulnerability, as 
they typically might contain/begin with user-supplied text fragments.

Original comment by google...@pcforum.hu on 12 Mar 2012 at 3:55

Attachments:

GoogleCodeExporter commented 9 years ago
Fixed in r3116. The idea of using a class to enclose a "marked unsafe" string 
was a really clever idea, but i preferred to reverse the approach: never let 
any string beginning with "javascript: " bypass the encoding, unless the 
developer explicitly marks it as safe using TJavascript;;quoteFunction(). this 
way we should avoid a lot of instances of this same issue on third party 
controls, or in places where the developer wouldn't expect it to happen. I only 
found TSlider using the "javascript:" trickery to push javascript directly 
inside clientside options and patched it accordingly. If i missed any other 
control, we'll just need to patch it.
Sorry for the delay in patching this severe issue: i probably discarded the 
original notification email, and woke up only when i saw the report on bugtraq.

Original comment by ctrlal...@gmail.com on 23 Mar 2012 at 6:21

GoogleCodeExporter commented 9 years ago
This while-list approach is safer, but this inherently introduces a problem 
where you practically can't include any JavaScript code in the markup, and need 
to add custom PHP code to the code-behind class to do so.

So if you choose this solution, then I propose to extend the markup syntax and 
parser in some way, which would enable the developer to signal that a property 
value (which will be obviously always represented as a string literal, or 
converted to one even if using the property="<%= expression %>" format) should 
be treated as a JavaScript literal instead (ie. constructed as an instance of 
TJavaScriptFunction). 

I'm not sure what the best approach would be (a syntax like 'property*="js 
code"' or something similar comes to my mind as a first idea, even though I'm 
not sure whether the XML parser can handle that properly), but we definitively 
need a special markup for this, otherwise with your solution we'll eliminate 
this existing functionality in the framework (ie. that with the "javascript:" 
prefix you can embed literal JavaScript code right in the template). I'm using 
this construct in a lot of my components/pages, so it definitely very usable 
and needed.

Also, maybe TJavaScriptLiteral might be a better name for the class than 
TJavaScriptFunction, since the code represented in it isn't neccessarily a 
function (or actually, it almost never is), just simply a JavaScript 
expression. So the current name might be misleading and make things harder to 
interpret when reading the code.

Original comment by google...@pcforum.hu on 23 Mar 2012 at 7:39

GoogleCodeExporter commented 9 years ago
Btw. if the XML parser can't handle "property*" syntax, then something like 
"property-js" or similar pseud-property-names could be used to signal that a 
property's value should be constructed and set as a TJavaScriptLiteral, instead 
as a simple string.

Also, regardless of the markup syntax the TPropertyValue::ensureString() method 
needs to be modified to not force-convert TJavaScriptLiteral values to strings 
- otherwise you won't be able to set any property to a TJavaScriptLiteral, 
because they all will be converted to simple strings prior to being stored in 
the viewstate, and will be always emitted as string literals in the JavaScript 
snippets, too.

Original comment by google...@pcforum.hu on 23 Mar 2012 at 7:58

GoogleCodeExporter commented 9 years ago
Can you create an example on how you use javascript in properties (or point me 
to some code in prado that uses them) ? i just want to be sure that, if a patch 
gets commited, it satisfies these needs.

Original comment by ctrlal...@gmail.com on 24 Mar 2012 at 10:01

GoogleCodeExporter commented 9 years ago
Sent you an email.

Original comment by google...@pcforum.hu on 24 Mar 2012 at 12:45

GoogleCodeExporter commented 9 years ago
Btw. please a have a look at issue 181. It's marked as fixed, but it isn't. 
(Well, it is, because I provided a fix, but that isn't commited yet - and since 
you said you didn't get any notifications lately, I guess you didn't see that 
one eiher.)

Original comment by google...@pcforum.hu on 24 Mar 2012 at 12:47

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
http://code.google.com/p/prado3/issues/detail?id=391#c8

I just got another idea. Since it makes does not sense for all string 
properties to accept a JavaScript literal (only for those that are actually 
rendered as part of a client option set or otherwise emitted as part of a 
JavaScript code segment), the framework and the components should signal 
whether a property can actually support JavaScript literals besides simple 
strings.

I think there of something like the event-handling/hooking mechanism, which 
automatically detects whether there's an 'onXXX' function declared in the 
class, and if there's, then an assignment to the "OnXXX" property actually 
adds an event handler to the XXX set of event handlers, instead of setting 
the value of the XXX property or calling the setXXX() setter method (that is 
the case with regular properties, ie. whose name do not follow the OnXXX 
pattern).

Now if we'd have for ex. getters and setters named like getXXXJS() and 
setXXXJS(), then the framework (the __get and __set magic methods) could 
automatically recognize that XXX is a property that can accept both simple 
strings and literal JavaScript code snippets. Only such properties should be 
supported and recognized in the previously suggested extended form in 
templates and trying to use and set the extended form of a property that has 
no setXXXJS() - only setXXX() - associated with, should fail with an unknown 
property exception.

Mentioning that I also have reconsidered using special property names like 
"XXX*" or "XXX-js" to set properties as JavaScript literals, and I suggest 
we use the XXXJS form instead.

So for ex. considering the following control/class declaration:

class TMyControl extends TControl
{
    public function getWhateverOptionJS()
    {
        .. get code here
    }

    public function setWhateverOptionJS($value)
    {
        .. set code here
    }

    public function getAnotherOption()
    {
        .. get code here
    }

    public function setAnotherOption($value)
    {
        .. set code here
    }

    public getClientOptions()
    {
         ...
         $options['Whatever'] = $this->getWhateverOptionJS();
         ...
    }

    /* Another Option property is used somewhere in the class, but does not */
    /* get rendered as a part of a JavaScript block, so it can't be a JS literal */

}

the template code

<com:TMyControl WhateverOption="abcd" />

would be valid, and would set the simple string literal 'abcd' as the value 
of the WhateverOptions property (through passing (string)'abcd' to the 
setWhateverOptionsJS() method). In this case when emitting the property's 
value TJavaScript::encode() WILL escape the 'abcd' string prior to rendering 
it as part of some JavaScript block. The template code

<com:TMyControl WhateverOptionJS="abcd" />

would be also valid, and would set the 'abcd' JavaScript literal (ie. that 
string value encapsulated in a TJavaScriptLiteral object) as the value of 
WhateverOptions property (through passing "new TJavaScriptLiteral('abcd')" 
to the setWhateverOptionsJS() method). In this case when emitting 'abcd' it 
would NOT be escaped by TJavaScript::encode() as it is wrapped in a 
TJavaScriptLiteral object. However, the construct

<com:TMyControl AnotherOptionJS="abcd" />

would fail with an unknown property exception, and only

<com:TMyControl AnotherOption="abcd" />

would be valid, which would set the simple string literal 'abcd' as the 
value of the AnotherOption property.

The __get and __set magic methods would be responsible for recognizing the 
XXXJS property references (both when trying to get and set those) and 
auto-convert to/from forced string values when using the simpler XXX form 
(and there are getXXXJS() and/or setXXXJS() methods defined in the class). 
Of course when referencing the properties in their XXXJS form, the __set 
method should check whether the value is actually a TJavaScripLiteral and if 
it's not, then it should wrap the passed string automatically in a 
TJavaScriptLiteral class prior to passing it to the setXXXJS() function.

This way we could ensure that developers could use simple XXX property 
assignments in an XSS safe way, and if they intend to actually use 
JavaScript literals in properties that are passed to the client-side (or 
otherwise used as part of some JS code), then by setting the XXXJS property 
instead of the simple XXX form, they could signal to the framework that they 
want to pass literal JavaScript codes and are aware of the possible 
cross-site scripting consequences of doing so.

The template parser and the setSubProperty() method (mentioned in my 
previous email) wouldn't have to be changed at all.

I think this would be the simplest and cleanest solution. Of course all 
special hacks in the TJavaScript::encode() method (like checking for '{'-'}' 
and '['-']' pairs) should also be removed, and only the TJavaScripLiteral 
vs. simple string distinction should be left in-tact from the modifications 
proposed/already committed.

TPropertyValue::ensureString() would also still need to be extended to accept
TJavaScriptLiteral wrappers (and leave them intact, instead of forcing them to 
strings). Or maybe we'd need another ensureXXX() method - even though I see no
real advantage in that, as when not used in TJavaScript::encode() parameters,
the TJavaScriptLiteral with the __toString() magic method should behave in 
99.9999% of the cases as a regular string when referenced as such, and I see
no use-case where allowing TPropertyValue::ensureString() to also return a
TJavaScriptLiteral if passed to it (instead of juggling it to a string type var)
could cause any issues. But maybe I'm wrong.

Also, of course feel to free to recommend any other property on 
getter/setter naming pattern, if you have a better idea than XXXJS, 
getXXXJS(), and setXXXJS(), but I think the basic idea should follow the 
design outlined above. 

Original comment by google...@pcforum.hu on 24 Mar 2012 at 7:58

GoogleCodeExporter commented 9 years ago
I committed a second patch in r3118. Thus patch makes the following changes:
- TJavascriptFunction gets renamed to TJavascriptLiteral;
- TJavascript::isFunction/quoteFunction gets renamed accordingly to 
isJsLiteral/quoteJsLiteral;
- TJavascript::encode doesn't treat anymore [ - ] and { - } as special blocks 
that doesn't needs to be encoded;
- adapted controls previously taking advantage of the [ - ] and { - } encoding 
bypass
- added the ability to prefix properties and subproperties with "js" to avoid 
them from being encoded when used as javascript parameters.
In this commit i used a js prefix instead of a suffix. This was either easier 
to implement, more coherent with the on* scheme, and imho it looks better on 
camelCased properties names: jsPropertyName

Original comment by ctrlal...@gmail.com on 24 Mar 2012 at 11:00

GoogleCodeExporter commented 9 years ago
Nice. But this isn't exactly what I've described above. The problem is: now any 
string property can have a TJavaScriptLiteral-wrapped value assigned to it 
through the jsXXX="abcd" syntax - even if that makes no sense there. For ex. 
where CssClass gets rendered as part of the HTML markup (and not passed 
throught getClientOptions() to a client-side wrapper) it makes no sense to 
allow setting jsCssClass="abcd".

What I was proposing was to 
1. modify the __get() and __set() - and also maybe __call() - methods 
(TTemplateManager wouldn't have to be touched at all) and check there for the 
"js" prefix
2. rename all getters and setters to the getjsXXX() / setjsXXX() form which 
would signal to the framework, that XXX can be set both as a TJavaScriptLiteral 
and as a simple string (when referenced only in the XXX form, without the jsXXX 
prefix).

Give me a hour to code that down - it's faster than me trying to explain it 
again.

Things to consider in the meantime:
1. framework should be checked whether it has checks in-place for those special 
"JS-capable" properties the take the form of "if ($this->XXX)", "if ($s = 
$this->XXX)", "$s = $this->X; if ($s)" or something similar. These expressions 
will evaluate differently fro TJavaScriptLiteral-wrapped strings than for 
normal ones (because even if the wrapper will contain an empty string, it will 
still be an object instance, and thus "if ($expr)" will always evaluate to 
true. The solution is to replace all "if ($this->XXX)" check with "if 
($this->XXX!='')" which will work evaluate always the same way regardless of 
the property's value being a wrapper or a simple string.

2. Documentation needs to be extended to reflect the changes regarding the new 
syntax. Ie. class descriptions/documentation should make clear which properties 
can be referenced both in the XXX="abcd" and jsXXX="abcd" form.

3. Since the method described above will now use getjsXXX() and setjsXXX() 
method declarations, only those methods will present in the auto-generated 
class documentation. However, it would be nice if there would be 
pseudo-documentation generated for un-prefixed getters and setters, too. Maybe 
phpDocumentor allows that (ie. generation of documentation for methods that do 
not actually exist in the class declaration, but might be described in class 
annotations) - this should be checked.

Ok, will now begin to code the above-described solution which will ensure that 
not all properties can be set through the jsXXX="abcd" form, only those, that 
do actually support that because of the way they get used by the component 
later.

Original comment by google...@pcforum.hu on 25 Mar 2012 at 8:27

GoogleCodeExporter commented 9 years ago
Ok. Here it is. Exactly 60 minutes after. Damn, that was a good estimate. :)

Original comment by google...@pcforum.hu on 25 Mar 2012 at 9:29

Attachments:

GoogleCodeExporter commented 9 years ago
The changes are:
1. I've moved the declaration of TJavaScriptLiteral to TComponent.php, because 
that (ie. TComponent) is now the most basic class referencing it, which would 
possibly fail to work properly without TJavaScript included if declaration of 
said class wasn't right there
2. I've added a forced type-juggle to TJavaScriptLiteral's __toString(), just 
for the sake of clarity. 
3. To enable the jsXXX="abcd" property reference form now you've to declare 
getjsXXX() and setjsXXX() methods on a component. If a component has no 
setjsXXX() method any attempt to set the jsXXX property will raise a 
non-existing property exception. 
4. However, if a control/object has said getjsXXX() and/or setjsXXX() methods, 
the framework will now automatically allow referencing said property also in 
the XXX="abcd" form in templates, and also $control->XXX = 'abcd'; and 
$control->getXXX() and $control->setXXX() from raw PHP code. Latters will work 
even if you don't have getXXX() and setXXX() methods defined.
5. The difference between the XXX="abcd" / $control->XXX = 'abcd' / 
$control->setXXX('abcd') and the jsXXX="abcd" / $control->jsXXX = 'abcd' / 
$control->setjsXXX('abcd') forms will be, that when using the jsXXX form the 
passed parameter will be always converted to a TJavaScriptLiteral if it isn't 
already one. On the other side if you're using the XXX form but pass actually a 
TJavaScriptLiteral instance instead of a simple string, that won't be 
force-converted to a string. As construction of a TJavaScriptLiteral won't 
happen by accident I don't think this could cause a security issue - however, 
it will preserve the type of the expression when using something like 
$ctrl1->XXX = $ctrl2->YYY, when both $ctrl1 and $ctrl2 also allow the jsXXX 
form for XXX and YYY. I wasn't sure whether there are any such constructs used 
in the framework, but wanted to be on the safe-side (ie. that if it ever copies 
property values this way between objects, then TJavaScriptLiteral-values won't 
be force-converted to string literals during such an operation).
6. I've also cleaned up the TAccordion and TTabPanel controls' 
getClientOptions() method which was getting overcomplicated.

Please review the patch and let me know whether it works as intended.

Original comment by google...@pcforum.hu on 25 Mar 2012 at 9:53