dmwm / DAS

Data Aggregation System
11 stars 7 forks source link

Review template quoting, sanitize templates #545

Closed vkuznet closed 11 years ago

vkuznet commented 13 years ago

From #290

I didn't understand the addition of urllib quoting in, for example, das_table.tmpl. Shouldn't you use encodeURIComponent in javascript code / arguments, and urllib when quoting something originating from DAS server itself? To me it seems you are now sometimes quoting javascript itself, not the javascript variable value.

Also I note here that the quoting wasn't added universally everywhere - not in all templates, and not even systematically in the one example I happened to quote, das_table.tmpl. As I wrote before, it looks like every template needs to be sanitised. I can't easily tell which values are safe.

vkuznet commented 13 years ago

valya: Usage of encodeURIComponent function here is inappropriate since it encodes entire URI, rather then parameter values. For instance, let consider two calls from JS code shown in das_table.tmpl (I made stand-alone app to show this feature):

In first case we encode all parameters with encodeURIComponent inside of das_table.tmpl, e.g.

{{{ initialRequest: encodeURIComponent("input=find data&idx=0&limit=10")

}}}

and in this case the server get the following list of arguments

{{{

input kwargs {'input=find data&idx=0&limit=10': ''}

127.0.0.1 - - 966a4d2676b7013bd1fb3e238092cbc550322419Oct/2010:15:05:42] "GET /asearch?input%3Dfind%20data%26idx%3D0%26limit%3D10 HTTP/1.1" 200 236 "" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; en-us) AppleWebKit/533.18.1 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5"

}}}

In this case, all parameters are smashed into a single string and passed along to the server, while using urllib.quote_plus

{{{ initialRequest: encodeURIComponent("input="+urllib.quote_plus("find data") + "&idx=0&limit=10")

}}}

we get correct argument list on a server side

{{{

input kwargs {'input': 'find data', 'limit': '10', 'idx': '0'}

127.0.0.1 - - 966a4d2676b7013bd1fb3e238092cbc550322419Oct/2010:15:06:30] "GET /asearch?input=find+data&idx=0&limit=10 HTTP/1.1" 200 236 "" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; en-us) AppleWebKit/533.18.1 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5"

}}}

So, without doubts parameters need to be checked and properly encoded, but in this case usage of JS function is inappropriate.

vkuznet commented 13 years ago

valya: Replying to [ticket:545 valya]:

Also I note here that the quoting wasn't added universally everywhere - not in all templates, and not even systematically in the one example I happened to quote, das_table.tmpl. As I wrote before, it looks like every template needs to be sanitised. I can't easily tell which values are safe.

Lassi, I walk through templates and don't see obvious places which concerns me. I also not sure what do you mean by asking to sanitize every template. The data passed to templates and internal data of DAS server. For instance if in server I define a variable foo=1 and pass it into template which will use as $foo, does it require sanitization? I would appreciate if you'll pick up two distinguished examples and ask specific question which bother you about parameters passed to templates. I provided a comment about usage encodeURIComponent in das_table template. All templates are used to build HTML pages, some of them construct HTML links and my understanding that only those parts need to be sanitized to ensure that constructed HTML links are properly encoded.

vkuznet commented 13 years ago

valya: Lassi I would appreciate if you provide your feedback to my comments. In fact, to make your life easier I did clean the code and remove das_table.tmpl and associated code from DAS web server due to the fact that I no longer use this template. If you can find another example of inappropriate encoding in templates it would be helpful to me.

ghost commented 13 years ago

lat: Everywhere where you produce HTML, you should sanitise / quote the value unless you know for sure it cannot be unsafe (e.g. it's an integer). HTML has two quoting contexts, URLs ({{{%nn}}}) and HTML ({{{&}}}), with subtly different rules what to quote.

Equally whenever you use javascript to generate HTML, you need to quote.

Then finally whenever you make requests from javascript via XHR, you again need to quote the arguments, unless you can without doubt prove that the argument is safe without quoting (e.g. integer).

Note that lack of quoting has been used in numerous attacks. Remember the attacks where people used funny searches - which just happened to Githubissues.

  • Githubissues is a development platform for aggregating issues.