Closed femaref closed 11 years ago
Looks good. I'll double-check the sanitization of the style
attribute. I'll try my best to inject malicious CSS into it, and if I can't, I'll probably go ahead and merge. Thanks!
Yup, I'm not sure it wont break. It's not a really thorough sanitization, so it probably will break.
So, just testing your regex (^color:.+?;$) in isolation, I'm finding the following strings match:
color:#ffff00; (this is good, it should match) color: #ffff00; (also good) color: #ffff00; font-size: 100px; (bad, this is malicious CSS)
I propose the following regex instead:
^color: *((#[a-f0-9]{3,6})|([a-z]+));$
What do you think?
On Tue, Mar 19, 2013 at 12:37 PM, Femaref notifications@github.com wrote:
Yup, I'm not sure it wont break. It's not a really thorough sanitization, so it probably will break.
— Reply to this email directly or view it on GitHubhttps://github.com/jarrett/rbbcode/pull/15#issuecomment-15130582 .
Hm, shouldn't .+?
be a non-greedy variant of .+
? but yes, yours should work as well. We should force lower-case names then as well due to the regexp.
Yeah, my thinking is, we use a super-strict regex that's essentially a white-list. We're not trying to allow every possible valid variation on a color rule. Just the ones we generate from our BBCode [color] tags. So something that just looks for our exact representation of the color rule would be best. Which makes me realize, mine should be modified to be less permissive:
^color:((#[a-f0-9]{3,6})|([a-z]+));$
Yes, you are right. I've added the changes. I'll have to improve the parsing as well (to get rid of extra white spaces etc).
Ah, good point. We can add unit tests for all the weird but valid ways users might write a [color] tag, plus all kinds of malicious stuff.
On Tue, Mar 19, 2013 at 1:05 PM, Femaref notifications@github.com wrote:
Yes, you are right. I've added the changes. I'll have to improve the parsing as well (to get rid of extra white spaces etc).
— Reply to this email directly or view it on GitHubhttps://github.com/jarrett/rbbcode/pull/15#issuecomment-15132439 .
Effectively, we should allow two cases: [color=red]
and [color=#ffffff]
(with the special case [color=#fff]
). Maybe we should enforce the patterns in the parser?
Yes, that would be reasonable. Something I struggled with in all the other tags: How permissive should we be about stupid variations? For example:
[color = red] [color= red] [color =red]
I go back and forth on this. Lately my attitude is to just be strict about syntax. Users should be able to read a helpfile on BBCode and see that they can't, e.g., throw in arbitrary spaces. But that does present one potential problem: If somebody is migrating from a legacy parser to RbbCode, and they already have a big corpus of user-supplied markup, will we break that corpus? I.e., what if our syntax is stricter than that of the legacy parser?
On Tue, Mar 19, 2013 at 1:08 PM, Femaref notifications@github.com wrote:
Effectively, we should allow two cases: [color=red] and [color=#ffffff](with the special case [color=#fff]). Maybe we should enforce the pattern in the parser?
— Reply to this email directly or view it on GitHubhttps://github.com/jarrett/rbbcode/pull/15#issuecomment-15132612 .
Even though it's opening pandora's box, what about a legacy mode that allows extra whitespaces around the equals sign? Maybe also emits well-formed, strict bbcode.
Probably more than I'd like to maintain :). The same would have to be done for all tags.
I'm leaning towards this: Continue being pretty strict about input. If anyone needs to support a legacy codebase, they can submit a request. That way, we avoid building features someone might need until the need has actually been demonstrated.
BTW, I just noticed that there's an inconsistency in the strictness of the existing code. For tags defined with #def_tag, we allow uppercase and lowercase. For all other tags, we don't. I wouldn't quite call it a bug, but it is weird and should probably be fixed.
On Tue, Mar 19, 2013 at 6:45 PM, Femaref notifications@github.com wrote:
Even though it's opening pandora's box, what about a legacy mode that allows extra whitespaces around the equals sign?
— Reply to this email directly or view it on GitHubhttps://github.com/jarrett/rbbcode/pull/15#issuecomment-15150869 .
Do you want to move forward with this pull request?
Also, I propose one more change. I just remembered that we shouldn't be using ^
and $
to mark the beginning and end of strings in the regex. Here's a rundown of the vulnerability:
http://homakov.blogspot.com/2012/05/saferweb-injects-in-various-ruby.html
So the preferred way to match the beginning and end of strings in a regex is \A
and \z
.
Let me know if you're still interested in this. If not, I can close out the issue. Thanks!
Sorry, I've been pretty busy with other stuff - feel free to close it.
I've added support for the color tag, it also parses any inner bbcode successfully. Extra precaution has been taken with sanitizing the resulting tag, as it involves inline css.