Open GoogleCodeExporter opened 9 years ago
I'm not sure what the issue here is. The stuff is _designed_ to be parsed as
junk tags by non-IE parsers, and in the particular case of Caja they'll be
discarded or rewritten by the whitelist. (Also, in general, when there's doubt
about how Caja should interpret HTML input, we currently prefer to use the
HTML5 / WHATWG rules.)
Is there a case where Caja fails to sanitize/sandbox/render content properly
due to this?
Is there a specific use case for Caja's parser you have in mind that would
benefit? You say you're parsing HTML email, but are you doing something other
than using the Caja sandbox for the results?
Original comment by kpreid@google.com
on 4 Aug 2014 at 5:41
> The stuff is _designed_ to be parsed as junk tags by non-IE parsers, and in
> the particular case of Caja they'll be discarded or rewritten by the whitelist.
I'm not sure I was clear: the *tags themselves* are not being discarded. The
tags are parsed as plain text, which is what I was attempting to illustrate
with my example. For instance, currently:
var example = caja.makeSaxParser({pcdata : function(x) { console.log(x) } });
example('<![if !vml]>foo<![endif]>');
yields the following console output:
<!
[if !vml]
>
foo
<!
[endif]
>
I would expect just:
foo
> Is there a specific use case for Caja's parser you have in mind that would
> benefit? You say you're parsing HTML email, but are you doing something other
> than using the Caja sandbox for the results?
I am using the SAX parser directly, and displaying its output outside of the
sandbox. I have HTML input (emails) with downlevel-revealed comments, which I
wish to handle as above.
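Until the parser handles these markers itself, one possible stopgap is to strip them from the input before parsing. The following is only a minimal sketch: `stripDownlevelRevealed` is a hypothetical helper, not part of Caja, and it assumes the condition text never contains a `]` character. It leaves downlevel-hidden comments (`<!--[if ...]> ... <![endif]-->`) untouched.

```javascript
// Hypothetical helper, not part of Caja: removes downlevel-revealed
// conditional comment markers such as <![if !vml]> and <![endif]>,
// keeping the content between them.
function stripDownlevelRevealed(html) {
  // Matches "<![if ...]>" and "<![endif]>" case-insensitively.
  // Does not match "<![CDATA[", ordinary "<!-- ... -->" comments,
  // or the "<![endif]-->" that closes a downlevel-hidden comment.
  return html.replace(/<!\[(?:if\b[^\]]*|endif)\]>/gi, '');
}

console.log(stripDownlevelRevealed('<![if !vml]>foo<![endif]>'));
// prints "foo"
```

The cleaned string could then be passed to the parser in place of the raw input, so the pcdata handler would only see `foo`.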
Original comment by morgan.a...@gmail.com
on 4 Aug 2014 at 11:34
Thank you for the clarification; that's definitely a bug.
Original comment by kpreid@google.com
on 4 Aug 2014 at 11:59
Original issue reported on code.google.com by
morgan.a...@gmail.com
on 19 Jun 2014 at 2:50