ccampbell / rainbow

Simple syntax highlighting library written in javascript
http://rainbowco.de
Apache License 2.0
3.3k stars 465 forks source link

Your library encourages unescaped HTML by supporting it in many cases. #249

Closed joshgoebel closed 4 years ago

joshgoebel commented 4 years ago

Reference: https://github.com/highlightjs/highlight.js/issues/2884

Some users get entirely the wrong idea when you support this behavior (highlighting unescaped code). This allows all sorts of potential JS injection attacks for users who don't understand why proper escaping is necessary.

Are you aware of the security implications of encouraging this?

Neohiro79 commented 4 years ago

While I really appreciate the way of trying to fix security vulnerabilities which might potentially coming up, at least this solution is a solution which works in an understandable way (for me).

If I am hosting this on my very own, nothing should happen as far as I am aware of.

Of course, if you fire up the console and put some additional code in it or use some weird ads on your website, you always "have potential security risks". But this is true for every website out there and has nothing to do with this "masking" of "<" ">" elements. This is not a form field which will get pushed via PHP to a server:

Info about Client Side Injection Attacks

I personally don't get the drama here right now - but I might have not the whole picture...

At least Craig's solution is by far the most reliable one when it comes to "highlighting" html code the way I expect it to be highlighted from a highlighter - and additionally he provided the best feedback so far from all "highlighers" out there.

Neohiro79 commented 4 years ago

So yes, Javascript works (in every modern browser nowadays, as far as I know):

https://jsfiddle.net/neohiro79/zbfdhxjy/11/

joshgoebel commented 4 years ago

Yes, but this is potentially very bad:

<script>
alert("You've been hacked, OR WORSE");
// mine some bitcoin, steal some user data, etc.
</script>

And this does nothing (because it's just text, NOT Javascript).

&lt;script&gt;
alert(&quot;You've been hacked, OR WORSE&quot;);
&lt;/script&gt;

I didn't really open this issue for you more for the library author.

I personally don't get the drama here right now - but I might have not the whole picture...

You don't. The key: Not all users are you. This is very dangerous behavior for a highlighting library to be encouraging. There are very real risks when someone DOES use this library with PHP and a server and then highlight some code (just like you did)... and then their website is defaced (or much much worse).

I expect it to be highlighted from a highlighter

HTML is complicated... sometimes just because something is easy doesn't mean there aren't serious risks. :)

joshgoebel commented 4 years ago

@Neohiro79 Just to be clear there are also technical issues with trying to highlighting HTML like that:

<pre><code class="language-html">
Am I &amp; HTML or what?
&quot; Is there a problem?
</code>
</pre>

For example you can't highlight this because &amp; and &quot; will have been converted into & and " by the browser... so it will fail to highlight ALL HTML properly - because it's impossible (without escaping).

Or:

<pre><code class="language-html">
<BOLD>really</BOLD>
</code>
</pre>

This will come out as <bold>really</bold> with all the capitalization being lost... because it's not text, it's HTML and HTML doesn't care about case (so most browsers use lowercase tags).

joshgoebel commented 4 years ago

@Neohiro79 Don't feel bad though, this is a very common confusion. But it's the way it is for good reason. :) And lots of security nightmares have occurred on the web due to people not understanding this who perhaps should have. :)

You might say:

"I live in the boonies I never lock my car, or my house and I leave my huge stash of gold on my front porch - where it's perfectly safe - works for me."

And you might be speaking the truth. :-) But that's also terrible advice for a LOT of people in say urban cities. :-)

Neohiro79 commented 4 years ago

Josh, I am really thankful that you at least gave me a proper answer to my question and also provided me with a workaround.

But I don't get the point of your security concerns. Javascript is as far as I am aware of activated by default in all modern browsers nowadays. So basically EVERY script, even on CDN's might be "compromised" - hence "loaded" into the browser without ANY option to "stop the browser from doing so" - except "denying all javascripts" - which noone really does nowadays.

If I host all my scripts and html files on my server having only potentially some "input-fields" which might get submitted through php itself, I only run into troubles (as far as I am aware of) when you do not use htmlentities to proper ensure that no code which might beeing submitted is potentially beeing executed on the server itself. That would harm your files on the server, yes.

That said, since every client and browser on this planet actually executes javascript without any security measures other than maybe a sandboxed environment, I don't get your point of the implied security concerns in case someone "highlights" the code the browser anyway doesn't execute since it is in a <code> block to show the code in a proper way.


@EDIT: Which was proofed wrong thanx to joshs ongoing resitance, as can be seen here:

BUT IT DOES see: https://jsfiddle.net/neohiro79/zbfdhxjy/14/


I might be missing something here - if so, please give me a hint since I want to know what I might need to look out for when using rainbow.js.

If you're talking about a node.js environment which is executed on the server - well, yes, that might be a complete other topic, but who is using node.js when beeing concerned about security issues, right? ;-)

joshgoebel commented 4 years ago

I'm talking about the HTML environment on the client. I think I covered it with my analogy earlier.

Just because your gold is safe on your front porch doesn't mean putting a huge stack of gold on front porches is a good idea for most people - and perhaps even you will get your gold stolen one day. :-) Same reason you don't run with knives... no one means to trip and impale themselves and die, yet every year it happens to someone. :-)

What if I want to highlight this code:

<script>
alert("This is ONLY to be highlighted, it should NOT actually show an alert in my browser.");
</script>

You can't UNLESS you properly escape the HTML. That is the correct way to do it.

joshgoebel commented 4 years ago

Thankfully there are no examples in the README showing this risky behavior. Hopefully the author will come around eventually and dissuade you. :-)

joshgoebel commented 4 years ago

Here the author is saying the same ting I'm telling you: https://github.com/ccampbell/rainbow/issues/244 :-)

And more issues:

It's like half the issues are about people doing this wrong and the pain as a result.

Neohiro79 commented 4 years ago

What if I want to show someone this snippet:

I just use the "<code>" element, very simple:

<code>

</code>

and hightlight it properly with a "hightligher-script".


@EDIT:

BUT IT DOES see: https://jsfiddle.net/neohiro79/zbfdhxjy/14/


And since I've edited it in here in git, I needed to use those `backticks ` for the <code> element to proper show up, since I used a form-field to "send" it to the server to "show" it to you.

Then, just a personal guess, github is using the `backticks` to properly address those chars with htmlentities / php on the backend once the "comment" button has been pressed to properly "recode" those < > special chars into something which cannot be used in a bad manner.

That's how it should be done. And the "highlighter script" - in my humble opinion - has nothing to do with this. Or should have. At least of what I am aware of.

joshgoebel commented 4 years ago

That is not safe either:

<code>
<script> alert("This is ONLY to be highlighted, it should NOT actually show an alert in my browser."); </script>
</code>
joshgoebel commented 4 years ago

github is using the backticks to properly address those chars with htmlentities / php on the backend once the "comment" button has been pressed to properly "recode" those < > special chars into something which cannot be used in a bad manner.

Yes because if they didn't they would have major security problems.

joshgoebel commented 4 years ago

I am still waiting for the example how rainbow.js is going to be able to "destroy" my website ...

The gold on your porch might be safe... for now... but good luck. :) Because a lot of peoples gold is getting stolen off porches... I'll keep my gold elsewhere.

joshgoebel commented 4 years ago

FYI: Unrelated You use three backticks for multi-line code blocks on GitHub.

This is a code block.

Neohiro79 commented 4 years ago

I think I found out what you're talking about:

https://jsfiddle.net/neohiro79/zbfdhxjy/14/

This is the concern you have, right?

The browser executes Javascript-Code even when it is written in a code-element, that's the point?

Neohiro79 commented 4 years ago

Yes, that's something I wasn't aware of yet, I really thought javascript code inside a codeblock would NOT be executed by default. What a fuckup.

But even then, if I "use" javascript-code to be highlighted, I only need to address this javascript "<script>" tag - no other tags, since Javascript won't execute outside of those script-tags.

Even more, why is it then, that close to all highlighters I tested, only HTML comments made "troubles" - not "CSS" and not "Javascript"?

I will look into this topic further, I hope we can keep up this conversation once I figured it out.

joshgoebel commented 4 years ago

I only need to address this javascript " Githubissues.

  • Githubissues is a development platform for aggregating issues.