localhost + port isn't safe

cwebber commented 6 years ago

We had to fix a similar issue in Guile last year. Localhost + port isn't actually safe in that it's possible for the browser to act as a "confused deputy" so that lines of commands can be sent to the listening REPL, thus allowing code execution.

cwebber commented 6 years ago

The best solution is to use a unix domain socket (basically connecting through a file which acts as a socket) rather than localhost + port. That should work for GNU/Linux and OSX users, but not sure what to do for Windows users. At the least the old behavior could be left as a fallback, though it might leave Windows users vulnerable.

greghendershott commented 6 years ago

Thanks for opening this issue!

(Note: I edited the first link URL to point to https://lists.gnu.org/archive/html/guile-user/2016-10/msg00007.html which you'd mentioned on IRC, instead of being a dupe of the second link.)

A few quick thoughts:

The "wire protocol" is s-expressions not line-oriented. This might alter the exposure? (But I'm not necessarily saying this reduces it. And I certainly wouldn't claim this eliminates it).
Using domain sockets might be slightly challenging because:
- racket-mode currently supports Racket versions as old as 6.0. (Although that could be increased, some Linux distributions have quite old versions, so I like to try to support them.)
- racket-mode is delivered as an Emacs MELPA package, not as a Racket package. It tries to avoid deps on anything except the drracket-lib package. (It can optionally use some packages by using dynamic-require with some fallback if not found.)
- Racket doesn't "natively" support domain sockets. Looking now, I see there is a unix-domain-sockets package. Great! Unfortunately its info.rkt seems to require Racket 6.12.

So. This may need to be addressed in multiple facets and stages. I'm not sure when it could be out-of-box secure-by-default for most or even many users. 😦 But we could make it better soon, and more-better later.

greghendershott commented 6 years ago

p.s. I neglected to include the obligatory, "At racket-mode corporation, we take security very seriously....".

cwebber commented 6 years ago

The "wire protocol" is s-expressions not line-oriented. This might alter the exposure? (But I'm not necessarily saying this reduces it. And I certainly wouldn't claim this eliminates it).

Yes, I even tried attacking it by opening up a connection with telnet, but I didn't actually know what protocol it used at all :) I tried simply typing some expressions then enter, but it did nothing.

But I don't actually know what the wire format is... maybe if you can point me to it I can more easily test.

greghendershott commented 6 years ago

Note that the connection model is not like HTTP, where the server will accept multiple TCP connections at once, and there is a single request/response per connection.

Instead, the command server accepts just one TCP connection at a time. It reads and writes Emacs Lisp s-expressions. The request sexpr has a unique ID consed on the front. The unique ID is consed back to the response. So multiple commands can be "in flight" at once -- multiple requests waiting for responses -- but over just the one connection that Emacs opened.[^1]

The whole server edge is here. It's a lot of comments (hopefully helpful?) then just a couple dozen lines of Racket code.

These are the various specific commands. Note that here the initial ID is stripped off (it's the cdr of what came over the wire.)

[^1]: The fact that the command server accepts only one connection at a time might mean the exposure is greatly reduced. There's a window when the Racket back end is starting, and the Emacs front end hasn't yet connected. An attack could connect instead? But thereafter, probably not?

greghendershott commented 6 years ago

Not sure if GitHub sends a new email to you for each edit? If not: I edited the URL in the first link for the server "edge" code. Should be: https://github.com/greghendershott/racket-mode/blob/master/cmds.rkt#L119-L175

greghendershott commented 6 years ago

In a shell you can do racket run.rkt 55557 '(run nil 2048 nil low nil)' to run the back end server listening on 55557.

In another shell you can use nc to connect, type command requests, and see responses.

$ nc 127.0.0.1 55557
(1 path+md5)
(1 ok (top . ""))
(2 def "display")
(2 ok kernel)
(3 def "displayln")
(3 ok ("/Applications/Racket_v6.10/collects/racket/private/misc.rkt" 187 10))
(4 describe "display")
(4 ok "<div><span style=\"color: #C0C0C0\"><i>procedure</i><quote>&nbsp;</quote>Provided from: racket/base, racket | Package: base</span><div class=\"SIntrapara\"><span><table cellpadding=\"0\" cellspacing=\"0\" class=\"boxed RBoxed\"><tbody><tr><td><span><div><span class=\"RktPn\">(</span><span></span><span><i><b>display</b></i></span><span class=\"hspace\">&nbsp;</span><i>datum</i><span class=\"hspace\">&nbsp;</span>[<i>out</i>]<span class=\"RktPn\">)</span><span class=\"hspace\">&nbsp;</span>&rarr;<span class=\"hspace\">&nbsp;</span><i><span>void?</span></i></div></span></td></tr><tr><td><span class=\"hspace\">&nbsp;&nbsp;</span><i>datum</i><span class=\"hspace\">&nbsp;</span>:<span class=\"hspace\">&nbsp;</span><i><span>any/c</span></i></td></tr><tr><td><span class=\"hspace\">&nbsp;&nbsp;</span><i>out</i><span class=\"hspace\">&nbsp;</span>:<span class=\"hspace\">&nbsp;</span><i><span>output-port?</span></i><span class=\"hspace\">&nbsp;</span>=<span class=\"hspace\">&nbsp;</span><span class=\"RktPn\">(</span><i><span>current-output-port</span></i><span class=\"RktPn\">)</span></td></tr></tbody></table></span></div><div class=\"SIntrapara\">Displays <i>datum</i> to <i>out</i>, similar to <i><span>write</span></i>,\nbut usually in such a way that byte- and character-based datatypes are\nwritten as raw bytes or characters. If <i>out</i> has a handler\nassociated to it via <i><span>port-display-handler</span></i>, then the handler\nis called. Otherwise, the <span>default printer</span> is used (in <i><span>display</span></i>\nmode), as configured by various parameters.</div><p>See <span>The Printer</span> for more information about the default\nprinter. In particular, note that <i><span>display</span></i> may require memory\nproportional to the depth of the value being printed, due to the\ninitial cycle check.</p><p></p></div>")
(5 exit)
$

So that's a taste of the protocol.

Let's say racket-mode is already running a live REPL. That uses 55555 by default. A connection is open. Now if you try say $ nc -G 3 127.0.0.1 55555 it will timeout after 3 seconds.

I'm not saying everything is fine. But it's possibly not as bad as it could be.

greghendershott commented 6 years ago

Now if you try say $ nc -G 3 127.0.0.1 55555 it will timeout after 3 seconds.

Actually if you use the code on master then nc will just sit there, getting no response, as opposed to clearly timing out. That's because tcp-listen is getting 4 for the max-allow-wait arg. The example I gave above, I had tried setting the value to 1. (Thinking it might be belt+suspenders, in addition to what it already does, a single tcp-accept at a time.)

One thing I do want to change is have the tcp-listen hostname argument be 127.0.0.1 instead of #f.

cwebber commented 6 years ago

I see, thank you for the detailed explaination! I'll have to give it a try later, but it only accepting one connection at a time probably limits the severity of this to a much narrower race condition. Probably something good to fix, but not crazy urgent.

greghendershott commented 6 years ago

As for the sort of mitigation done in that Guile commit, I think the racket-mode server effectively already has that. If it gets a non-sexpr, it closes the connection.

Here's a transcript where the (1 path+md5) request gets a (1 ok (top . "")) response -- but a GET / HTTP/1.1 "command" gets disconnected.

$ nc 127.0.0.1 55557
(1 path+md5)
(1 ok (top . ""))
GET / HTTP/1.1
$

Having said that, it would be good to add a comment here explicitly stating that it's desirable to drop the connection and "see issue #327". So that someday no one goes to "improve" this by making it "more robust".

greghendershott commented 6 years ago

Using domain sockets is safe because that's not something JavaScript code in a browser can do.

What if we accept a normal TCP connection, then challenge the client to do something that JavaScript in a browser cannot do. Is one such thing, writing to an arbitrary local file? If so, could we use that?

For example:

The server sends an opening "banner" sexpr (list (make-temporary-file) (current-seconds)). For instance that might end up being the text '("/tmp/rkttemp1234" 1535661796).
The client must
- read that from TCP
- write it to the specified file
- write it back over TCP
The server
- reads that from TCP, checks that it matches
- reads from the file, checks that matches

Any mismatch or discrepancy, and the server closes the connection.

greghendershott commented 6 years ago

I have a simpler and (I think) stronger proposal.

Assumption: Although a browser enables a remote attacker to use a local network port, it does not enable a remote attacker to run arbitrary programs locally. (If it could, this issue is the least of our worries, right?)
The only valid usage of the command server is when racket-mode's Emacs launches it -- i.e. uses make-comint to do racket run.rkt <params>.
Therefore, the Emacs front-end must pass another command-line argument when it launches the back-end: (list 'auth random-value). Given the threat model, I think it's OK for random-value to come from Emacs' (random) instead of a CSPRNG.
After something connects to the command server, it must send that same s-expression. Unless the launch and connect values match, the server exits. (If they match, the server now accepts commands for the duration of the connection.)

TL;DR: "You may connect if you can demonstrate you are the same entity that launched me."

Thoughts from anyone?

greghendershott commented 6 years ago

Thinking more: The timing business doesn't add any actual value. I edited the previous proposal in GitHub.

cwebber commented 6 years ago

That proposal sounds like it would work to me.

greghendershott commented 6 years ago

The branch with commit f9712b2 -- connection attempts must supply the same "token" used to launch the server process -- is now merged to master.

Does this close the issue?

On one hand, I'd say "yes" because I don't see anything more left to do. I think this does a reasonable job of reducing the threat as I understand it.
On the other hand, marking any security issue as "Fixed!! :tada:" feels like tempting fate.

So I'm going to leave this open for now. (Someday, if there's a new specific variation of this threat? Then I might close this issue, and make a new issue that can benefit from a fresh, focused comment thread.)

greghendershott commented 5 years ago

As a p.s., I think I may have re-invented the X server cookie solution to this problem. :)

greghendershott commented 4 years ago

Just an update that some details have changed -- but I don't believe this changes the substance or risk.

Since commit 417237c a couple months ago, TCP is used for REPL I/O (as opposed to command I/O). I think this change is a wash:
- Either way could execute arbitrary code (e.g. one command is an eval).
- The TCP connection still requires supplying the random authorization code, to demonstrate that the thing making the connection is the same thing that started the back end process.
Since commit 304acf1 a couple weeks ago, the back end picks an available so-called ephemeral TCP port. As a result the port number is less predictable, which is some small improvement -- but certainly not "random" in any strong sense.

NightMachinery commented 3 years ago

Can't the browser be made to ban any connections to the local subnet from outside of it? This seems the only real solution.

NightMachinery commented 3 years ago

It seems the browsers have indeed already solved this issue, as far as I can tell:

https://developer.chrome.com/blog/private-network-access-update/

Private Network Access (formerly known as CORS-RFC1918) restricts the ability of websites to send requests to servers on private networks. It allows such requests only from secure contexts. The specification also extends the Cross-Origin Resource Sharing (CORS) protocol so that websites now have to explicitly request a grant from servers on private networks before being allowed to send arbitrary requests.

Private network requests are requests whose target server's IP address is more private than that from which the request initiator was fetched. For example, a request from a public website (https://example.com) to a private website (http://router.local), or a request from a private website to localhost.

greghendershott commented 3 years ago

Although I'm always happy to review any security issue, I think the status quo situation wrt Racket Mode is good:

The TCP port number is "ephemeral", chosen by the OS. Of course this alone isn't "random" enough to be sufficient; more importantly...
The back end will terminate a TCP connection unless the client supplies the same token given to the back end when it was launched. (If an attacker can launch programs, that's the more basic vulnerability.) I think this is sufficient; but also...
Even if a client managed to write such a token, the TCP connection uses an s-expression protocol; sending an HTTP request causes the connection to be closed.

But nothing in security is absolute. For example the token in item 2 might be guessable. So although I believe Racket Mode isn't vulnerable to the kind of generic "confused deputy" HTTP attacks that opened this issue, it certainly could vulnerable to a tailored attack.

greghendershott commented 9 months ago

This is moot ever since commit 2522488 as of which we no longer use a TCP connection.

greghendershott / racket-mode

localhost + port isn't safe #327