Open karalabe opened 7 years ago
This is just a drive by comment to add another alternative to the section that begins:
From Go's perspective, a possible solution...
Equally we could continue to support unicode import paths but enforce that the user acknowledge the punycode equivalent:
package main
import (
"fmt"
"gοlang.org/x/crypto/ssh/terminal" // xn--glang-rce.org/x/crypto/ssh/terminal
)
func main() {
fmt.Println("Please enter a secret:")
terminal.ReadPassword(1)
}
As I said, this comment is not per se a vote one way or the other...
A total aside, GMail routed your original email to the Spam folder @karalabe with some obscure message that I basically interpreted as "suspicious homographs"
Hah, nice :)
Btw, I think your comment proposal is really nice. Retains all current functionality, but protects users. We could also make gofmt expand to that by default and then problem's solved.
The gofmt
expansion idea strikes me as a very reasonable solution.
See also #20209. TL;DR, I think we should solve this in code review tools, not the language.
On hold for #20209.
As per #20209, the fix might be in the tools instead of the language. For example, we could prohibit (not merely swap out) "go get" from following IDNs (Internationalized Domain Names).
I would then invert the suggestion above so that the 'pretty' IDN would be in the (optional) comment, and the unambiguous ASCII name be in the import path. For example:
import "xn--glang-rce.org/x/crypto/ssh/terminal" // gοlang.org/x/crypto/ssh/terminal
What is/was the outcome of this 🧵? I would love to use proper emoji-only (no letter mixing) domains like http://🦭.st in an import path...
If you take a look at the following snippet https://play.golang.org/p/SpXVFRLUlX, it will look completely benevolent. It just pulls in the terminal package from the extended stdlib and read the user's password.
If you copy paste this snippet into a file and try to run it, you will get an error message along the lines of
cannot find package "gοlang.org/x/crypto/ssh/terminal" in any of: [...]
. The standard "reaction" from a user to this error message will be either to rungo get
or perhapsgo get gοlang.org/x/crypto/ssh/terminal
. And this is where things can go horribly wrong.Thing is, the "gοlang.org" domain in the snippet's import path is not ASCII, rather Unicode. All modern URL libraries (including the one used by go and go get) will helpfully convert this URL to punycode, mapping the "gοlang.org/x/crypto/ssh/terminal" package silently to "xn--glang-rce.org/x/crypto/ssh/terminal". At this point, Go will go and fetch whatever package is at that import path, on a completely different domain that the user expected.
To fully demo this attack I've tried to actually register that domain, but the top registrars refused. There are plenty shadier ones that seem to be happy, however I'm not sure I want to give out my credit card to so many companies only to see if I can get the domain registered or not.
For the purpose of this demo, let's assume that I did manage to register it, and instead simulate my doing so by manually resolving xn--glang-rce.org to 130.211.147.254. You could do this by adding an entry to your hosts file. On Ubuntu this would be adding
130.211.147.254 xn--glang-rce.org
to /etc/hosts and possibly also enabling this in the network manager via addingaddn-hosts=/etc/hosts
to /etc/NetworkManager/dnsmasq.d/hosts.conf and doing asudo service network-manager restart
). Make surenslookup xn--glang-rce.org
works and results in 130.211.147.254 before continuing.Ok, now that we have our domain "registered", we could actually try to to run that playground snippet I provided via
go get --insecure && go run snippet.go
. Note I'm doing insecure since we only have a "simulated" domain, but a real one can have cloudflare or let's-encrypt in front of it, so the HTTPS part at this point is irrelevant.The program output will be:
It should be pretty obvious by now what happened here. Go get resolved silently my unicode domain into punycode, it reached out to wherever that was hosted and downloaded something that for every human looked like a trusted repository. In this particular case, I was running a vanity import path resolver at the above IP, which resolves this single import path to https://github.com/karalabe/mitm, a demo attack package that simulates a mitm attack for ssh passwords.
In my opinion this is a horribly dangerous social engineering attack. All it takes to break a project (open source are vulnerable the most), is to send a patch to a random project on GitHub, and beside fixing whatever to hide the attack, also helpfully reorg the imports, swapping one of the paths out with a custom homoglyph domain. The domain doesn't even have to attack immediately, it could just lay dormant there redirecting to the real import until someone decides to arm it.
This could allow arbitrary code to be injected at an arbitrary later point in time into a project without anyone being the wiser. For certain projects (e.g. Ethereum, which has its majority client written in Go) this is an end game scenario.
From Go's perspective, a possible solution against this attack vector could be to modify gofmt to swap out Unicode import paths to the final punycode variant
go get
would download anyway. This could provide a strong enough "hint" to developers that there's something very wrong with an import path without requiring on outside tooling. Beside this, I'd also venture to suggest that the compiler could be modified to reject any import paths that resolve into punycode but that are represented as Unicode in the source.On the down side of course, my suggestion would be equivalent to dropping support for all internationalized domain names, alas the question is whether it's worth compromising the entire ecosystem for fancy import URLs.
Apparently "golang.org" is not so easily spoofable since there's a limitation in IDN domains that only characters from a single character set may be used (at least for the popular TLDs) and the "g" character saves the day, being fairly unique to Latin. However there are other interesting domains that can be fully represented in Cyrillic for example which require a single character set and thus pass all domain verification (e.g. "огео.com" actually being "xn--c1aezc.com", free to register at godaddy https://www.godaddy.com/domains/searchresults.aspx?checkAvail=1&tmskey=&domainToCheck=xn--c1aezc.com).
Given this constraint, the attack surface is much much smaller than I originally anticipated (most Go packages are hosted on github, which should arguably be harder to spoof), but there's still a potential to break future vanity addresses (e.g. if oreo decides to release a Go package in 5 years). Then again https://www.аррӏе.com