golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

proposal: cmd/go: permit go.mod to specify capabilities for dependencies #50632

Closed cugu closed 2 years ago

cugu commented 2 years ago

Background

Recently, various attacks (e.g. typosquatting, dependency confusion) and other incidents (e.g. npm colors and faker, log4j) have highlighted a severe problem with dependencies:

Every imported package gives that package's author the same capabilities on the system as the "parent" software. This includes remote code execution within your software.

A video on WASI gave me the idea of how beneficial it would be for security if we could pass capabilities to dependencies.

Proposal

Give capabilities for critical actions (e.g. file access, execution, network access, ...) to dependencies to reduce the possible attack surface for malicious dependencies.

In a go.mod file this could look something like this:

module github.com/cugu/mymodule

go 1.20

require (
    github.com/go-chi/chi       v5.0.7   (networkRead)
    github.com/mattn/go-sqlite3 v1.14.10 (fileRead, fileWrite)
    github.com/sirupsen/logrus  v1.8.1   (osStdout)
    github.com/yuin/goldmark    v1.4.4   ()
)

chi would be able to receive network requests, go-sqlite3 would be able to read and write files, and logrus could write to stdout. All of those modules would also be limited to those capabilities; for example, the logging library logrus would not be able to interact with files or the network, or execute code.
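
To make the proposed syntax concrete, here is a minimal sketch of how one such require line could be split into module path, version and capability list. The `Requirement` type and `parseRequirement` function are hypothetical names made up for illustration, and the parenthesised capability list is the syntax proposed above, not something today's go.mod accepts.

```go
package main

import (
	"fmt"
	"strings"
)

// Requirement is a hypothetical representation of one entry in the
// capability-annotated require block proposed above.
type Requirement struct {
	Path         string
	Version      string
	Capabilities []string
}

// parseRequirement parses a line such as
//
//	github.com/mattn/go-sqlite3 v1.14.10 (fileRead, fileWrite)
//
// This syntax is an assumption taken from the proposal; it is not valid
// in a real go.mod file.
func parseRequirement(line string) (Requirement, error) {
	line = strings.TrimSpace(line)
	open := strings.Index(line, "(")
	if open < 0 || !strings.HasSuffix(line, ")") {
		return Requirement{}, fmt.Errorf("missing capability list in %q", line)
	}
	fields := strings.Fields(line[:open])
	if len(fields) != 2 {
		return Requirement{}, fmt.Errorf("want module path and version in %q", line)
	}
	req := Requirement{Path: fields[0], Version: fields[1]}
	// An empty list "()" grants no capabilities at all.
	for _, c := range strings.Split(line[open+1:len(line)-1], ",") {
		if c = strings.TrimSpace(c); c != "" {
			req.Capabilities = append(req.Capabilities, c)
		}
	}
	return req, nil
}

func main() {
	lines := []string{
		"github.com/mattn/go-sqlite3 v1.14.10 (fileRead, fileWrite)",
		"github.com/yuin/goldmark    v1.4.4   ()",
	}
	for _, l := range lines {
		req, err := parseRequirement(l)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s@%s capabilities=%v\n", req.Path, req.Version, req.Capabilities)
	}
}
```

Requiring the parentheses even when the list is empty keeps "no capabilities" an explicit choice rather than something that happens by omission.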

There are two options for what happens when a capability is missing:

Option A: Fail to compile

Go could fail to build when the capabilities are not met. In this case older dependencies would force you to just accept their required capabilities.

Option B: Fail at runtime

Go could also fail at runtime and e.g. a call to os.Open without the fileRead capability would result in an error.

Rationale

Malicious dependencies would be much less critical in many cases as a potential attacker would have only limited attack surface besides stealing your CPU cycles.

Also package maintainers would be more aware of critical imports and might reduce those.

Open issues

For included C or assembler code it would be hard to detect fine-grained capabilities.

deanveloper commented 2 years ago

Would there be a way for modules to specify which permissions they need? It would be quite annoying to install modules that need permissions without a feature like this.

Also, consider the following code:

package mod1

func Foo(f func()) {
    f()
}

// ========
package main

import "mod1"
import "os"

func main() {
    mod1.Foo(func() { os.Open("file.txt") })
}

Would the "mod1" module need fileRead permissions in order for this to work? Also, certainly something like this couldn't be caught at compile-time. I think there'd need to be runtime checks, along with go vet or something which can flag potential permissions issues.

cugu commented 2 years ago

Would there be a way for modules to specify which permissions they need?

No, the idea is to have a fixed set of capabilities and infer them automatically.
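
One very coarse way such inference could start is from a package's import list. The sketch below uses the standard go/parser package to read the imports of a source file and maps them to capability names. The `importCapabilities` table and the `inferCapabilities` function are hypothetical, the mapping is deliberately incomplete, and the capability name `execute` is an assumption (only networkRead, fileRead, fileWrite and osStdout appear above). It also ignores transitive imports, cgo, assembly and unsafe.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"sort"
	"strconv"
)

// importCapabilities is a hypothetical and incomplete mapping from
// standard library import paths to capability names.
var importCapabilities = map[string][]string{
	"os":       {"fileRead", "fileWrite"},
	"net":      {"networkRead"},
	"net/http": {"networkRead"},
	"os/exec":  {"execute"},
}

// inferCapabilities returns the sorted capability names implied by the
// direct imports of a single Go source file.
func inferCapabilities(src string) ([]string, error) {
	fset := token.NewFileSet()
	// ImportsOnly stops parsing after the import declarations.
	file, err := parser.ParseFile(fset, "src.go", src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	seen := map[string]bool{}
	for _, imp := range file.Imports {
		// imp.Path.Value is the quoted import path, e.g. "\"os\"".
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		for _, c := range importCapabilities[path] {
			seen[c] = true
		}
	}
	caps := make([]string, 0, len(seen))
	for c := range seen {
		caps = append(caps, c)
	}
	sort.Strings(caps)
	return caps, nil
}

func main() {
	src := `package upload

import (
	"net/http"
	"os"
)
`
	caps, err := inferCapabilities(src)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(caps)
}
```

An import-level check like this over-approximates: importing os only to read an environment variable would still be reported as needing file access.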

Would the "mod1" module need fileRead permissions in order for this to work?

Interesting example. IMHO: no, mod1 would not need fileRead capabilities, but main would. Passing functions, interfaces, or types with methods to a dependency would then also mean passing its capabilities. This can be easily detected at compile time. Passing the capability to mod1 is something that linters could flag as a possible security issue.

ianlancetaylor commented 2 years ago

Any protection scheme needs a clear definition of the threat model it protects against.

It's important to observe that the nature of Go, and the possibility of creating race conditions, and the fact that all Go programs must make a range of system calls, means that malicious code can do just about anything even if capabilities are not granted to it. That is, the Go language and tools do not support real protection against other code running in the same address space.

So it seems to me that the threat model here is something like accidental use of networking facilities by a module that is not intended to use networking. Or something like that. Is that a real problem that needs to be addressed?

cugu commented 2 years ago

Any protection scheme needs a clear definition of the threat model it protects against.

The proposed measure would help with security (intentional attacks) as well as safety (vulnerabilities).

intentional attacks

An intentional attack consists of compromising a dependency and injecting code through it. The result would be arbitrary code execution in the context of the main program.

It's important to observe that the nature of Go, and the possibility of creating race conditions, and the fact that all Go programs must make a range of system calls, means that malicious code can do just about anything even if capabilities are not granted to it.

Don't all dependencies need to go through the standard library to perform system calls? The only exceptions here would be the use of C and assembler code, which can be detected as well.

vulnerabilities

So it seems to me that the threat model here is something like accidental use of networking facilities by a module that is not intended to use networking. Or something like that. Is that a real problem that needs to be addressed?

That is pretty much what happened with the log4shell vulnerability. A logging library had the capabilities for arbitrary code execution. Money quote from the Wikipedia article: "Experts described Log4Shell as the largest vulnerability ever".

mvdan commented 2 years ago

I'm not sure how useful such restrictions will be unless they are well supported by operating systems. For instance, on Linux, having access to the filesystem already gives you access to practically everything; you've got lots of devices under /dev and you can inspect the state of the system via /proc and /sys.

cugu commented 2 years ago

I'm not sure how useful such restrictions will be unless they are well supported by operating systems. For instance, on Linux, having access to the filesystem already gives you access to practically everything; you've got lots of devices under /dev and you can inspect the state of the system via /proc and /sys.

True, this is why I consider file access a capability. Most packages do not need file access and also should not have file access.

mvdan commented 2 years ago

I'm not sure I agree that most modules won't need file access in practice. I think the majority of modules will fall into at least one of these categories:

Here's an example: https://pkg.go.dev/net/http#Request.ParseMultipartForm

ParseMultipartForm parses a request body as multipart/form-data. The whole request body is parsed and up to a total of maxMemory bytes of its file parts are stored in memory, with the remainder stored on disk in temporary files.

thepudds commented 2 years ago

Hi @cugu, to expand slightly on the data race piece of Ian’s point:

It's important to observe that the nature of Go, and the possibility of creating race conditions, and the fact that all Go programs must make a range of system calls, means that malicious code can do just about anything even if capabilities are not granted to it.

If you haven’t already read this on how data races can be exploited by malicious code, this is a classic that is well worth the read:

https://research.swtch.com/gorace

Go previously supported nacl, a sandbox for native code, but it was removed in part because it was not effective against modern techniques used by malicious code (in addition to reasons around maintenance burden and so on):

https://github.com/golang/go/issues/30439

deno is some prior art for a runtime optionally restricting access to things like the network or the file system:

https://deno.land/manual@v1.17.3/getting_started/permissions

where part of the way that is achieved is multiple layers of abstraction between the code that’s running and something like a syscall, which has architectural implications and performance implications:

https://docs.google.com/presentation/d/1LYNGpyjx9PemL-P__7hVC8mSqkX-jL8VQLMhCRehy00/edit?usp=sharing

In any event, my main point is doing something meaningful here against malicious code I think would need much more than you outlined.

cugu commented 2 years ago

@thepudds thanks for the elaboration and the links!

I did not know how data races could be abused, and the article is a good read on the whole topic. This is the first example I have seen where the standard library does not need to be involved and which would be hard to detect.

Also, it's interesting that deno has a similar feature, though it looks to be only for the complete software and not for single dependencies, if I understand it correctly.

I think supply chain attacks are here to stay and it would be great if programming languages provide measures against those.

In any event, my main point is doing something meaningful here against malicious code I think would need much more than you outlined.

I'm interested in other people's take on this ;-)