google / starlark-go

Starlark in Go: the Starlark configuration language, implemented in Go
BSD 3-Clause "New" or "Revised" License

Question: Prevent long-running scripts #160

Closed: bradrydzewski closed this issue 4 years ago

bradrydzewski commented 5 years ago

Hi there. Is it possible to constrain execution time and resources? I am interested in embedding Starlark in a server application that processes user-defined scripts, and I was wondering whether there are steps that could be taken to mitigate long-running or malicious code.

For example to prevent something like this:

def loop():
  x = ''
  for i in range(1, 100000000):
    x = x + 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do ...'

loop()

alandonovan commented 5 years ago

> Is it possible to constrain execution time and resources?

Not within the same process. It is trivial for a Starlark program to consume all time and memory, and to cause stack overflow. Even without recursion enabled, it's possible to get the interpreter stuck in an endless loop, for example by creating a cyclic value such as a list that contains itself and then applying built-in recursive operators to it. The str(x) and x==y operators try to defend against this, but the defense is imperfect, and there are likely other operators, including application-defined ones, that are not so defensive.

There are also likely to be ways in which the author of a Starlark script can cause a panic in the interpreter, or worse, a fatal error. I regard all of these as bugs, but they may be hard to find and hard or even impossible to fix. For example, imposing constraints on memory would essentially require a completely separate memory allocator and garbage collector for the Starlark heap, which would complicate the API and implementation and make it much less useful as an embeddable language.

In short, if you're designing a networked server that accepts Starlark scripts from clients, you are trusting the health of your server to those clients. So this might be a reasonable design for dynamically updatable configuration (e.g. a request filter) where the scripts and the server code are written by the same author, but not for any scenario with potentially hostile clients.

The best way to defend against all these problems is to use a separate POSIX process for the interpreter. Of course, that has overheads (no sharing of memory) but the interpreter starts very quickly, and you can easily bound the time (by having the parent kill it after a timeout) and memory (using ulimit) of the process, and if it crashes or fatals you can simply clean up and move on.

This is a great question. I should add this to a list of FAQs.

tv42 commented 4 years ago

Related and/or duplicate: canceling execution via a Context (or other mechanism): https://github.com/google/starlark-go/issues/236

alandonovan commented 4 years ago

FWIW, we are likely to add support for per-thread limits on virtual instruction counts, so an application can bound the number of abstract computation steps done by a program. The (unspecified) measure won't exactly correlate with CPU time, but it has the virtue of being deterministic and reproducible.

alandonovan commented 4 years ago

Fixed. See Thread.SetMaxExecutionSteps.