haskell / play-haskell

Haskell Playground

Playground API #14

Closed by Kleidukos 1 year ago

Kleidukos commented 1 year ago

We could offer an API, loosely modelled on the Rust Playground's, for submitting code to be evaluated and getting the result back.

The payload would look like this:

{ "code": "…",
  "version": "ghc-9.6",
  "optimize": "0 | 1 | 2",
  "output": "stdout | core | asm | js | wasm (last two when applicable)"
}

Usage of the API would count towards the spam score.
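To make the proposed payload concrete, here is a minimal Python sketch of how a server might validate it. This is only an illustration of the proposal above, not an existing playground API: the class name `PlaygroundRequest`, the defaults, and the choice to treat `optimize` as an integer are all assumptions.

```python
from dataclasses import dataclass

# Output kinds listed in the proposal; "js" and "wasm" only "when applicable".
OUTPUTS = ("stdout", "core", "asm", "js", "wasm")


@dataclass(frozen=True)
class PlaygroundRequest:
    """Hypothetical in-memory form of the proposed payload."""

    code: str
    version: str          # e.g. "ghc-9.6", as in the proposal
    optimize: int = 0     # assumed to be an integer 0, 1 or 2
    output: str = "stdout"

    def __post_init__(self):
        if self.optimize not in (0, 1, 2):
            raise ValueError(f"optimize must be 0, 1 or 2, got {self.optimize!r}")
        if self.output not in OUTPUTS:
            raise ValueError(f"unknown output kind: {self.output!r}")


def parse_request(payload: dict) -> PlaygroundRequest:
    """Build a request from a decoded JSON object, rejecting unknown keys."""
    unknown = set(payload) - {"code", "version", "optimize", "output"}
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return PlaygroundRequest(**payload)
```

Rejecting unknown fields up front is one possible design choice; a more lenient server could ignore them instead.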

tomsmeding commented 1 year ago

To a certain extent this API already exists, but not quite in the form you describe:

$ curl https://play-haskell.tomsmeding.com/versions
["8.6.5","8.8.4","8.10.7","9.0.2","9.2.5","9.4.4","9.6.0.20230128"]

$ curl https://play-haskell.tomsmeding.com/challenge
feiSjSVfmvzg

$ curl -d '{"challenge":"feiSjSVfmvzg","source":"main = print 42","version":"9.4.4","opt":"O1"}' https://play-haskell.tomsmeding.com/compile/run
{"ec":0,"ghcout":"","sout":"42\n","serr":"","timesecs":1.555203842}

$ curl -d '{"challenge":"feiSjSVfmvzg","source":"main = print 42","version":"9.4.4","opt":"O1"}' https://play-haskell.tomsmeding.com/compile/core
{"ec":0,"ghcout":"","sout":"\n==================== Tidy Core ... Main.$trModule1\n\n\n","serr":"","timesecs":1.832814564}
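The curl calls above could be wrapped in a small client along these lines. This is a sketch that only mirrors the requests and responses shown in this comment; the function names are hypothetical, and the set of response fields is inferred from the two example responses rather than from any documented schema.

```python
import json
import urllib.request

BASE_URL = "https://play-haskell.tomsmeding.com"  # host from the curl examples


def build_job(challenge: str, source: str, version: str, opt: str = "O1") -> bytes:
    """Encode the JSON body used in the /compile/run and /compile/core examples."""
    payload = {"challenge": challenge, "source": source, "version": version, "opt": opt}
    return json.dumps(payload).encode("utf-8")


def parse_result(body: bytes) -> dict:
    """Decode a response and check for the fields seen in the example responses."""
    result = json.loads(body)
    missing = {"ec", "ghcout", "sout", "serr", "timesecs"} - result.keys()
    if missing:
        raise ValueError(f"response is missing fields: {sorted(missing)}")
    return result


def submit(source: str, version: str, opt: str = "O1", endpoint: str = "run") -> dict:
    """Fetch a challenge, then POST a job. Performs network I/O."""
    with urllib.request.urlopen(f"{BASE_URL}/challenge") as resp:
        challenge = resp.read().decode("utf-8").strip()
    # Like `curl -d`, a urllib POST with a data body and no explicit headers
    # is sent with a form-urlencoded Content-Type.
    req = urllib.request.Request(
        f"{BASE_URL}/compile/{endpoint}",
        data=build_job(challenge, source, version, opt),
    )
    with urllib.request.urlopen(req) as resp:
        return parse_result(resp.read())
```

A call such as `submit("main = print 42", "9.4.4")` would correspond to the first `compile/run` example, assuming the endpoints still behave as shown.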

The challenge refreshes every 24h; the point of that is to prevent stray POST requests from misconfigured crawlers. How high this risk actually is, I don't know, but I recall being recommended to do this by someone.
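For illustration, a challenge that rotates every 24 hours could be kept server-side roughly as follows. This is a Python sketch under assumptions, not the playground's actual implementation: the class name, the 12-character alphanumeric format (chosen to resemble the example value above), and the lazy regeneration on access are all hypothetical.

```python
import secrets
import string
import time

ROTATION_SECONDS = 24 * 60 * 60  # the 24h refresh interval mentioned above
ALPHABET = string.ascii_letters + string.digits


class RotatingChallenge:
    """Holds one shared challenge string and replaces it once it is too old."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so the rotation can be tested
        self._value = None
        self._issued_at = 0.0

    def current(self) -> str:
        now = self._clock()
        if self._value is None or now - self._issued_at >= ROTATION_SECONDS:
            # Assumed format: 12 random alphanumeric characters.
            self._value = "".join(secrets.choice(ALPHABET) for _ in range(12))
            self._issued_at = now
        return self._value

    def accepts(self, candidate: str) -> bool:
        # Constant-time comparison; only the current challenge is accepted.
        return secrets.compare_digest(candidate.encode("utf-8"),
                                      self.current().encode("utf-8"))
```

With this design a client holding the previous challenge is rejected right after a rotation and has to fetch a fresh one, which is harmless for a page that requests the challenge on load.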

Kleidukos commented 1 year ago

@divarvel @frasertweedale We could really use your input on the subject of the API challenge token

frasertweedale commented 1 year ago

It doesn't give you any defence against a threat actor, but maybe it's reasonable for thwarting wayward crawlers? No idea what the most reasonable rotation delta would be, though.

tomsmeding commented 1 year ago

I don't think it's even possible to defend against threat actors here; the playground web interface needs to be able to start jobs, and anything that the website can do, anyone can do. So indeed the only point is to prevent misconfigured crawlers from submitting wayward jobs.

I guess if no-one has had issues with this happening then it isn't important to have this "protection".

divarvel commented 1 year ago

Yeah, if the token delivery API does not perform any checks whose results need to be remembered across requests, I don't see the value, other than preventing crawlers from triggering jobs (but the same effect could be achieved in various ways).

One possible improvement would be to have something akin to CSRF protection. This would tie code execution requests to the actual page that displayed the code input.

tomsmeding commented 1 year ago

@divarvel What would CSRF protection actually do here? There is no concept of an account here, so I don't see what the difference would be with the current "challenge" setup. But I'm probably missing something because I have zero experience with CSRF tokens. :)

divarvel commented 1 year ago

It would not be a "real" CSRF token in the sense that it would not be there to prevent cross-site requests, but the implementation would be close to how CSRF tokens are usually implemented: the HTML payload contains a token that is used in requests emitted from the page. So yeah, since there is no concept of a user, the only thing this achieves is making it harder to send a code execution request without actually displaying and interpreting the page.

Since there is no authentication phase, any solution using tokens will only be there to make automated calls harder, but won't give any proper security guarantee.
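A page-embedded token of the kind described above could be sketched like this in Python. Everything here is a hypothetical illustration (the names `issue_token` and `verify_token`, the HMAC-over-timestamp format, the one-hour lifetime); as noted, it only raises the bar for automated calls and gives no real security guarantee, since anyone can fetch the page and read the token.

```python
import hashlib
import hmac
import secrets
import time

SERVER_KEY = secrets.token_bytes(32)  # assumed per-process secret
MAX_AGE_SECONDS = 3600                # assumed token lifetime


def issue_token(now=None, key=SERVER_KEY) -> str:
    """Create a token to embed in the HTML page that shows the editor."""
    ts = str(int(time.time() if now is None else now))
    mac = hmac.new(key, ts.encode("ascii"), hashlib.sha256).hexdigest()
    return f"{ts}.{mac}"


def verify_token(token: str, now=None, key=SERVER_KEY) -> bool:
    """Check that a token was issued by this server and has not expired."""
    ts, sep, mac = token.partition(".")
    if not sep or not (ts.isascii() and ts.isdigit()):
        return False
    expected = hmac.new(key, ts.encode("ascii"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac.encode("utf-8"), expected.encode("ascii")):
        return False
    current = time.time() if now is None else now
    return 0 <= current - int(ts) <= MAX_AGE_SECONDS
```

Because the token is a signed timestamp, the server does not need to remember which tokens it handed out, which also means it cannot tell two clients apart.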

In that sense, IP-address based quotas would bring more guarantees (but not much against the aforementioned misbehaving crawlers, since they tend to already use IP address pools to evade restrictions anyway).
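An IP-address based quota can be as simple as a sliding window per address. The following Python sketch is a generic illustration with assumed limits and a hypothetical class name; it is not a description of the spam protection the playground already has.

```python
import time
from collections import defaultdict, deque


class IpQuota:
    """Allow at most `max_jobs` jobs per IP address within a sliding window."""

    def __init__(self, max_jobs=10, window_seconds=60.0, clock=time.monotonic):
        self._max_jobs = max_jobs
        self._window = window_seconds
        self._clock = clock               # injectable for testing
        self._hits = defaultdict(deque)   # ip -> timestamps of accepted jobs

    def allow(self, ip: str) -> bool:
        now = self._clock()
        hits = self._hits[ip]
        # Forget jobs that have fallen out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max_jobs:
            return False
        hits.append(now)
        return True
```

A crawler that spreads its requests over a pool of addresses gets a separate window for each address, which is exactly the weakness pointed out above.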

My suggestion would be to spend effort on monitoring first, and then put restrictions in place that protect against actually observed issues.

tomsmeding commented 1 year ago

> My suggestion would be to spend effort on monitoring first, and then put restrictions in place that protect against actually observed issues.

I guess this is the right answer. I do have IP-address based spam protection already, so I guess it's unclear whether the current challenge system actually solves anything.

Now the monitoring could also be improved...

Anyway, the conclusion is that an API as specified by @Kleidukos would be fine.

tomsmeding commented 1 year ago

With https://github.com/tomsmeding/play-haskell/commit/e85dd89ba2f4a31b42d5a66c4dcf79a125a20e63 and https://github.com/tomsmeding/play-haskell/commit/989cdc3ee2cd56f3b066d52eecad6dfe17f5c6db I believe this issue can be closed. @Kleidukos?

Kleidukos commented 1 year ago

@tomsmeding yes