iliana / rust-crowbar

Wrapper to simplify writing AWS Lambda functions in Rust (using the Python execution environment)
https://docs.rs/crowbar
Apache License 2.0

Get Travis to run full integration tests #14

Open iliana opened 7 years ago

iliana commented 7 years ago

Travis should compile and run all examples in AWS Lambda and verify output on pull requests.

This is, of course, questionable from an AWS account security perspective. Some thoughts on mitigation:

euank commented 7 years ago

Driveby thought dump:

An obvious problem to solve is storing the IAM secrets.

Travis

Travis has a concept of encrypted secrets. However, these are kept secure only by not being made available to builds for PRs that come from forks. I assume that rules them out entirely.

That's not to say Travis itself is ruled out, just relying on its encrypted secrets.

Jenkins

Jenkins has a slightly nicer concept that almost works but is a pain to set up. Jenkins secrets can be made available during PRs, while a set of files can be marked as "sensitive" and taken from master instead of from the PR. That set includes the Jenkinsfile (for obvious reasons), but it would also have to be configured to cover every other file executed at any point during stages that have access to the secrets. This would allow PRs to change the example code, but not, e.g., a bash script that uploads the code to Lambda and needs said secrets.

The above is meant as an aside, though, because I can't seriously recommend using Jenkins, and the behaviour described is finicky at best to configure.


I think an easier solution than either of the above, though, is simply not exposing IAM secrets at all. They have the unfortunate problem of, well, being secrets. An alternative would be to not have a Travis IAM user at all. Instead, you could have some magic endpoint that holds the IAM credentials and takes a lambda tarball to run; it runs the tarball and proxies back the output.

That has the nice property that the IAM credentials will certainly never be exposed, and since the CI configuration for the actual run isn't exposed either (only "upload to this URL and wait", not "do these AWS operations"), it's much simpler to reason about.

The resulting attack surface is basically that arbitrary people on the internet can now run arbitrary code. That isn't easy to fix perfectly, but it can be massively improved by requiring the request to include the GitHub PR or Travis build id, which can then be verified to be valid and used for rate limiting.
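To make that concrete, here's a very rough sketch of the check the endpoint could do before running anything. Nothing here exists today: the types and the pr_is_open helper are hypothetical placeholders, not calls into crowbar or any AWS SDK.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Metadata the uploader has to send alongside the tarball.
struct UploadRequest {
    pr_number: u64,
    travis_build_id: u64,
}

/// Hypothetical stand-in for a GitHub API call: is this an open PR on the repo?
fn pr_is_open(pr_number: u64) -> bool {
    let _ = pr_number; // a real implementation would query the GitHub API here
    true
}

/// Per-PR rate limiter: allow at most one run per PR per cooldown window.
struct RateLimiter {
    cooldown: Duration,
    last_run: HashMap<u64, Instant>,
}

impl RateLimiter {
    fn new(cooldown: Duration) -> Self {
        RateLimiter { cooldown, last_run: HashMap::new() }
    }

    fn allow(&mut self, pr_number: u64) -> bool {
        let now = Instant::now();
        match self.last_run.get(&pr_number) {
            Some(&prev) if now.duration_since(prev) < self.cooldown => false,
            _ => {
                self.last_run.insert(pr_number, now);
                true
            }
        }
    }
}

/// Decide whether an uploaded bundle should be run at all.
fn should_run(req: &UploadRequest, limiter: &mut RateLimiter) -> Result<(), &'static str> {
    // The Travis build id could be cross-checked against the Travis API the same way.
    let _ = req.travis_build_id;
    if !pr_is_open(req.pr_number) {
        return Err("upload is not associated with an open pull request");
    }
    if !limiter.allow(req.pr_number) {
        return Err("rate limited: a run for this PR happened too recently");
    }
    // Only now would the endpoint hand the tarball to Lambda and proxy the output back.
    Ok(())
}

fn main() {
    let mut limiter = RateLimiter::new(Duration::from_secs(600));
    let req = UploadRequest { pr_number: 14, travis_build_id: 1 };
    println!("{:?}", should_run(&req, &mut limiter));
}
```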

As a bonus, it's possible that endpoint could be a lambda function using crowbar to do all the above :)

softprops commented 6 years ago

I have some experience with Travis here. I'm mostly used to provisioning IAM user credentials for deployment with serverless framework apps (crowbar included).

I think this would be doable with private Travis env vars. Travis takes measures to restrict what access forks have to those, and IAM is also designed specifically for restricting what the credentials can do should someone gain access to them. So I think Travis runs would be doable.

We might find some inspiration in the rusoto project; I believe it does something similar for its integration tests.

euank commented 6 years ago

Last I looked into it, private Travis variables couldn't be used for pull requests from non-org members. The comment I wrote above assumed that testing arbitrary pull requests is a goal and that we can do it safely enough, but that Travis might not be the right vehicle for it.

Has it changed so that you can have Travis include such secrets in non-org-member PRs?

softprops commented 6 years ago

You're right. This is looking like it's not a good option :/ https://docs.travis-ci.com/user/environment-variables/#defining-encrypted-variables-in-travisyml https://docs.travis-ci.com/user/environment-variables/#defining-variables-in-repository-settings

euank commented 6 years ago

I think a satisfactory solution could be a CI setup along the following lines:

  1. Travis CI compiles and tests everything it can without touching AWS.
  2. Travis locally checks environment variables etc. to verify it's running against this repo (either a PR or master); if not, it bails, passing with only step 1 having run.
  3. Travis uploads the compiled artifacts from step 1 to a hardcoded endpoint we run (API Gateway -> Lambda).
  4. Said lambda verifies the artifacts are associated with an open GitHub PR, records in a rate-limit DynamoDB table that it's running a test for that PR, and uses its own private GitHub token to mark the PR as "test in progress" (a rough skeleton of this lambda is sketched after this list).
  5. The lambda kicks off all the other lambda functions that need to be tested and includes them in an AWS Step Function so that a final "publish results" lambda runs after the example ones have all exited.
  6. The Step Function finishes, the publish lambda runs and uploads an HTML file of results to S3, and then uses that S3 URL in the GitHub commit status: it lets GitHub know to mark the PR green or red and links to the results.
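Here's a very rough sketch of what that gatekeeper lambda could look like if it dogfoods crowbar. The handler shape follows the lambda! example from crowbar's README (built as a cdylib with crowbar and cpython as dependencies); the GitHub, DynamoDB, and Step Functions interactions are left as comments rather than real API calls.

```rust
#[macro_use(lambda)]
extern crate crowbar;
#[macro_use]
extern crate cpython;

lambda!(|event, _context| {
    // Hypothetical outline of step 4 (none of these calls are written yet):
    //   * pull the PR number and artifact location out of `event`
    //     (the payload forwarded by API Gateway);
    //   * confirm via the GitHub API that the PR is open, and record a
    //     rate-limit entry for it in the DynamoDB table;
    //   * set a "test in progress" commit status on the PR using the
    //     function's private GitHub token;
    //   * start the Step Function that runs the example lambdas and ends
    //     with the publish-results lambda.
    //
    // For now, just echo the event back, as in the crowbar README example.
    Ok(event)
});
```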

Some nice properties of this plan:

  1. Only Travis + Lambda, so we don't have to manage any servers / Jenkins workers / whatever.
  2. IAM and GitHub secrets are stored in a lambda that we control; there's no chance of anyone accessing them, modulo serious bugs in the first lambda described.
  3. It can all be written with crowbar and thus dogfood the library!
  4. If we're lucky it might fit in the free tier.

Some downsides:

  1. We have to write a decent chunk of code for the above to work.
  2. It could be abusable, since anyone can open a PR, upload a lambda zip claiming to be from that PR, and get a bit of free compute (but who would use 128 MB lambdas to mine litecoins? also, rate limiting should help).
  3. It might not fit in the free tier.

I think the biggest downside is that we have to build this bespoke system in order to use it. I could maybe take a swing at it, but I'd first like to air the idea and see if anyone has a better one or can poke holes in this one.

This is basically the same idea I was expressing in my previous comment here, just fully fleshed out.