deis / builder

Git server and application builder for Deis Workflow
https://deis.com
MIT License

make builder read keys from database #122

Closed. smothiki closed this issue 8 years ago.

smothiki commented 8 years ago

The builder reads keys from etcd. We should avoid doing that and read keys from the database instead.

arschles commented 8 years ago

:+1:

Ref #81

rimusz commented 8 years ago

:+1:

arschles commented 8 years ago

Proposed API contract:

Call from Builder

POST /v2/hooks/key

{"app": "$APP_NAME", "public_key": "$PUBLIC_KEY"}

Response From Controller When Key is Valid for App

200 OK

Response From Controller When Key is Not Valid for App

404 UNAUTHORIZED

All Other Responses From Controller

500 INTERNAL SERVER ERROR
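
For illustration, here's a minimal Go sketch of the builder side of this contract as proposed (the package, function, and parameter names are placeholders, and the payload itself gets revised later in this thread):

package hooks

import (
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
)

// checkKey posts the proposed hook payload and maps the contract's status
// codes onto a simple valid / not valid / error result.
func checkKey(controllerURL, app, publicKey string) (bool, error) {
    body, err := json.Marshal(map[string]string{"app": app, "public_key": publicKey})
    if err != nil {
        return false, err
    }
    resp, err := http.Post(controllerURL+"/v2/hooks/key", "application/json", bytes.NewReader(body))
    if err != nil {
        return false, err
    }
    defer resp.Body.Close()
    switch resp.StatusCode {
    case http.StatusOK:
        return true, nil // key is valid for the app
    case http.StatusNotFound:
        return false, nil // key is not valid for the app
    default:
        return false, fmt.Errorf("unexpected status %d from controller", resp.StatusCode)
    }
}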

cc/ @helgi

helgi commented 8 years ago

Where are you going to get the public_key from?

https://github.com/deis/builder/blob/master/rootfs/etc/confd/templates/authorized_keys gets it from etcd and that's what we are solving for here.

I'm thinking you may want

GET /v2/hooks/keys/$APP_NAME

{"users": {"aaron": "keysandthings"}}

That'd get you the keys for the given app, unless you have an alternative way to get the keys?
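
A rough sketch of what consuming that response could look like on the builder side (the type and function names are illustrative, not existing builder code):

package hooks

import (
    "encoding/json"
    "fmt"
    "net/http"
)

// keysResponse mirrors the proposed {"users": {"aaron": "keysandthings"}} body.
type keysResponse struct {
    Users map[string]string `json:"users"`
}

// keysForApp fetches all user keys for the given app from the controller.
func keysForApp(controllerURL, app string) (map[string]string, error) {
    resp, err := http.Get(fmt.Sprintf("%s/v2/hooks/keys/%s", controllerURL, app))
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("controller returned %d", resp.StatusCode)
    }
    var kr keysResponse
    if err := json.NewDecoder(resp.Body).Decode(&kr); err != nil {
        return nil, err
    }
    return kr.Users, nil
}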

smothiki commented 8 years ago

Wouldn't it be nice if the controller updated the builder whenever there is a new user key? Some mechanism where the controller and builder talk to each other, maybe RPC or a simple HTTP server inside the builder that listens for keys, with the controller calling that API endpoint whenever someone does deis keys:add. The same would apply for app deletion.

smothiki commented 8 years ago

This avoids a builder dependency on the database and the k8s API, but couples the builder to Workflow.

helgi commented 8 years ago

That seems too complicated - the builder should never talk to the DB, ever. I can see your point about creating tight coupling between the two, but running an HTTP server on the builder creates unnecessary state (IMO).

If we go down the route of tightly coupling the two, then we need to discuss the implications of that, the same way we have to consider / discuss why we wouldn't have each component depend on the k8s API (more and more rather than less and less).

arschles commented 8 years ago

@helgi I misspoke in the payload that I posted above (the result of doing too many things at once). Yes, the builder w/o etcd wouldn't have a public key. I agree that the endpoint should look very similar to the one you proposed. However, I advocate for putting the user in the path, since a GET can't have a request body. The request would then look like GET /v2/hooks/keys/$APP_NAME/$USER_NAME.

I'd like to address some of the general comments in sections below, and I'd like to hear thoughts.

TL;DR my suggestion is to make the builder query the controller for both app existence and key validity.

Regarding the Controller Updating Builder vs. the Builder Querying the Controller

If the controller updates the builder, the builder has to hold more state. However, if the builder queries the controller (as described in the previous section), then state is isolated to the controller, where it already resides (more precisely, in the DB that the controller queries).

Managing state in a distributed system is challenging, so let's try to keep it as isolated as possible, and minimize the amount of state in the builder, in this refactor.

Regarding Tight Coupling Between the Builder and Controller

The builder already makes 3 RPCs to the controller on every build, and we've accepted the fact that the two components are tightly coupled and will be through the beta.

Adding a 4th RPC is not ideal from a performance perspective, but adding the RPC now and reducing them later is a manageable task.

Regarding Builder Accessing the DB

I 100% agree with @helgi that the builder should never access the DB. I'll expand - nothing but the controller should ever access the DB (see my comments above on isolating state).

Additionally, we've built an abstraction layer - the controller API - on top of the DB. If anything circumvents that layer, the abstraction is leaky and we will have introduced a large maintenance problem (imagine if many components are directly accessing the DB and we try to refactor the schema).

Regarding Listening to the Kubernetes Event Stream

I do agree that the event stream could be a good source from which to listen for app deletion events. However, if we do so, then we rely more heavily on builder state. Instead, why not rely on the aforementioned GET /v2/hooks/keys/$APP_NAME/$USER_NAME RPC to determine whether the app exists as well? If it doesn't exist, then we can delete its repository folder.

I also disagree that the event stream is a good data source for adding and removing SSH keys (again, for builder state reasons).

I'm not clear if anyone is proposing that key management strategy above, but I'd like to make my position clear.

helgi commented 8 years ago

In my example the GET used $APP_NAME in the URL, and the JSON below it was the return body where you get all users for the given app. Yours simply takes it one step further in granularity, and I'm fine with that.

If we use the API for app existence and user existence then you'd get a 404 either way. That's probably enough unless you want to provide more useful error messages.

I'm not seeing how much more state the builder would have to maintain if it gets app adds / removals from the k8s API. Not via an event stream, but rather via label selectors. The state is owned by k8s, and the builder simply uses it to see when it should, for example, clean up old apps.

If you rely on the API at git push time to figure out whether an app exists, then when will you figure out what doesn't exist anymore? Are you only ever querying the API in the git push scenario, or are you planning on expanding this further?

arschles commented 8 years ago

Thanks for the clarification, and I agree - it'd be nice to determine whether the app or user doesn't exist without having to check error messages.

I see what you're saying RE the Kubernetes API now, but regardless, the builder would still hold state - the state of polling - and we can get race conditions if the builder runs with more than one replica.

For example, say we're running 3 builder pods (replicas: 3) and pod A polls the k8s API and finds that myapp is deleted, but pods B and C don't. If a git push ends up on B or C, then it could succeed even though the app is actually deleted.

On the other hand, if builder makes the controller API call on each git push, then if the controller returns a 404, it knows that the app doesn't exist (using your version of the API).

I don't see a need for the builder to know whether a given app exists at any other time. Am I missing something crucial there?
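
Loosely sketched, that per-push check might look like this (the function and parameter names are hypothetical, not the actual receive hook):

package hooks

import (
    "fmt"
    "net/http"
)

// checkAppAndKey runs on each git push. 200 means the app exists and the key
// is valid; 404 means the app (or the user's key) is unknown and the push is
// rejected. Cleaning up the local checkout could also happen on 404 once the
// controller distinguishes a missing app from a missing key.
func checkAppAndKey(controllerURL, app, user string) error {
    resp, err := http.Get(fmt.Sprintf("%s/v2/hooks/keys/%s/%s", controllerURL, app, user))
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    switch resp.StatusCode {
    case http.StatusOK:
        return nil
    case http.StatusNotFound:
        return fmt.Errorf("app %s or key for user %s does not exist", app, user)
    default:
        return fmt.Errorf("unexpected status %d from controller", resp.StatusCode)
    }
}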

helgi commented 8 years ago

Are old apps ever left behind if you are not cleaning things up? Are checkouts always blown away so there is no old code / apps around on the builder pods?

arschles commented 8 years ago

In that scheme, old apps would be left behind if someone deletes an app and never tries a git push. That could be solved by a "reaper" goroutine that runs outside the critical path (and nothing in the critical path would be dependent on its state).

However, old code would be deleted when the RPC happens on a git push.

helgi commented 8 years ago

What I'm confused about is how the reaper determines what to delete and what to keep without checking with something somewhere. It isn't the end of the world having an active app reaped, I suppose.

arschles commented 8 years ago

All app checkouts live in the same directory (see here), so the reaper's job would be to walk that directory and query the controller for each app (at a slow interval).

Note that the app checkout directories and the reaper's state are expendable.
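
A rough sketch of that reaper loop, assuming a hypothetical appExists callback that wraps the controller call discussed above:

package hooks

import (
    "log"
    "os"
    "path/filepath"
    "time"
)

// reapDeletedApps walks the shared checkout directory at a slow interval and
// removes checkouts for apps the controller no longer knows about. Nothing in
// the git push critical path depends on its state.
func reapDeletedApps(checkoutRoot string, interval time.Duration, appExists func(app string) (bool, error)) {
    for range time.Tick(interval) {
        entries, err := os.ReadDir(checkoutRoot)
        if err != nil {
            log.Printf("reaper: reading %s: %v", checkoutRoot, err)
            continue
        }
        for _, entry := range entries {
            if !entry.IsDir() {
                continue
            }
            app := entry.Name()
            exists, err := appExists(app)
            if err != nil {
                log.Printf("reaper: checking %s: %v", app, err)
                continue
            }
            if !exists {
                if err := os.RemoveAll(filepath.Join(checkoutRoot, app)); err != nil {
                    log.Printf("reaper: removing %s: %v", app, err)
                }
            }
        }
    }
}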

helgi commented 8 years ago

So instead of querying the k8s API you want to do the same against the controller API. I'm not really seeing the difference or benefit beyond perhaps not having to deal with the k8s API.

arschles commented 8 years ago

I'm fine with querying the k8s API, I'm just advocating for querying something before running the receive hook on each git push, possibly in addition to running it in the background.

Maybe we're saying the same thing...

helgi commented 8 years ago

I think we are ... So here is how I see it:

The reaper can work off the controller API as well: either another endpoint, or you use the same endpoint as git push (sans username) to check each app you know about.

arschles commented 8 years ago

:+1: from me

:100:

arschles commented 8 years ago

Ref https://github.com/deis/workflow/pull/336 for the keys endpoint

smothiki commented 8 years ago

Here are a bunch of use cases I'm not clear about.

arschles commented 8 years ago

@smothiki I've posted answers to some of your questions below. I'm unclear on some others so can you please point out if I've left something unanswered?

Do we have to make an API call to get the keys associated with the app for every git push?

Not necessarily. Since we know the app and the user, we could make the call to just get the key for the (app, user) pair.

Wouldn't it be redundant most of the time?

No, because multiple users may git push to the same app.

This check at git push will definitely hurt the user experience

On what metric?

smothiki commented 8 years ago

An app is not associated with a key, and a user can have many keys.

arschles commented 8 years ago

Right, so the builder would make a call to the controller's /v2/hooks/keys/$APP/$USER endpoint, which would return the keys for that user.

smothiki commented 8 years ago

Let's say a user has 100 apps running and uses 200 keys. The call returns 200 keys for an app. Either we have to match the key the user pushes with against these 200 keys, or write the 200 keys to the authorized_keys file and let the SSH server figure it out. If we go with the writing option, most of the keys will already have been written; we'd have to keep an index of which keys are present and figure out how to write them efficiently.
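
To make the indexing idea concrete, here's a loose sketch that treats the authorized_keys file itself as the record of what has already been written (the function name and bookkeeping are illustrative, not an existing implementation):

package hooks

import (
    "bufio"
    "os"
    "strings"
)

// appendNewKeys appends only the keys that are not already present in the
// authorized_keys file, so repeated writes stay cheap.
func appendNewKeys(authorizedKeysPath string, keys []string) error {
    existing := map[string]bool{}
    if f, err := os.Open(authorizedKeysPath); err == nil {
        scanner := bufio.NewScanner(f)
        for scanner.Scan() {
            existing[strings.TrimSpace(scanner.Text())] = true
        }
        f.Close()
    }
    out, err := os.OpenFile(authorizedKeysPath, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0600)
    if err != nil {
        return err
    }
    defer out.Close()
    for _, key := range keys {
        if existing[strings.TrimSpace(key)] {
            continue // already written, skip
        }
        if _, err := out.WriteString(strings.TrimSpace(key) + "\n"); err != nil {
            return err
        }
    }
    return nil
}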

smothiki commented 8 years ago

My concern is about optimizing the git push operation if we are making a call to the controller for each git push.

smothiki commented 8 years ago

Also, in practice git pushes happen more often than deis keys:add. One optimization would be to check the authorized_keys file first and then make an API call for the rest of the keys.

arschles commented 8 years ago

The call returns 200 keys for an app. Either we have to match the key the user pushes with against these 200 keys, or write the 200 keys to the authorized_keys file and let the SSH server figure it out

Correct. There is no way around this, regardless of the method of acquiring keys from the controller.

My concern is about optimizing the git push operation if we are making a call to the controller for each git push.

We already make 3 API calls to the controller for each git push operation. While that's not a good reason to add a fourth, it does show that builder <--> workflow RPCs are proven to work. As a side note, we should strive to reduce the latency, complexity, and number of RPCs in a separate patch (I've created #144 for that).

Also, in practice git pushes happen more often than deis keys:add

I agree. Again, though, there's no way around this regardless of the key delivery mechanism. The RPC method as proposed returns the full list every time, however, and that's not optimal. Perhaps a future optimization would be to have the controller only deliver deltas.

One optimization would be to check the authorized_keys file first and then make an API call for the rest of the keys.

I agree. See my above comment on a possible future optimization.
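
Sketching that check-local-first flow (keysForApp and appendNewKeys refer to the illustrative snippets earlier in the thread, not to real builder code):

package hooks

import (
    "bufio"
    "os"
    "strings"
)

// authorizeKey checks the locally written authorized_keys file first and only
// calls the controller when the incoming key isn't already known, so a repeat
// git push skips the extra RPC.
func authorizeKey(controllerURL, authorizedKeysPath, app, incomingKey string) (bool, error) {
    incomingKey = strings.TrimSpace(incomingKey)

    // fast path: the key was written on a previous push
    if f, err := os.Open(authorizedKeysPath); err == nil {
        scanner := bufio.NewScanner(f)
        for scanner.Scan() {
            if strings.TrimSpace(scanner.Text()) == incomingKey {
                f.Close()
                return true, nil
            }
        }
        f.Close()
    }

    // slow path: ask the controller for the app's keys
    fetched, err := keysForApp(controllerURL, app)
    if err != nil {
        return false, err
    }
    for _, key := range fetched {
        if strings.TrimSpace(key) == incomingKey {
            // cache the newly learned key for the next push
            return true, appendNewKeys(authorizedKeysPath, []string{incomingKey})
        }
    }
    return false, nil
}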