kbase / project_guides

This repo contains documents and guides that describe project principles, how-to docs, etc.
MIT License

[WIP] SLA and POC #53

Closed kkellerlbl closed 7 years ago

kkellerlbl commented 9 years ago

Please don't merge yet!

This doc is a first pass at formalizing a Service Level Agreement (SLA) and Point of Contact (POC) procedures for reporting and responding to problems on devops-supported systems. Please leave comments here so we can make revisions to the doc before formally including it in project_guides.

kkellerlbl commented 9 years ago

One question I had is whether we would be able to ask the POC to marshal and perform non-emergency deployments in production. My initial thought is that planned deployments should be done as part of a sprint, and therefore a devops person should be on that sprint team to help organize what needs to be deployed and when. However, once that team gets to the deployment stage, if the POC is not busy putting out fires, he could be asked to help with the actual deploy. If there were POC issues those would take priority over sprint deploys.

My worry would be that if two sprint teams are planning for production deploys there could be contention for the POC's free time. I'm not sure how we'd want to resolve that.

kkellerlbl commented 9 years ago

It'd also be nice to have a calendar; maybe we can do a Google calendar. But that's not quite as important, especially if we pin the current POC's name in #poc.

nlharris commented 9 years ago

Keith, this document looks reasonable to me.

fperez commented 9 years ago

This looks good to me, @kkellerlbl. I'd suggest you post to the kbase-all list so others can comment on this PR, so we can have a timeline for agreement/merge...

kkellerlbl commented 9 years ago

I think Shane and Dan wanted to discuss the SLA a little more before trying to solicit comments from the wider group. But yes, once that's done we should post to all before attempting to merge.

fperez commented 9 years ago

OK, great. I'd like a general policy of giving any PR on this repo, say, 3-5 business days for review and feedback after a post to kbase-all before a merge is attempted. It could be the low end of that for simple things, and the higher end for potentially more controversial things.

Since this is effectively imposing a hard contract on people's time commitments, I'd give it 5 business days of open discussion before a merge after you ping the list, so there's ample time for anyone to comment.
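
As a rough illustration of the review window described above, the following Python sketch counts business days forward from the date of a kbase-all post. The function name `earliest_merge_date` is hypothetical, weekends are the only non-working days considered (holidays are ignored), and treating the Nth business day after the post as the first possible merge day is an assumption, not something this thread specifies.

```python
from datetime import date, timedelta


def earliest_merge_date(announced: date, business_days: int) -> date:
    """Return the date of the Nth business day after `announced`.

    Assumption: only Saturdays and Sundays are skipped; holidays are not.
    """
    current = announced
    remaining = business_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            remaining -= 1
    return current


# Hypothetical usage: a post on Friday, Sep 25, 2015 with a 5-day window.
print(earliest_merge_date(date(2015, 9, 25), 5))  # 2015-10-02
```

With the low end of the range (3 business days), the same post date would give Sep 30, 2015.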

fperez commented 9 years ago

BTW, ping @dangunter and @scanon here since you're interested in their feedback :)

kkellerlbl commented 9 years ago

@danielolson5 not Dan G. :)

danielolson5 commented 9 years ago

We should be able to circulate a final version next week - please hold for now.

Dan

On Sep 25, 2015, at 9:48 PM, kkellerlbl notifications@github.com wrote:

I think Shane and Dan wanted to discuss the SLA a little more before trying to solicit comments from the wider group. But yes, once that's done we should post to all before attempting to merge.

kkellerlbl commented 9 years ago

Would it make more sense to keep changes that still need input from a small group in a fork until they're ready to announce on kbase-all? That way there isn't a PR hanging over the repo while it waits for a small-group discussion before being released to the wider group for comment.

fperez commented 9 years ago

I don't think it's a problem to have the PR open, you did mark it as not ready for merge yet... We do that all the time in IPython, it lets us use the same PR mechanism for everything.

On Fri, Sep 25, 2015, 20:03 kkellerlbl notifications@github.com wrote:

Would it make more sense to have changes that I'm waiting for input for before announcing on all to just be in a fork? This way a PR isn't hanging over the repo that's still waiting for a small group discussion before releasing to the wider group for comment.

fperez commented 9 years ago

I edited the title to [WIP], short for 'work in progress', so when we see the list of PRs we don't bother clicking on it if we're looking for things to merge, only if we want to review/comment.

Once it's ready to merge, we can remove that marker.
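
The title convention above can be sketched as a small filter over PR titles. This is only an illustration: the function names and the second example title are hypothetical, and matching the marker case-insensitively at the start of the title is an assumption.

```python
def is_work_in_progress(title: str) -> bool:
    """True if the PR title starts with the [WIP] marker (case-insensitive)."""
    return title.strip().upper().startswith("[WIP]")


def merge_candidates(titles):
    """Return only the titles that are not marked as work in progress."""
    return [t for t in titles if not is_work_in_progress(t)]


# Hypothetical list of open PR titles; only the first comes from this thread.
open_prs = ["[WIP] SLA and POC", "Fix typo in README"]
print(merge_candidates(open_prs))
```

Removing the marker from a title is then enough for that PR to show up among the merge candidates.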

scanon commented 9 years ago

Dan: Are there any major changes you would like to make?

danielolson5 commented 9 years ago

I proposed some changes on kkeller's fork.

The PagerDuty section needs to be reworded to fit what we can set up with the NERSC monitoring center.

Dan

On Oct 8, 2015, at 3:30 PM, Shane Canon notifications@github.com wrote:

Dan: Are there any major changes you would like to make?

kkellerlbl commented 9 years ago

Dan, is that this commit on your fork?

https://github.com/danielolson5/project_guides/commit/dfdca1d100a8b550ce809f073d63c6762d7ddb45

danielolson5 commented 9 years ago

Yes, that's it.

Dan

On Oct 8, 2015, at 11:17 PM, kkellerlbl notifications@github.com wrote:

Dan, is that this commit on your fork?

danielolson5@dfdca1d

fperez commented 9 years ago

@kkellerlbl, could you review/merge Dan's commits into your branch? That way, we can include them into this PR directly and make them part of the discussion...

kkellerlbl commented 9 years ago

I just merged Dan's changes.

I would like to try to keep some sort of reference to Slack in there somehow, but I know Dan is less excited about using Slack for the POC role than I am. OTOH I'm fine pulling this in now and revisiting the Slack question later.

fperez commented 9 years ago

Up to you, folks, this should reflect your sense as a team :)

kkellerlbl commented 9 years ago

Okay, thanks Fernando. @danielolson5, if you're okay with it, you can merge it yourself or ask me to do it.