cloud-gov / product

Program-level artifacts, workflow and issues for cloud.gov
Creative Commons Zero v1.0 Universal

Consolidate the two redundant/conflicting cloud.gov incident response pages #1051

Closed by bengerman13 2 years ago

bengerman13 commented 5 years ago

For an IC to follow an orderly incident process, we need a single cloud.gov IR page instead of two (to accompany the 18F IR page).

Background information

We have three pages to consult for security incident response: two for cloud.gov, plus the 18F IR page.

They are redundant in some cases and contradictory in others, they contain circular references to each other, and together they are a lot to look at.

We should either turn the checklist page into an actual checklist template or merge it back into the incident response page.

Follow-on

Run the IR exercise

2020-12-16

Resources:

2020-12-16 IR Incident Response Consolidation Notes in GDoc

https://sre.google/workbook/incident-response/

https://response.pagerduty.com/

Acceptance criteria

karareinsel commented 5 years ago

Our work depends on the incident response work TTS Infrastructure is doing. Once they deconflict, we can be added to the TTS plan. Blocked for now.

bengerman13 commented 5 years ago

blocked by 18F/tts-tech-portfolio#77

karareinsel commented 4 years ago

TTS has committed to having these de-conflicted by the end of January 2020.

pburkholder commented 4 years ago

Discussing with FedRAMP as part of our PMI letter: https://docs.google.com/document/d/15SPpgJtMkK3yK_Ya-b6VBo1OjG5Ttd1ZXloefX8AYRc/edit

pburkholder commented 4 years ago

@hillaryj Can you link here to the Mural you put together?

hillaryj commented 4 years ago

Mural of the in-progress drafting of an IR workflow diagram: https://app.mural.co/t/gsa6/m/gsa6/1578454101053/2be4f2baba395ba881353094f668cd4ef22559b8

hillaryj commented 4 years ago

Also some comments that I'm bringing to IR:

its-a-lisa-at-work commented 4 years ago

@hillaryj -- to clarify, when you mention contractors, you mean contractors supporting cloud.gov and not an application owner, correct?

hillaryj commented 4 years ago

Correct - we have contractors on our platform engineering team.

karareinsel commented 4 years ago

This will eventually need to be split into two pieces of work: TTS stuff and cloud.gov stuff. We're verifying the proposed workflow and continuing to refine it. Probably 3-4 more sprints before we have a proposed workflow.

pburkholder commented 4 years ago

Bumping this onto the Project Planning board since we still need to run an IR exercise and revisit our IR guides.

pburkholder commented 4 years ago

Related: these notes from yesterday's "routes w/o routers" follow-up:

StatusPage has been helpful, and when we have serious long-running issues, we pick one person to handle outbound communication and keep noise to the team down.
We’ve been getting better about this, but need to be on top of:
    Notes doc started
    Incident process kicked off
    IC determined
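
The three kickoff items above could be tracked as a tiny checklist. This is only a rough sketch: the `IncidentKickoff` class and its method names are hypothetical, and the step names are copied from the notes above rather than from any official cloud.gov checklist.

```python
from dataclasses import dataclass, field

# Step names copied from the notes above; this is an illustration,
# not an official cloud.gov checklist.
KICKOFF_STEPS = (
    "Notes doc started",
    "Incident process kicked off",
    "IC determined",
)


@dataclass
class IncidentKickoff:
    """Hypothetical helper that tracks which kickoff steps are done."""

    done: set = field(default_factory=set)

    def complete(self, step: str) -> None:
        # Reject anything that is not one of the known kickoff steps.
        if step not in KICKOFF_STEPS:
            raise ValueError(f"unknown step: {step}")
        self.done.add(step)

    def outstanding(self) -> list:
        # Preserve the original ordering of the steps.
        return [s for s in KICKOFF_STEPS if s not in self.done]


kickoff = IncidentKickoff()
kickoff.complete("Notes doc started")
print(kickoff.outstanding())
```

Whoever is coordinating could check `outstanding()` early in an incident to see what still needs an owner.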

pburkholder commented 4 years ago

Notes related to 🎃 Halloween activity