ministryofjustice / operations-engineering

This repository is home to the Operations Engineering's tools and utilities for managing, monitoring, and optimising software development processes at the Ministry of Justice. • This repository is defined and managed in Terraform
https://user-guide.operations-engineering.service.justice.gov.uk/
MIT License

Audit tools for GitHub repo individual contributors (remove random individuals given admin on one repo and not org member unless says they should have access until ) #23

Closed digitalronin closed 3 years ago

digitalronin commented 4 years ago

Manage our GitHub external collaborators in code so that we a) don't run out of slots, and b) know why, when, and by whom each collaborator was added. That way we don't end up with lots of people who have access to repos but shouldn't (because the project finished, or they moved on, or whatever).

digitalronin commented 3 years ago

I just started to have a look at this. Step 1 is to scan our existing repos and find all the external collaborators. I think a GraphQL query similar to the one described in this thread should do what I need.
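
As a rough sketch only (this is not the query from the linked thread), a request for a repository's outside collaborators might be built and parsed along these lines. The `collaborators(affiliation: OUTSIDE)` connection and the `permission` edge field come from GitHub's public GraphQL schema; the helper names `build_payload` and `parse_collaborators` are made up for illustration, and the sketch does not send any HTTP request.

```python
import json

# Assumed query shape, based on GitHub's public GraphQL schema.
OUTSIDE_COLLABORATORS_QUERY = """
query($owner: String!, $name: String!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    collaborators(affiliation: OUTSIDE, first: 100, after: $cursor) {
      pageInfo { hasNextPage endCursor }
      edges {
        permission
        node { login }
      }
    }
  }
}
"""


def build_payload(owner, name, cursor=None):
    """Return the JSON body to POST to https://api.github.com/graphql."""
    return json.dumps({
        "query": OUTSIDE_COLLABORATORS_QUERY,
        "variables": {"owner": owner, "name": name, "cursor": cursor},
    })


def parse_collaborators(response):
    """Extract (login, permission) pairs from a decoded GraphQL response."""
    edges = response["data"]["repository"]["collaborators"]["edges"]
    return [(edge["node"]["login"], edge["permission"]) for edge in edges]
```

Paging would repeat the request with `endCursor` as the `cursor` variable while `hasNextPage` is true.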

Unfortunately, I can't run the query because I don't have enough permissions:

 "message": "Your token has not been granted the required scopes to execute this query. The 'permissionSources' field requires one of the following scopes: ['admin:org']

To make progress on this, I can see a few ways forward:

@SteveMarshall what do you think?

digitalronin commented 3 years ago

Relevant documentation links:

SteveMarshall commented 3 years ago

@digitalronin I see no reason why you shouldn't have org-admin rights, honestly, so I've granted you those. Use them wisely, and let me know if/when you're happy to relinquish them (or you can revoke your own access).

There are a couple of other things that are probably outside the scope of this particular issue, but I'd like to see tackled overall as a result of this work, too:

digitalronin commented 3 years ago

I've made some progress on this. Code is here. I'll pick this up again next week. The next step is to iterate through all our repos and output a list of collaborators with access who are not members of the org.
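
The filtering step described here can be sketched roughly as below. This is not the linked code; it assumes the collaborators per repo and the org member list have already been fetched, and the function name `non_org_collaborators` is hypothetical.

```python
def non_org_collaborators(repo_collaborators, org_members):
    """Map each repo to its collaborators who are not org members.

    repo_collaborators: dict of repo name -> iterable of logins
    org_members: iterable of org member logins
    """
    # GitHub logins are case-insensitive, so compare in lower case.
    members = {login.lower() for login in org_members}
    report = {}
    for repo, logins in sorted(repo_collaborators.items()):
        outsiders = sorted(l for l in logins if l.lower() not in members)
        if outsiders:
            report[repo] = outsiders
    return report
```

Repos with no outside collaborators are left out of the result, so the output only lists access that needs reviewing.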

digitalronin commented 3 years ago

For reference, there is some great work in managing github via terraform in this repo which we should ~totally rip off~ be heavily inspired by.

digitalronin commented 3 years ago

FYI, the initial list of the 313 GitHub accounts that are not members of the ministryofjustice organisation, but that have some access to our repositories, is here: https://github.com/ministryofjustice/operations-engineering-github-collaborators/blob/main/non-org-collaborators.json

digitalronin commented 3 years ago

This is now running once per day, and posting data here

Data comes from here, and the posting is done via this github action

I think that's enough work on viewing the status of collaborators, so for the next phase I'll move on to creating terraform code to set collaborators, and some other code to remove any collaborators which are not defined in the terraform state file.

digitalronin commented 3 years ago

I need to add to the report so that it shows what permission the external collaborator has on the repository.

This will be necessary when it's time to import the existing collaborators and define them as terraform code: https://www.terraform.io/docs/providers/github/r/repository_collaborator.html#import
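
To illustrate why the permission is needed, here is a minimal sketch that generates a `terraform import` command and a matching resource block for one collaborator. The `repository:username` import ID and the `repository`, `username` and `permission` arguments come from the provider documentation linked above; the resource naming scheme and the helper names are assumptions, not what the collaborators repo actually uses.

```python
import re


def resource_name(repo, login):
    """Hypothetical naming scheme: build a Terraform-safe label."""
    label = re.sub(r"[^A-Za-z0-9_]", "_", f"{repo}_{login}")
    # Terraform names must not start with a digit.
    return f"_{label}" if label[0].isdigit() else label


def import_command(repo, login):
    """Return the command that imports an existing collaborator into state."""
    address = f"github_repository_collaborator.{resource_name(repo, login)}"
    return f"terraform import {address} {repo}:{login}"


def resource_block(repo, login, permission):
    """Return the Terraform code that must exist before the import."""
    return (
        f'resource "github_repository_collaborator" "{resource_name(repo, login)}" {{\n'
        f'  repository = "{repo}"\n'
        f'  username   = "{login}"\n'
        f'  permission = "{permission}"\n'
        "}\n"
    )
```

Here `permission` is expected in the provider's vocabulary (for example `pull`, `push` or `admin`), so values reported by the GitHub API may need translating first.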

digitalronin commented 3 years ago

I've made more progress on this, adding code to import existing repositories' collaborators both as terraform code here, and into the terraform state for the whole repository.

So, the repo should now match the current state of this report.

The next steps are:

One thing I'm not sure about is how to get people to annotate the existing collaborators. This is both in a formatting sense - where and in what form should the information be added - and in a "how to get everyone to do it" sense. @AntonyBishop any ideas?

digitalronin commented 3 years ago

I've added github actions to the repo to do a terraform plan when a PR is raised/updated, and a terraform apply when it's merged to main.

Next step is to write the code to look for discrepancies between GitHub and the terraform state. That will identify changes people have made directly in GitHub, rather than via the terraform code in the operations-engineering-github-collaborators repository.
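
A minimal sketch of that comparison, assuming both sides have already been reduced to `(repo, login)` pairs. How those pairs get extracted from the GitHub API and from the terraform state is not shown, and the function name and result keys are hypothetical.

```python
def find_discrepancies(github_access, terraform_access):
    """Compare (repo, login) pairs seen on GitHub with those in terraform.

    Returns the pairs present on only one side.
    """
    github = set(github_access)
    terraform = set(terraform_access)
    return {
        # Added by hand on GitHub, not defined in code.
        "unmanaged": sorted(github - terraform),
        # Defined in code, but no longer present on GitHub.
        "missing": sorted(terraform - github),
    }
```

The "unmanaged" list is the one that matters for this issue, since it shows access granted outside the collaborators repo.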

digitalronin commented 3 years ago

I think I've figured out a reasonable way to manage getting people to input information about pre-existing collaborators, and moving towards an auto-delete system for "orphaned" collaborators.

I've amended the code so that each collaborator needs to be specified with:

So, I'm going to amend this report to add some status information about the collaborator, and to only list collaborators with problems, where problems include:

This way, the report becomes a todo list that we can a) point people at and they can easily see where information is missing, and b) use as the basis for an automatic deleter to remove collaborators where the information hasn't been provided, or hasn't been updated recently.

digitalronin commented 3 years ago

Issues (i.e. problems) with collaborators are now being included in the JSON data by the reporter, and displayed on the web application:

[Screenshot 2020-11-30 at 18.04.40]

So, I think we're ready to announce this to the organisation, and give them X weeks to fill in the missing information before we start deleting collaborators.

digitalronin commented 3 years ago

This is currently blocked waiting for a decision on our data privacy impact assessment. @AntonyBishop is in contact with the relevant team.

digitalronin commented 3 years ago

Now that we've heard back about the DPIA, I've gone ahead and posted comms in slack, giving folks until 2021-01-11 to fully define the collaborators they need.

digitalronin commented 3 years ago

After 2021-01-11, we should also restrict who can invite collaborators to our org: https://docs.github.com/en/free-pro-team@latest/github/setting-up-and-managing-your-[…]ing-repository-management-policies-in-your-enterprise-account

Then, any collaborations will have to be created via PRs to the collaborators repo.

digitalronin commented 3 years ago

@AntonyBishop

I've added last commit date to the collaborators report so we can see if a given external collaborator is currently active before we delete them.

I've also added anchor tags, so we can send people URLs linking to specific repos, like this:

https://operations-engineering-reports.cloud-platform.service.justice.gov.uk/github_collaborators#bichard7-next

(To grab the URL for a repo, click the </> next to its name, and then copy the URL from the browser address bar)

Question: What counts as "currently active"?

i.e. how recently does someone need to have committed to a repo for us to chase them rather than just remove them?

I'm thinking maybe 6 months, which would mean if they last committed before 2020-07-11, they just get removed.

What do you think?
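
For what it's worth, the rule I have in mind could be sketched like this. It assumes the 2021-01-11 deadline and a 6 month window; the function names are only for illustration, and a collaborator with no commits at all is treated as inactive.

```python
import calendar
from datetime import date


def months_before(day, months):
    """Return the date `months` calendar months before `day`.

    The day of month is clamped if the target month is shorter.
    """
    index = day.year * 12 + (day.month - 1) - months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def should_chase(last_commit, deadline=date(2021, 1, 11), months=6):
    """True if we chase the collaborator, False if they just get removed."""
    if last_commit is None:
        return False
    return last_commit >= months_before(deadline, months)
```

With those defaults the cutoff works out as 2020-07-11, so a last commit on 2020-07-10 means removal and one on 2020-07-11 means we chase.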

AntonyBishop commented 3 years ago

Thanks @digitalronin, I think 6 months is also a reasonable period of time. We can review dependent on any feedback we get.

digitalronin commented 3 years ago

The first iteration of this is done. I'll raise a ticket (#104) for the next steps, but those can wait a while.