peter-murray / inactive-users-action

GitHub Action for generating a report on user activity in a GitHub Enterprise Organization.
MIT License

Can we generate a csv file with all the users and their last "activity" within a period? #4

Open stingaa opened 3 years ago

stingaa commented 3 years ago

Thank you for this action. It is almost exactly what I needed. However, I want to recoup only a certain number of licenses to minimize impact. Say I want to harvest 100 licenses, but when I run the inactive-users report with, say, 90 days, I get 500 users. This took 4 hours. If I run it again with 180 days, I get 350, and so on.

I would rather just get a CSV report of inactive users with their last active date and type of activity, sort them by date, and see where the cut-off date should fall for a given number of inactive users.

Is that possible?
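To make the request concrete, here is a minimal sketch of the post-processing step, assuming the per-user activity has already been collected somehow. The record shape (`login`, `last_active`, `activity_type`), the function names, and the sample data are all hypothetical and are not part of this action. Users with no recorded activity at all are not handled.

```python
import csv
import io
from datetime import date


def cutoff_for_licenses(rows, licenses_needed):
    """Pick the N least recently active users and the date of the newest one among them."""
    # Oldest last-activity first, so the head of the list is the "most inactive".
    ordered = sorted(rows, key=lambda r: r["last_active"])
    selected = ordered[:licenses_needed]
    cutoff = selected[-1]["last_active"] if selected else None
    return cutoff, selected


def to_csv(rows):
    """Render rows as CSV text with ISO-formatted dates."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["login", "last_active", "activity_type"])
    writer.writeheader()
    for r in rows:
        writer.writerow({**r, "last_active": r["last_active"].isoformat()})
    return buf.getvalue()


# Hypothetical sample data, for illustration only.
sample = [
    {"login": "alice", "last_active": date(2021, 3, 1), "activity_type": "commit"},
    {"login": "bob", "last_active": date(2021, 1, 15), "activity_type": "issue_comment"},
    {"login": "carol", "last_active": date(2021, 6, 20), "activity_type": "pull_request"},
]

cutoff, selected = cutoff_for_licenses(sample, 2)
print(cutoff)
print(to_csv(selected))
```

With a report like this, one data collection run would be enough: the cut-off date for any target number of licenses can be read off the sorted list instead of re-running with a different day count.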

peter-murray commented 3 years ago

It might be possible, but the data required to identify user activity does not come from the users themselves; it comes from each repository within the organization. The reason for this is that on GitHub Enterprise Cloud the user data belongs to the users, not to the organization, so it is not possible to start this process by looking at the users directly.

In theory it would be possible to generate a more detailed breakdown of the activities and dates for this, but ultimately it is going to take a similar amount of time, as that data needs to be collected from the organization and repository perspective, and it will very likely hit your API rate limits in busy/active organizations with potentially many users.

I will take a look at how the functionality might be expanded to give an alternative view on the data, more in line with what you are looking for.

stingaa commented 3 years ago

We are thinking of having an org webhook send all events to a service that would parse and store activities for every account in that org. That would give us most actions that modify something in our org. However, that leaves a big gap: we would not know whether users log in frequently to read but never touch anything, such as managers who review things in GitHub.
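A rough sketch of the storage side of such a service, assuming the event name is taken from the `X-GitHub-Event` request header and that the payload carries a `sender.login` field, as most webhook payloads do. The `record_event` function and the in-memory dict are hypothetical; a real service would also verify the webhook signature and persist the data.

```python
from datetime import datetime, timezone


def record_event(store, event_type, payload, received_at=None):
    """Keep only the most recent activity per account in `store` (a plain dict)."""
    sender = payload.get("sender") or {}
    login = sender.get("login")
    if login is None:
        # Some payloads may not identify an acting user; skip those.
        return store
    received_at = received_at or datetime.now(timezone.utc)
    previous = store.get(login)
    if previous is None or received_at > previous["last_active"]:
        store[login] = {"last_active": received_at, "activity_type": event_type}
    return store


# Hypothetical payloads, trimmed to the one field this sketch reads.
store = {}
t1 = datetime(2021, 5, 1, 12, 0, tzinfo=timezone.utc)
t2 = datetime(2021, 5, 2, 9, 30, tzinfo=timezone.utc)
record_event(store, "push", {"sender": {"login": "alice"}}, t1)
record_event(store, "issue_comment", {"sender": {"login": "alice"}}, t2)
record_event(store, "ping", {}, t2)  # no sender, ignored
print(store["alice"]["activity_type"])
```

As noted above, this only captures actions that generate events; read-only visits leave no trace in it.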

Also, it is very surprising to me that the org security API doesn't show login info. How do we know if someone who isn't part of AA logs in to our Org? Or if some employee accessed our Org after they left AA? I would expect every logon to appear in our security log, with an IP address or something similar.

peter-murray commented 3 years ago

There is some information available for SSO-backed users via the GraphQL APIs under the various IdP bindings.
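As an illustration, a query along these lines lists the SAML identities linked to organization members. The field names (`samlIdentityProvider`, `externalIdentities`, `samlIdentity.nameId`) reflect my understanding of the GitHub GraphQL schema and should be checked against the current schema; `build_request` is a hypothetical helper that only builds the request body and does not send it. Note that this exposes linked identities, not login timestamps.

```python
import json

# Assumed query shape; verify field names against the GitHub GraphQL schema.
QUERY = """
query($org: String!, $cursor: String) {
  organization(login: $org) {
    samlIdentityProvider {
      externalIdentities(first: 100, after: $cursor) {
        pageInfo { hasNextPage endCursor }
        nodes {
          samlIdentity { nameId }
          user { login }
        }
      }
    }
  }
}
"""


def build_request(org, cursor=None):
    """Return the JSON body to POST to the GraphQL endpoint."""
    return json.dumps({"query": QUERY, "variables": {"org": org, "cursor": cursor}})


body = build_request("my-org")  # "my-org" is a placeholder organization login
print(json.loads(body)["variables"])
```

The body would be sent with a token that has sufficient organization permissions, paging with `endCursor` while `hasNextPage` is true.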

When it comes to GitHub.com, privacy and ownership of identifying data lie with the user. That is why GitHub now has Enterprise Managed Users (EMU) as an option for Enterprises to adopt if they want more control over the users' data themselves and want to enforce the SSO IdP as the only source of truth for access.

If you are using SSO-backed users today (and prevent external collaborators) and enforce SSO at the organization or enterprise level, or adopt an EMU-backed Enterprise, then you can ensure that anyone inside your repositories can only come from your IdP, so this would easily take care of the user lifecycle for Joiner, Mover, Leaver.

GitHub on cloud does have this information, but for privacy reasons it cannot easily be shared via standard APIs; you can request it via a support ticket if need be.