open-sauced / app

🍕 Insights into your entire open source ecosystem.
https://pizza.new
Apache License 2.0

Feature: Skunkworks - implement "Yolo Pushes" metric #3584

Open jpmcb opened 2 weeks ago

jpmcb commented 2 weeks ago

Suggested solution

Introducing the "Yolo Pushes" metric

One powerful metric we've identified for repo-pages (and a gap in our current offering) is how often individuals push directly to a project's main branch, bypassing the mechanisms we've built around contributors, lottery factor, and other metrics.

We call this "Yolo commits", "Yolo pushes", or "Yolo" coding (name and positioning pending ™️ ).


There's a new endpoint in the API that is currently being worked on that has the following format:

{
  // The number of pushes that have happened in the given time range (defaults to 30 days).
  // This is the number of push events to the default branch that do NOT have an associated
  // pull request.
  "num_yolo_pushes": 3,

  // Pushes can contain MULTIPLE commits. I.e., someone could force push hundreds of
  // commits, overwriting all history from a git repo
  "num_yolo_pushed_commits": 6,

  // An array of the actual push events that do not have an associated pull request
  "data": [
    {
      // the GitHub login of the user who pushed without a PR
      "actor_login": "jpmcb",

      // the timestamp of when the push event occurred
      "event_time": "2024-04-09T06:49:40.000Z",

      // the sha of the head where the push occurred.
      // typically, this should be associated with a PR's merge sha. But in this case,
      // there is no merged PR sha that correlates to this sha.
      "sha": "3a1bc0949048c9e1fe48f8b8f8b0c8ef2e905764",

      // The number of commits actually pushed in this event.
      "push_num_commits": 2
    },
    {
      "actor_login": "bdougie",
      "event_time": "2024-05-09T06:46:54.000Z",
      "sha": "6e2f0bd7ee05b9f49bdb13bf5c2e8a5775cb0b3a",
      "push_num_commits": 2
    },
    {
      "actor_login": "brandonroberts",
      "event_time": "2024-06-09T06:39:52.000Z",
      "sha": "4a4557850cd311c573da7d79b4ff4a54b070c96f",
      "push_num_commits": 2
    }
  ]
}
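
For anyone wiring this up in the app, here is a rough TypeScript sketch of the payload above. The field names come straight from the example. The `totalsMatch` helper is hypothetical, and it assumes the top-level totals are plain sums over `data` with no pagination, which the example suggests but the proposal does not state.

```typescript
// One push event without an associated pull request (fields from the example payload).
interface YoloPush {
  actor_login: string;
  event_time: string; // ISO 8601 timestamp
  sha: string;
  push_num_commits: number;
}

// The full response body of the proposed endpoint.
interface YoloResponse {
  num_yolo_pushes: number;
  num_yolo_pushed_commits: number;
  data: YoloPush[];
}

// Hypothetical helper: checks that the totals agree with the events in `data`.
// Assumption: `data` contains every push in the range (no pagination).
function totalsMatch(res: YoloResponse): boolean {
  const commits = res.data.reduce((sum, push) => sum + push.push_num_commits, 0);
  return res.num_yolo_pushes === res.data.length && res.num_yolo_pushed_commits === commits;
}

// The example payload from this issue.
const example: YoloResponse = {
  num_yolo_pushes: 3,
  num_yolo_pushed_commits: 6,
  data: [
    { actor_login: "jpmcb", event_time: "2024-04-09T06:49:40.000Z", sha: "3a1bc0949048c9e1fe48f8b8f8b0c8ef2e905764", push_num_commits: 2 },
    { actor_login: "bdougie", event_time: "2024-05-09T06:46:54.000Z", sha: "6e2f0bd7ee05b9f49bdb13bf5c2e8a5775cb0b3a", push_num_commits: 2 },
    { actor_login: "brandonroberts", event_time: "2024-06-09T06:39:52.000Z", sha: "4a4557850cd311c573da7d79b4ff4a54b070c96f", push_num_commits: 2 },
  ],
};

const exampleIsConsistent = totalsMatch(example);
```

If the API ends up paginating `data`, the length check would no longer hold and should be dropped.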

This endpoint will be available at v2/repos/{org}/{name}/yolo and will have a range parameter that denotes how far back to look. TLDR: this endpoint will surface push events to a repo's default branch that do NOT have a correlated pull request, so they can be shown in repo-pages.
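
A possible way to build the request URL on the client, as a sketch only. `buildYoloUrl` is a hypothetical helper and `https://api.example.com` is a placeholder host. Passing `range` as a query-string number of days is an assumption; the issue only says the endpoint takes a range parameter and defaults to 30 days.

```typescript
// Hypothetical helper: builds the URL for v2/repos/{org}/{name}/yolo.
// Assumption: `range` is sent as a query parameter holding a number of days.
// When `range` is omitted, the API default (30 days per the issue) applies.
function buildYoloUrl(baseUrl: string, org: string, name: string, range?: number): string {
  const path = `v2/repos/${encodeURIComponent(org)}/${encodeURIComponent(name)}/yolo`;
  const query = range === undefined ? "" : `?range=${range}`;
  // Strip trailing slashes so the base URL and path join with exactly one "/".
  return `${baseUrl.replace(/\/+$/, "")}/${path}${query}`;
}

// Placeholder host; swap in the real API base URL.
const yoloUrl = buildYoloUrl("https://api.example.com", "open-sauced", "app", 30);
// yoloUrl is "https://api.example.com/v2/repos/open-sauced/app/yolo?range=30"
```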

Why is this useful?

As with all our metrics, this just provides another piece of the story for projects and repositories. For small, personal projects, pushing directly to the main branch is often how people get started. But this can shed light on troubling occurrences of this behavior in big projects, where pushing directly to the default / main branch is generally ill-advised (if not outright dangerous). Sole actors who have the authority and power to push directly to the main branch could, under extreme circumstances, force push and accidentally overwrite all history, maliciously inject a nefarious commit deep into the commit history, or simply make it much more challenging for the community / ecosystem to know what's going on.

This helps project consumers and maintainers understand when and where this is happening. From there, appropriate adjustments can be made.

Proposed, skunkworks design

I know @isabensusan has some WIP design work going on with this, but to get us started, we could surface a simple table with this data in repo-pages:

[Image: Screenshot 2024-06-18 at 5 51 37 PM]

or maybe there's something better in our design system we could quickly skunkworks! Implementer's choice!! 🃏
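
To make the "simple table" idea concrete, here is a minimal sketch of shaping the `data` array into display rows. It is not tied to any table library. `YoloTableRow` and `toTableRows` are hypothetical names, and the column choices (newest first, 7-character sha, date only) are assumptions, not part of the design.

```typescript
// One push event, using the field names from the proposed response.
interface YoloPush {
  actor_login: string;
  event_time: string; // ISO 8601 timestamp in UTC
  sha: string;
  push_num_commits: number;
}

// Hypothetical row shape for the table.
interface YoloTableRow {
  contributor: string;
  date: string; // YYYY-MM-DD
  shortSha: string;
  commits: number;
}

// Sorts pushes newest first and maps them to display rows.
// ISO 8601 UTC timestamps in the same format sort correctly as plain strings.
function toTableRows(pushes: YoloPush[]): YoloTableRow[] {
  return [...pushes] // copy so the caller's array is not reordered
    .sort((a, b) => (a.event_time < b.event_time ? 1 : a.event_time > b.event_time ? -1 : 0))
    .map((push) => ({
      contributor: push.actor_login,
      date: push.event_time.slice(0, 10),
      shortSha: push.sha.slice(0, 7),
      commits: push.push_num_commits,
    }));
}

const tableRows = toTableRows([
  { actor_login: "jpmcb", event_time: "2024-04-09T06:49:40.000Z", sha: "3a1bc0949048c9e1fe48f8b8f8b0c8ef2e905764", push_num_commits: 2 },
  { actor_login: "brandonroberts", event_time: "2024-06-09T06:39:52.000Z", sha: "4a4557850cd311c573da7d79b4ff4a54b070c96f", push_num_commits: 2 },
]);
```

The rows are plain objects, so they could feed whichever table component the final design lands on.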


Note for the internal team: this is related to the proposal laid out in the API here: https://github.com/open-sauced/api/discussions/835 and implementation in: https://github.com/open-sauced/api/pull/888 - review that proposal for further details and discussion.

github-actions[bot] commented 2 weeks ago

Thanks for the issue, our team will look into it as soon as possible! If you would like to work on this issue, please wait for us to decide if it's ready. The issue will be ready to work on once we remove the "needs triage" label.

To claim an issue that does not have the "needs triage" label, please leave a comment that says ".take". If you have any questions, please reach out to us on Discord or follow up on the issue itself.

For full info on how to contribute, please check out our contributors guide.

nickytonline commented 2 weeks ago

Would it make sense to surface this in workspaces as well where you can filter on repos, or should this be for just the repo page? I think it would benefit both.

Happy to implement this as I've already implemented two of the new tables with TanStack.

jpmcb commented 2 weeks ago

/assign @nickytonline

Would it make sense to surface this in workspaces as well

I think that's a great idea: being able to see where yolo commits / pushes are happening across a grouping of repos would be powerful to decipher trouble areas in entire ecosystems. I think repo-pages makes sense to start with. And, obviously, things will change with a better / final design. We'd also need some type of aggregate endpoint in the API like v2/workspaces/{id}/yolo vs. calling v2/repos/{org}/{name}/yolo many times over.
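
For illustration, this is roughly the roll-up a client would have to do without an aggregate endpoint, after calling the per-repo endpoint once for each repo in the workspace. `RepoYolo`, `WorkspaceYolo`, and `aggregateYolo` are hypothetical names, the output shape is not a proposal for the API, and the repo names and counts are made-up sample values.

```typescript
// Per-repo totals, as they would come back from each per-repo yolo call.
interface RepoYolo {
  repo: string; // "org/name"
  num_yolo_pushes: number;
  num_yolo_pushed_commits: number;
}

// Hypothetical workspace-level summary.
interface WorkspaceYolo {
  num_yolo_pushes: number;
  num_yolo_pushed_commits: number;
  repos: RepoYolo[]; // sorted by number of yolo pushes, highest first
}

// Sums the totals across repos and ranks the repos by yolo pushes.
function aggregateYolo(perRepo: RepoYolo[]): WorkspaceYolo {
  const repos = [...perRepo].sort((a, b) => b.num_yolo_pushes - a.num_yolo_pushes);
  return {
    num_yolo_pushes: repos.reduce((sum, r) => sum + r.num_yolo_pushes, 0),
    num_yolo_pushed_commits: repos.reduce((sum, r) => sum + r.num_yolo_pushed_commits, 0),
    repos,
  };
}

// Made-up sample values for two repos in a workspace.
const workspaceYolo = aggregateYolo([
  { repo: "example-org/repo-a", num_yolo_pushes: 3, num_yolo_pushed_commits: 6 },
  { repo: "example-org/repo-b", num_yolo_pushes: 5, num_yolo_pushed_commits: 9 },
]);
```

This costs one request per repo, which is the overhead an aggregate endpoint would avoid.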