andymeneely / chromium-history

Scripts and data related to Chromium's history

Can we measure the OWNERship change of ReleaseFilepath? Brainstorm metrics. #181

Closed andymeneely closed 9 years ago

andymeneely commented 9 years ago

@rikal is driving this, but @dani5447 is familiar with this area.

These OWNERS files are an intriguing data source (http://www.chromium.org/developers/owners-files). They are version-controlled and change regularly over time. They have a hierarchical structure that implies some sort of team. They also have per-file, regular-expression-like rules that can make our data collection rather... interesting.
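To make the data-collection problem concrete, here is a minimal sketch of reading one OWNERS file. It assumes a simplified subset of the syntax (one owner per line, `#` comments, `set noparent`, and `per-file <glob>=<owner>` with a single pattern and a single owner). The function names `parse_owners` and `owners_for` are hypothetical, not part of this repository.

```python
from fnmatch import fnmatchcase


def parse_owners(text):
    """Parse one OWNERS file (simplified syntax, see assumptions above)."""
    owners = set()      # directory-wide owners
    per_file = []       # (glob_pattern, owner) pairs
    noparent = False    # True if parent directories' owners should not apply
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line == "set noparent":
            noparent = True
        elif line.startswith("per-file "):
            pattern, _, owner = line[len("per-file "):].partition("=")
            per_file.append((pattern.strip(), owner.strip()))
        else:
            owners.add(line)
    return {"owners": owners, "per_file": per_file, "noparent": noparent}


def owners_for(filename, parsed):
    """Owners of one file in this directory: directory-wide plus matching per-file rules."""
    extra = {o for pat, o in parsed["per_file"] if fnmatchcase(filename, pat)}
    return parsed["owners"] | extra
```

The hierarchy is not handled here: a full collector would walk up the directory tree and merge parent owners until it reaches a file with `set noparent`.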

My initial thought about ownership is that we should analyze the change in ownership (ownership churn?) for a given file. Changing ownership can be healthy, or perhaps unhealthy.
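One possible way to quantify ownership churn, as a sketch: take the set of owners of a file at each point in time and count how many owners were added and removed between consecutive snapshots. The `ownership_churn` name and the snapshot representation are assumptions for illustration.

```python
def ownership_churn(snapshots):
    """Count owners added and removed across time-ordered owner sets.

    snapshots: list of sets of owner names, oldest first.
    Returns (added, removed) totals over all consecutive pairs.
    """
    added = removed = 0
    for before, after in zip(snapshots, snapshots[1:]):
        added += len(after - before)     # owners that appear
        removed += len(before - after)   # owners that disappear
    return added, removed
```

Keeping additions and removals separate leaves room to treat them differently later, since losing an owner and gaining one may not carry the same risk.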

Maybe there's also a measure of how much ownership a given developer has? Perhaps some developers are dividing their attention too much?

You'll need to:

andymeneely commented 9 years ago

Take a look at Chris Bird's paper on major/minor contributors (and some of the papers that reference it) for some inspiration. They don't have explicit data on ownership, but maybe a comparison would be good? Explicit vs. actual. Bureaucracy vs. activity.

andymeneely commented 9 years ago

From @bspates: What if we have an owner who is a minor or non-contributor who is LGTM'ing things?
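A rough sketch of how that question could be checked: compare each explicit owner against their share of commits to the file or directory. The 5% cutoff between minor and major is an assumption borrowed loosely from the major/minor contributor idea and should be checked against the paper; `classify_owners` and its inputs are hypothetical.

```python
def classify_owners(owners, commits_by_dev, minor_threshold=0.05):
    """Label each explicit owner by actual contribution share.

    owners: iterable of owner names from the OWNERS file.
    commits_by_dev: dict of developer name -> commit count for the same scope.
    minor_threshold: assumed cutoff; shares below it count as 'minor'.
    """
    total = sum(commits_by_dev.values())
    labels = {}
    for owner in owners:
        count = commits_by_dev.get(owner, 0)
        if count == 0 or total == 0:
            labels[owner] = "non-contributor"
        elif count / total < minor_threshold:
            labels[owner] = "minor"
        else:
            labels[owner] = "major"
    return labels
```

Owners labeled `minor` or `non-contributor` who still approve reviews would be the cases this comment asks about.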

andymeneely commented 9 years ago

From @Felivel: try out Git by a Bus for ways of measuring ownership. http://dev.hubspot.com/blog/bid/57694/Git-by-a-Bus

dani5447 commented 9 years ago

I'm having some trouble finding papers by Chris Bird. Does anyone know of a good place to look for them?

Felivel commented 9 years ago

Did you try his personal web site? http://www.cabird.com/

dani5447 commented 9 years ago

Ah, ok, thank you! That didn't come up in my Google search for some weird reason.

dani5447 commented 9 years ago

Brainstorming Notes from 9/9 Meeting with @rikal + my own additional ideas:

During the meeting, we agreed these last two ideas might be some of the more interesting ones.

andymeneely commented 9 years ago

Good stuff! Here are some of my thoughts on those: "First commit" - definitely easy to get. Seeing a "Time To Owner" would be an interesting analysis. I can see that number being too short or too long.
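As a sketch of the "Time To Owner" idea: given each developer's first commit date and the date they first appear in an OWNERS file, the metric is the gap in days. Both input dictionaries and the `time_to_owner` name are assumptions about how the data might be laid out.

```python
from datetime import date


def time_to_owner(first_commit, first_owned):
    """Days from first commit to first OWNERS appearance, per developer.

    first_commit, first_owned: dicts of developer name -> datetime.date.
    Developers missing from either dict are skipped.
    """
    result = {}
    for dev, owned_on in first_owned.items():
        if dev in first_commit:
            result[dev] = (owned_on - first_commit[dev]).days
    return result
```

A negative value would mean the developer was listed as an owner before their first recorded commit, which may itself be worth flagging.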

For the lower-level OWNERS idea, we might be able to measure something like "number of files a developer owns" vs. "number of OWNERS a file has". That would be less directory-heavy.
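Both counts can come from the same mapping of file to owners. A minimal sketch, assuming ownership has already been resolved per file (the `ownership_counts` name and input shape are hypothetical):

```python
from collections import defaultdict


def ownership_counts(file_owners):
    """Compute files-per-developer and owners-per-file.

    file_owners: dict of file path -> set of owner names.
    Returns (files_per_dev, owners_per_file) as plain dicts.
    """
    files_per_dev = defaultdict(int)
    for owners in file_owners.values():
        for dev in owners:
            files_per_dev[dev] += 1
    owners_per_file = {path: len(owners) for path, owners in file_owners.items()}
    return dict(files_per_dev), owners_per_file
```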

Inactive owners: yeah if we combine this work with major/minor contributors we can see if these align together. Or combine this idea with managerial or not. Peter Rigby's and Chris Bird's FSE paper last year touched upon this briefly and we can explore it further.

I'm wondering: if we subdivide the system into teams and measure the teams by defect density and vulnerability density, we might be able to evaluate whether or not teams are on the right track.

Good! Keep going. Hammer these out to metrics and some sort of analysis. Don't let what we currently have of a code base get in the way of a good idea.

dani5447 commented 9 years ago

Some initial metric concepts (to be reviewed with @rikal during the meeting):

andymeneely commented 9 years ago

My meeting with @rikal:

I like these metrics as additions to release_filepath

For the number of files that the OWNER owns, take a look at this paper and consider the Entropy metric: http://macbeth.cs.ucdavis.edu/enteco.pdf
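One plausible reading of an entropy metric here is plain Shannon entropy over how an owner's files are spread across directories. This sketch is not taken from the linked paper and may differ from its definition; `ownership_entropy` is a hypothetical name.

```python
import math


def ownership_entropy(counts):
    """Shannon entropy (in bits) of a list of non-negative counts.

    counts: e.g. number of files one owner owns in each directory.
    Returns 0.0 for an empty or all-zero list.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts:
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy
```

Under this reading, an owner whose files all sit in one directory scores 0, while an owner spread evenly over many directories scores higher, which could serve as a rough signal of divided attention.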

dani5447 commented 9 years ago

From meeting with @rikal:

dani5447 commented 9 years ago

From meeting with @rikal:

rikal commented 9 years ago

Just expanding on the notes from Danielle: the metrics under consideration right now are focused on these areas:

Team Makeup

Ownership Burden or Load

Share of Contributions (in File or Directory)

Size of Contributions: Churn (File/Directory)

Further thoughts are on change in team makeup for OWNERS over time, focusing on deletions (loss of knowledge) and additions (increased risk due to inexperience?)…

andymeneely commented 9 years ago

Latest metric list:

Team Makeup

Ownership Burden or Load

Share of Contributions (in File or Directory)

Size of Contributions: Churn (File/Directory)

Further thoughts are on change in team makeup for OWNERS over time, focusing on deletions (loss of knowledge) and additions (increased risk due to inexperience?)…

Metrics we want to use but haven't figured out yet: