
How could we improve the bus factor metric? #2

Open · jacques-n opened this issue 7 months ago

jacques-n commented 7 months ago

I've gotten pushback from several people privately about the statement that Iceberg has a bus factor of two people.

Some common complaints:

  1. The bus factor calculation uses 50% cumulative code coverage as its threshold, and the number of people needed to reach that coverage as the count of people critical to the project. 50% is arbitrary. (A sketch of this calculation is included at the end of this comment.)
  2. The bus factor calculation under-appreciates diversity of contribution by looking only at the fraction of the code deemed "core".
  3. Constraining the analysis to "core" code unfairly penalizes projects that are more mature and have an unchanging core.
  4. The bus factor calculation looks only at the current state of the code, not the sequence of changes that produced it. This means that the last touch wins, which can produce misleading results. (And mechanical refactors can skew the results.)
  5. Can we come up with better ways to calculate the bus factor?
  6. Is there additional data or intuition that corroborates or conflicts with the bus factor calculation (i.e., does the metric make sense)?
  7. Why wasn't this analysis based on PMC member count?
  8. Do I really think that if two people were hit by a bus, the Iceberg project would end?
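
For reference, here is a minimal sketch of the kind of calculation being questioned above: attribute each line of "core" code to the author who touched it last (plain `git blame`, which is exactly the "last touch wins" behavior described in point 4), then count how many authors it takes to reach a cumulative-coverage threshold. The `CORE_PATHS` directories, the `*.java` glob and the thresholds are illustrative assumptions, not the exact parameters used in the analysis.

```python
# A minimal sketch, not the original analysis: attribute each line of "core" code
# to the author who touched it last (plain `git blame`, i.e. "last touch wins"),
# then count how many authors are needed to reach a cumulative-coverage threshold.
# CORE_PATHS, the *.java glob and THRESHOLDS are illustrative assumptions.
import subprocess
from collections import Counter
from pathlib import Path

CORE_PATHS = ["api", "core"]      # assumption: directories treated as "core"
THRESHOLDS = [0.50, 0.75, 0.90]

def blame_line_counts(repo: str, paths: list[str]) -> Counter:
    """Lines currently attributed to each author across the given paths."""
    root = Path(repo).resolve()
    counts: Counter = Counter()
    for path in paths:
        for file in (root / path).rglob("*.java"):
            out = subprocess.run(
                ["git", "blame", "--line-porcelain", str(file.relative_to(root))],
                cwd=root, capture_output=True, text=True, check=True,
            ).stdout
            # --line-porcelain repeats the commit metadata for every line,
            # so each "author " header corresponds to exactly one source line.
            for line in out.splitlines():
                if line.startswith("author "):
                    counts[line[len("author "):]] += 1
    return counts

def people_to_reach(counts: Counter, threshold: float) -> int:
    """How many of the top authors are needed to cover `threshold` of the lines."""
    covered, total = 0, sum(counts.values())
    for people, (_, lines) in enumerate(counts.most_common(), start=1):
        covered += lines
        if covered / total >= threshold:
            return people
    return len(counts)

if __name__ == "__main__":
    counts = blame_line_counts(".", CORE_PATHS)
    for t in THRESHOLDS:
        print(f"{int(t * 100)}% coverage: {people_to_reach(counts, t)} people")
```
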
jacques-n commented 7 months ago

Some thoughts on each question:

  1. 50% is arbitrary, but the specific threshold matters less than the relative number across projects. The same pattern can be seen at 50%, 75% and 90% coverage, where the Iceberg project has fewer people involved in each group.

  2. The bus factor analysis isn't really interested in diversity of contribution as much as ownership of core code. As mentioned in the analysis, core code is typically owned by the people most instrumental in the project, whereas things like connectors, language bindings and catalog implementations might be very tangential to the project. As noted elsewhere, all three projects have several hundred contributors who have made at least one contribution. The vast majority of those people could disappear and it wouldn't be problematic for the project. This calculation isn't interested in the long tail, so it is helpful to focus on the core impetus of the project (in Iceberg's case, the spec and the core library).

  3. The speed of change isn't really considered in the bus factor calculation. If only one person worked on a piece of code and that code is really important, the exposure is much the same whether they worked on it recently or a long time ago. Further, I don't believe there is a substantial difference in maturity between the three projects analyzed. (For example, I think in many cases Iceberg is actively adding features already present in Hudi and/or Deltalake.)

  4. This is definitely a problem. In a perfect world, it'd be nice to look at code touch while ignoring cosmetic changes, but it wasn't clear how to do this. Open to ideas for a better analysis here (one possible starting point is sketched at the end of this comment).

  5. There are surely other ways. A couple that would be great to look at:

    • What is the involvement (and relative authority) of participants in major contributions? E.g. is there a small number of people who are highly involved in all major reviews of things like the specification or the most important connector implementations?
    • The relative progress of work when top contributors go on vacation, and whether people are more likely to proceed or to pause until "key people" return to provide reviews/guidance.
    • Snapshot the same metric at quarterly or yearly milestones over several years to see whether it is stable or changes in intuitive ways.
  6. Intuitively, the two things that seem a bit awry after talking to a bunch of people are:

    • Anton's contributions are not accurately represented. Overall, he is the second biggest contributor to the Iceberg project and yet doesn't come up in the bus factor analysis. Given his impact on the project, he is likely on the list of critical resources. This suggests that it might make sense to include some core code from the Spark client, given the relative importance of that client (note that it is included in the Delta analysis because the "core" isn't abstracted).
    • Eduard's contributions may be overstated due to some refactoring work (this is not data driven; it was provided as casual feedback and has not been vetted).
  7. If you review the PMC member list, a large portion of those people are not actively involved in the project and may have made zero or near-zero contributions ever. PMC lists are often initially populated with people who might help, and it takes years before they are primarily people who do help. As an example, of the first 20 PMC members on Apache Arrow, I believe only 1 or 2 are actually involved in the project at any level.

  8. No. Iceberg is too important to too many organizations to be destroyed by the loss of any developer. However, the purpose of the metric is to provide directional guidance. In that light, it seems helpful to identify that ownership and top karma are potentially held amongst fewer people than is ideal.
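
On point 4, one partial and unvetted mitigation: `git blame` has options that discount some cosmetic churn on its own. `-w` ignores whitespace-only changes, `-M`/`-C` attribute moved or copied lines to their original author, and `--ignore-revs-file` skips commits listed as known mechanical refactors. Here is a sketch, assuming the repo keeps a `.git-blame-ignore-revs` file (hypothetical here) with one refactor commit hash per line; the function name is illustrative.

```python
# A possible (unvetted) refinement for point 4: let git itself discount cosmetic
# churn. `-w` ignores whitespace-only changes, `-M`/`-C` attribute moved or copied
# lines to their original author, and `--ignore-revs-file` skips commits listed as
# known mechanical refactors. The .git-blame-ignore-revs file is hypothetical here;
# it must exist in the repo and contain one commit hash per line.
import subprocess
from collections import Counter

def blame_ignoring_cosmetic(repo: str, file: str) -> Counter:
    """Per-author line counts for one file, discounting whitespace and moved lines."""
    out = subprocess.run(
        ["git", "blame", "-w", "-M", "-C",
         "--ignore-revs-file", ".git-blame-ignore-revs",
         "--line-porcelain", file],
        cwd=repo, capture_output=True, text=True, check=True,
    ).stdout
    return Counter(
        line[len("author "):]
        for line in out.splitlines()
        if line.startswith("author ")
    )
```

This wouldn't catch every mechanical change, but it would blunt the "refactors skew the results" problem without changing the rest of the calculation.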