cdolfi closed this 11 months ago
@JamesKunstle @oindrillac @hemajv @mariashev curious to hear y'all's thoughts on this idea
@cdolfi I like the seed of this idea a lot; it's more doable than visualizing the (contributor<-->file) network I've suggested before. I think this is a great first step in the direction we're interested in. I do think that building a visualization of this is tough, though.
Starting from the ideal, it'd be awesome if we could render a heat map over the tree-shaped file structure of a project / sub-part of a project.
For instance, if we rendered something like the output of tree:
redis
├── Dockerfile
├── dump.rdb
├── redis.conf
└── redis.conf.bak
And then had the filenames and folders colored relative to contributor attendance, or had a bar along the right-hand side of each file colored accordingly.
I'm not sure how to do that, though. Maybe @GregSutcliffe would have an idea for this?
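As a very rough, text-only stand-in for the "bar along the right-hand side" idea, something like the sketch below could work. It only handles a single flat directory, and the names `render_listing` and `COUNTS`, as well as the contributor counts themselves, are made up for illustration; nothing here comes from an existing tool.

```python
# Hypothetical contributor counts per file in one directory.
COUNTS = {
    "Dockerfile": 4,
    "dump.rdb": 1,
    "redis.conf": 2,
    "redis.conf.bak": 0,
}


def render_listing(root, contributors, width=8):
    """Return a tree-style listing of one directory, with a bar per file
    whose length is proportional to that file's contributor count."""
    names = sorted(contributors)
    # Scale bars against the busiest file; fall back to 1 to avoid dividing by 0.
    peak = max(contributors.values(), default=0) or 1
    pad = max((len(name) for name in names), default=0)
    lines = [root]
    for i, name in enumerate(names):
        branch = "└── " if i == len(names) - 1 else "├── "
        bar = "█" * round(width * contributors[name] / peak)
        # rstrip() drops the padding when a file has an empty bar.
        lines.append(f"{branch}{name.ljust(pad)}  {bar}".rstrip())
    return "\n".join(lines)
```

Calling `print(render_listing("redis", COUNTS))` gives a listing shaped like the tree above with a bar next to each file. A real version would need nested folders and color rather than bar length, which is the hard part.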
So yes, I've done variants of this before. One example can be found here in which I generate a set of "bubbles", one per file, coloured by the average staff/non-staff commits to that file. I then roll that up the tree to get a colour for each parent directory in the tree, until you get to the root (and an overall average).
The code (while old) should still work in R, and I'd be happy to go into more detail if this looks useful.
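To make the roll-up step concrete, here is a minimal Python sketch. It is not the R code mentioned above; it is an assumption about how the roll-up could work. In particular it pools raw commit counts at each directory rather than averaging the children's averages, and `FILE_COMMITS` and `roll_up` are hypothetical names with made-up data.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical per-file commit counts: path -> (staff commits, non-staff commits).
FILE_COMMITS = {
    "redis/Dockerfile": (3, 1),
    "redis/redis.conf": (1, 3),
    "docs/README.md": (0, 4),
}


def roll_up(file_commits):
    """Return {path: staff share of commits} for every file, every parent
    directory, and the root ("."), pooling commit counts up the tree."""
    totals = defaultdict(lambda: [0, 0])
    for path, (staff, non_staff) in file_commits.items():
        node = PurePosixPath(path)
        # Add the counts to the file itself and to each ancestor, ending at ".".
        for target in (node, *node.parents):
            totals[str(target)][0] += staff
            totals[str(target)][1] += non_staff
    return {
        path: staff / (staff + non_staff)
        for path, (staff, non_staff) in totals.items()
        if staff + non_staff > 0  # skip paths with no commits at all
    }
```

Each resulting share (0.0 = all non-staff, 1.0 = all staff) could then be mapped onto a color scale, one bubble per file and one color per directory.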
The following visualization is described through the lens of reviewers who have previously reviewed code in a given segment of the codebase (folders or files; how granular it should go is TBD). The same concept can be applied to contributors who have previously had a PR merged in that segment of the codebase.
The initial concept of the visualization is as follows:
Then one of the following plotly heat maps is used:
https://plotly.com/python/heatmaps/
https://plotly.com/python/2D-Histogram/
x axis: repository folder (or file, need to workshop this a bit)
y axis: date by month in descending order
z axis (color): number of reviewers that have been active in the time interval (relative to the month block)
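A minimal sketch of how those three axes could be assembled, assuming review events are available as (reviewer, folder, date) tuples. The `REVIEWS` data and the function names are hypothetical, and "active" is interpreted here as "reviewed at least once in that folder during that month", which is only one possible reading.

```python
from collections import defaultdict
from datetime import date

# Hypothetical review events: (reviewer, folder, date of review).
REVIEWS = [
    ("alice", "src/api", date(2023, 1, 10)),
    ("bob", "src/api", date(2023, 1, 22)),
    ("alice", "src/api", date(2023, 1, 30)),  # same reviewer and month: counted once
    ("alice", "src/api", date(2023, 2, 3)),
    ("carol", "docs", date(2023, 2, 14)),
]


def reviewer_matrix(reviews):
    """Return (folders, months, z), where z[i][j] is the number of distinct
    reviewers active in folders[j] during months[i]."""
    active = defaultdict(set)  # (month, folder) -> set of reviewer names
    for reviewer, folder, day in reviews:
        active[(day.strftime("%Y-%m"), folder)].add(reviewer)
    folders = sorted({folder for _, folder in active})
    months = sorted({month for month, _ in active}, reverse=True)  # newest first
    z = [[len(active.get((m, f), ())) for f in folders] for m in months]
    return folders, months, z


def build_figure(reviews):
    """Build a plotly heat map from the matrix (requires plotly to be installed)."""
    import plotly.graph_objects as go  # imported lazily so the matrix code has no dependencies

    folders, months, z = reviewer_matrix(reviews)
    fig = go.Figure(go.Heatmap(x=folders, y=months, z=z, colorscale="Viridis"))
    fig.update_layout(xaxis_title="Repository folder", yaxis_title="Month")
    # Treat months as labels; the reversed range is intended to put the newest month on top.
    fig.update_yaxes(type="category", autorange="reversed")
    return fig
```

With the sample data, `reviewer_matrix(REVIEWS)` yields the rows `[1, 1]` for 2023-02 and `[0, 2]` for 2023-01 over the folders `docs` and `src/api`; `build_figure(REVIEWS).show()` would render it. A folder whose column fades toward zero in recent months is the kind of signal this is meant to surface.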
Open to hearing other ideas on how this data should be visualized. The information I would like to get across to the user is whether there is a segment of the codebase that might be losing knowledge retention or is trending in that direction. How exactly to show the number of reviewers (contributors) and time in both contexts (the time since a reviewer's last activity, and the distribution of that in combination with the number of reviewers) is something I don't have a locked-in vision for. The initial idea I have proposed is just a starting point.