Closed by andymeneely 10 years ago
Also, work with @toroidal-code on this. This task involves some design work and some new models.
Need more clarification:
• There is no Developer model for us to relate to Owners. Will we need to generate that as well?
• Do we need to keep track of every date of addition and removal (i.e., an owner can be added and removed several times for the same file), or only the most recent addition and removal?
Notes & Thoughts:
• As of now we will need 2 new models: Owners & OwnerDates. OwnerDates will represent the dates on which an Owner was added to or removed from an OWNERS file, so we can implement dev.is_owner?(date).
• The biggest challenge is getting those dates from the git log history using the commits model.
Tasks:
• @toroidal-code will be working on the OWNERS file parser.
• I will be finishing up my current task for commit files and revisiting the dates associated with those to find the best way to extract the dates for the OwnerDates.
I think the Developer model is supposed to be created based on issue #27, which I was going to be working on tonight at 7:30 like my email said. I was hoping to get some feedback from Meneely about my questions for that, but I haven't heard anything.
Yes, use the Developer model that Shannon will make.
Yes, keep track of all dates. Since OWNERS change over time, we'll need to look up whether or not a developer was an OWNER at the time of the code review, hence the method call dev.is_owner?(date).
I think all you need is Developer and OwnerDate, although I'm thinking Owner is what it should be called. Maybe have that with a start_date and end_date, with no uniqueness constraint on the Developer association? So I think, technically, all you need is one table for it.
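The one-table idea can be sketched in plain Ruby, without Rails, to check that the date lookup behaves as intended. Everything here is an assumption for illustration: the Owner struct stands in for the proposed table, the filepath field and the exclusive end_date are guesses, and Developer is a stand-in for the model from issue #27.

```ruby
require 'date'

# Hypothetical stand-in for the proposed Owner table: one row per stretch of
# ownership. end_date is nil while the developer is still listed.
Owner = Struct.new(:filepath, :start_date, :end_date) do
  # True if the date falls inside this stretch (end_date treated as exclusive,
  # which is an assumption, not something the thread decided).
  def covers?(date)
    start_date <= date && (end_date.nil? || date < end_date)
  end
end

# Hypothetical stand-in for the Developer model.
class Developer
  attr_reader :email, :owners

  def initialize(email)
    @email = email
    @owners = [] # no uniqueness constraint: many rows per developer and file
  end

  # No date: was this developer ever an OWNER of any file?
  # With a date: was this developer an OWNER of any file as of that date?
  def is_owner?(date = nil)
    return !owners.empty? if date.nil?

    owners.any? { |owner| owner.covers?(date) }
  end
end

dev = Developer.new('dev@example.com')
dev.owners << Owner.new('base/OWNERS', Date.new(2012, 1, 10), Date.new(2012, 6, 1))
dev.owners << Owner.new('base/OWNERS', Date.new(2013, 3, 5), nil) # re-added later

puts dev.is_owner?                        # true
puts dev.is_owner?(Date.new(2012, 3, 1))  # true
puts dev.is_owner?(Date.new(2012, 9, 1))  # false (between the two stretches)
```

Because a developer can have several rows for the same file, being added, removed, and re-added needs no special handling.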
Yup, getting those dates based on the git log will be a challenge. I suggest writing some scripts that search the repository and output the results to data files we can just load each time.
Please create appropriate small-scale test data in test/data as well.
Let's revise it this way:
First, let's get dev.owner? working. Ignore dates, files, etc.
Next, we'll need to focus on the more complex data collection scripts and modeling for dev.owner?(file,date). This means going through Git diffs, parsing the BNF, etc.
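Once each revision of an OWNERS file has been parsed into a list of developers, the add and remove dates could be derived by comparing consecutive revisions. The sketch below assumes that parsing is already done and that the snapshots arrive oldest first. The ownership_intervals name and the [date, emails] input shape are hypothetical.

```ruby
require 'date'
require 'set'

# snapshots: array of [date, emails] pairs, oldest first, one per revision of
# a single OWNERS file (hypothetical input shape).
# Returns { email => [[start_date, end_date], ...] }, where end_date is nil
# if the email is still listed in the last snapshot.
def ownership_intervals(snapshots)
  intervals = Hash.new { |hash, email| hash[email] = [] }
  previous = Set.new

  snapshots.each do |date, emails|
    current = emails.to_set
    # Emails that newly appear open an interval.
    (current - previous).each { |email| intervals[email] << [date, nil] }
    # Emails that disappeared close their open interval.
    (previous - current).each { |email| intervals[email].last[1] = date }
    previous = current
  end

  intervals
end

snapshots = [
  [Date.new(2012, 1, 10), ['a@example.com', 'b@example.com']],
  [Date.new(2012, 6, 1),  ['b@example.com']],
  [Date.new(2013, 3, 5),  ['a@example.com', 'b@example.com']]
]
result = ownership_intervals(snapshots)
```

Here a@example.com ends up with two intervals (one closed on 2012-06-01, one still open) and b@example.com with a single open interval. Each interval would map onto one row with a start_date and end_date.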
Here's what I've got so far for finding all modifications to OWNERS files:
git log --stat --all --pretty=oneline -- '*OWNERS'
If you just want the commit hash then this will do:
git log --stat --all --pretty='%H' -- '*OWNERS'
What I've ended up using, as it's the most machine-readable:
git log --name-status --all --pretty="format:%H" -- '*OWNERS'
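A small Ruby script could turn that output into something loadable. This is a sketch under two assumptions: the command prints each commit hash on its own line followed by tab-separated status and path lines, and hashes are 40-character SHA-1 values. The parse_owners_log name is hypothetical.

```ruby
# Parses the text printed by
#   git log --name-status --all --pretty="format:%H" -- '*OWNERS'
# into { commit_hash => [[status, path], ...] }.
# Assumes 40-character SHA-1 hashes and tab-separated status lines.
def parse_owners_log(text)
  changes = {}
  current = nil

  text.each_line do |line|
    line = line.chomp
    next if line.empty?

    if line.match?(/\A[0-9a-f]{40}\z/)
      # A bare hash starts a new commit.
      current = line
      changes[current] = []
    elsif current
      status, *paths = line.split("\t")
      next if paths.empty?

      # Renames and copies list two paths; keep the last (new) one.
      changes[current] << [status, paths.last]
    end
  end

  changes
end

# Made-up sample output, not taken from a real repository.
first = 'a' * 40
second = 'b' * 40
sample = [first, "M\tbase/OWNERS", "A\tnet/OWNERS", '', second, "D\tnet/OWNERS"].join("\n")

parsed = parse_owners_log(sample)
```

The resulting hash could then be written to a data file and loaded on each run, rather than walking the repository every time.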
Haven't seen much activity here. Let's put this on ice for now and focus on other issues.
I'm closing this for now. We may want to refer to it later, but we'll approach our OWNERs analysis differently this year.
The end-goal is to have two methods in Developer that look like this:
• dev.is_owner? would tell us if this person is ever an OWNER in any file.
• dev.is_owner?(date) would tell us if this person was an OWNER in any file as of that date.
To get there, we'll need a few Owner relations that keep track of the history.
We'll have to go into the Git history to get every copy of each OWNERS file, so perhaps parsing the Git log is a prerequisite (#25).