Closed by andymeneely 10 years ago
From one quick manual inspection, it looks like there are some misspellings like "chormium.org" instead of "chromium.org". Maybe do our own rename?
We've had a clean build for a few days now, so we can now do this. Assigning to Kayla.
From @kayladavis (not sure why the email Github thing didn't pick this up...)
There seems to be a lot of overlap in the local part (the text before the @) between chromium.org and gmail.com, but there are also duplicate locals with other domains. Just from sorting them and starting to go down the list, I found these:
(aa.chromium@gmail.com, aa@chromium.org), (abarth@gmail.com, abarth@webkit.org, abarth@chromium.org), (akalin@chromium.org, akalin@gmail.com), (albertb@chromium.org, albertb@gmail.com), (amanda@alfar.com, amanda@chromium.org)
Hm... can you look into some of those to see if they are in fact the same person? Maybe we can translate them to @chromium.org emails. I'm a little concerned about accidentally collapsing two people, e.g. bob@chromium.org and bob@gmail.com
I'll do a group-by to see how many developers would be reduced if we did this.
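As a rough illustration of that group-by, here is a minimal Python sketch that counts how many developer rows would disappear if addresses sharing a local part were collapsed. The sample list, the `MERGE_DOMAINS` set, and the `count_reduction` helper are assumptions made for the example; the real input would be the developers table.

```python
from collections import defaultdict

# Hypothetical sample taken from the addresses mentioned in this thread.
emails = [
    "aa@chromium.org",
    "akalin@chromium.org",
    "akalin@gmail.com",
    "albertb@chromium.org",
    "albertb@gmail.com",
    "amanda@alfar.com",
    "amanda@chromium.org",
]

# Assumption: only these two domains are considered safe candidates for merging.
MERGE_DOMAINS = {"chromium.org", "gmail.com"}


def count_reduction(emails):
    """Count rows that would be removed by collapsing same-local addresses."""
    domains_by_local = defaultdict(set)
    for email in emails:
        local, _, domain = email.strip().lower().rpartition("@")
        if domain in MERGE_DOMAINS:
            domains_by_local[local].add(domain)
    # Each local part keeps one row; every extra domain is a removed row.
    return sum(len(domains) - 1 for domains in domains_by_local.values())


print(count_reduction(emails))
```

This only measures the size of the change. It says nothing about whether the collapsed addresses really belong to one person, which is the concern raised above.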
With things that might not be one person and are related to chromium/chrome:
chrome-admin@chromium.org
chrome-apps-syd-reviews@chromium.org
chrome-ui-leads@chromium.org
chrome-ui-review@chromium.org
chrome-valgrind-team@chromium.org
chrome@cybernium.net
chromeos-lkgm@chromium.org
chromeos-privacy@chromium.org
chromeos-security@chromium.org
chromepmo@chromium.org
chromiumproblem@gmail.com
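A keyword heuristic could help surface addresses like these for manual review. The sketch below is only an assumption about how that might look: the `ROLE_KEYWORDS` tuple is hand-picked from the list above and the function name is hypothetical. It would miss something like chrome@cybernium.net, whose local part has no telling keyword.

```python
# Assumption: illustrative keywords chosen from the addresses listed above.
ROLE_KEYWORDS = (
    "admin", "review", "leads", "team", "security",
    "privacy", "lkgm", "pmo", "problem",
)


def looks_like_group_account(email):
    """Return True if the local part contains a keyword suggesting a list or role."""
    local = email.lower().partition("@")[0]
    return any(keyword in local for keyword in ROLE_KEYWORDS)


candidates = [
    "chrome-valgrind-team@chromium.org",
    "chromepmo@chromium.org",
    "abarth@chromium.org",
]
for email in candidates:
    print(email, looks_like_group_account(email))
```

Substring matching is deliberately loose, so it will produce false positives. The output is meant as a list to inspect by hand, not a final classification.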
We also seem to have a possible problem with people typing 'chromuim.org' instead of 'chromium.org' (see groby@chromium.org and groby@chromuim.org), and also 'chroium.org' (grt@chroium.org) and 'chcromium.org' (hbono@chcromium.org).
Ok that should be another misspelling handled by #150.
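One way to find misspelled domains like these without reading the whole list is to flag domains within a small edit distance of chromium.org. This is a sketch under assumptions: the threshold of 2 and the helper names are made up for illustration, and it is not necessarily how #150 handles it.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, via dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


CANONICAL_DOMAIN = "chromium.org"
MAX_DISTANCE = 2  # Assumption: a swapped pair of letters counts as two edits.


def is_probable_misspelling(email):
    """True if the domain is close to, but not equal to, chromium.org."""
    domain = email.lower().rpartition("@")[2]
    return 0 < edit_distance(domain, CANONICAL_DOMAIN) <= MAX_DISTANCE


for email in ["groby@chromuim.org", "grt@chroium.org", "groby@chromium.org"]:
    print(email, is_probable_misspelling(email))
```

Anything flagged would still need a manual look before it is treated as a correction.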
Finished my quick-ish skim. I've learned how many ways chromium can be misspelled. All of these have an actual chromium.org that goes with it.
'chroimum' (jamesr@chroimum.org) 'chromioum' (jcampan@chromioum.org) 'chroimum' (leviw@chroimum.org) 'chromium.com' (mpcomplete@chromium.com) 'chromoium' (nkostylev@chromoium.org)
I'm not sure about stuff like this either: open-source-third-party-reviews@chromium.org
Holy cow that's funny. Let's include all of those as corrections.
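The corrections could be applied with a simple lookup table. The sketch below uses only the misspelled domains reported in this thread. The `DOMAIN_CORRECTIONS` name and the `correct_email` helper are hypothetical, not existing project code.

```python
# Misspelled domains reported in this thread, each mapped to the intended domain.
DOMAIN_CORRECTIONS = {
    "chormium.org": "chromium.org",
    "chromuim.org": "chromium.org",
    "chroium.org": "chromium.org",
    "chcromium.org": "chromium.org",
    "chroimum.org": "chromium.org",
    "chromioum.org": "chromium.org",
    "chromium.com": "chromium.org",
    "chromoium.org": "chromium.org",
}


def correct_email(email):
    """Rewrite a known-misspelled domain; leave every other address untouched."""
    email = email.strip()
    local, at, domain = email.rpartition("@")
    if not at:
        return email  # No '@' present, so there is nothing to correct.
    fixed = DOMAIN_CORRECTIONS.get(domain.lower(), domain)
    return f"{local}@{fixed}"


print(correct_email("jamesr@chroimum.org"))
print(correct_email("abarth@gmail.com"))
```

An explicit table is easier to audit than fuzzy matching here, since every rewrite corresponds to a misspelling someone actually looked at.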
I'm curious about that open-source-third-party-reviews one - can you find a situation where that was used?
Also, I've got an interesting query for you to run from psql:
SELECT * FROM
  (SELECT count(*) as num_dups,
          string_agg(email,',')
   FROM developers
   GROUP BY (substring(email from '^.*@'))
  ) as countquery
WHERE num_dups > 1;
It returns 120 rows, all of which are emails with the same local but different domains. This would include misspellings, but there are plenty of situations where it looks like they're the same person. Look into some of these and see if you can confirm that they are the same person.
Also, are these the same people? matt@tolton.com,matt@gundam.eu
In reference to your question about gmail.com/chromium.org, I have some data showing that two of these duplicate locals are the same person. There might be a pattern of the gmail.com addresses being older and no longer used, but I'm not sure. Also, I can't tell whether matt@tolton.com and matt@gundam.eu are the same person: there are only three code reviews between those emails, and the names don't match up.
aa.chromium@gmail.com and aa@chromium.org as seen in these issues: (https://codereview.chromium.org/331563003/, https://codereview.chromium.org/9963133/)
akalin@chromium.org and akalin@gmail.com as seen in these issues: (https://codereview.chromium.org/209070/,https://codereview.chromium.org/138273017/)
Regarding the third-party-reviews address: this page, http://www.chromium.org/developers/adding-3rd-party-libraries, states that "All third party additions should go through a Chrome Eng Review before being checked in. The initial submission (and any substantive change, like relicensing) of third party code requires review from open-source-third-party-reviews@google.com and security@chromium.org (ping the list with relevant details and a link to the CL)."
It is listed as a reviewer in this issue: https://chromiumcodereview.appspot.com/291783002/. chromium-reviews is cc'ed and has a message in the issue, so chromium-reviews might be what handles this.
Right now I can't say for sure if reviewers with the same local are the same person.
Of course, there are 117 more duplicate locals that haven't been checked. I'm not sure we can cover them all, since it's only clear in some cases that the duplicate is the same person.
Let's call this done and revisit a little bit later with more people. LGTM
Once we get a clean build, use this command
psql chromium_real -c "SELECT email FROM developers"
to inspect the list of developers in the system. Look for bots or weird email accounts. Try to look for potential duplicates or something like that. Comment below with any weird cases you find.