andymeneely / chromium-history

Scripts and data related to Chromium's history

Are people using multiple email addresses? #90

Closed andymeneely closed 10 years ago

andymeneely commented 10 years ago

Take a look at code review 5831706594508800 in our development data. It looks like two of the reviewers are the same person:

skaslev@chromium.org
skaslev@google.com

How often does this happen? What types of disambiguation should we do? Maybe something as part of rake run:consolidate?
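One rough way to estimate how often this happens is to group addresses by the part before the "@" and flag any local part that appears under more than one domain. The sketch below is only an illustration in Python, not the project's consolidation code; the function name `group_by_local_part` and the sample list are hypothetical, and a shared local part is a hint that two addresses belong to one person, not proof.

```python
from collections import defaultdict


def group_by_local_part(emails):
    """Group addresses that share the part before '@' (case-insensitive).

    Hypothetical helper: returns only the local parts that show up with
    more than one distinct address, i.e. candidates for manual review.
    """
    groups = defaultdict(set)
    for email in emails:
        local, sep, domain = email.strip().lower().rpartition("@")
        if not sep or not local or not domain:
            continue  # skip anything that is not of the form local@domain
        groups[local].add(f"{local}@{domain}")
    # Keep only local parts seen with more than one distinct address.
    return {
        local: sorted(addrs)
        for local, addrs in groups.items()
        if len(addrs) > 1
    }


# Sample data: the two addresses from the code review plus an unrelated one.
reviewers = ["skaslev@chromium.org", "skaslev@google.com", "someone@chromium.org"]
candidates = group_by_local_part(reviewers)
print(candidates)
```

Counting the keys of the returned dictionary against the total number of distinct local parts would give a first estimate of how common the problem is. Any automatic merge based on it would still need a manual check, since two different people can share a username across domains.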

andymeneely commented 10 years ago

Assigning this to @bspates

bspates commented 10 years ago

@andymeneely, which is the current production db? Is it chromium_real? I just wanted to get a good sample of emails to test with.

andymeneely commented 10 years ago

Sadly, I made a mistake and blew away production last night. It'll rebuild tonight and you should have access to it.

bspates commented 10 years ago

Looks like I don't have permission to query the chromium_real or chromium_real2 dbs

bspates commented 10 years ago

Looks like there is a Google group under committers@chromium.org that lists all the committers to the project. Apparently write access is granted by scraping this list for the committer's email. This conversation, https://code.google.com/p/chromium/issues/detail?id=237072, talks about eventually allowing people with non-"@chromium.org" addresses to commit. That seems confusing considering the data we've gathered.

bspates commented 10 years ago

@andymeneely I forgot to put your name in the last two comments, but I need read access to the prod data. Also, check out the comment before this one, because I'm a little confused by some stuff I found.

andymeneely commented 10 years ago

Oops, right. Just work with chromium_real and not chromium_real2, you have read permissions but not write.

andymeneely commented 10 years ago

Here are the measures we discussed that we will take:

In a separate consolidation routine AFTER all data has been loaded,

Once we do all of the above, let's start a manual investigation of the ones we have left. We can do a manual mapping of our own.

bspates commented 10 years ago

@andymeneely looks like I lost permission to the chromium_real db again.

andymeneely commented 10 years ago

Shoot. Looks like this is going to happen every day because of the way I do the rename from chromium_real2 to chromium_real. I just gave you admin. Be careful.

bspates commented 10 years ago

PR: https://github.com/andymeneely/chromium-history/pull/122

bspates commented 10 years ago

From the prod logs, it looks like some gtemp accounts snuck through somehow. It also seems to be rejecting some perfectly good email addresses. Will investigate.

andymeneely commented 10 years ago

Closed for now since we've gotten it all tested and running with the new performance optimizations. The other issue, #143, is still open.