MetricsGrimoire / Bicho

Bicho is a command line based tool used to parse bug/issue tracking systems
http://metricsgrimoire.github.com/Bicho/
GNU General Public License v2.0

name field not being filled in people table #122

Open jgbarah opened 10 years ago

jgbarah commented 10 years ago

The current version of Bicho does not retrieve the name of each GitHub user involved in an issue, so the name field in the people table is left unfilled.

jgbarah commented 10 years ago

After looking at the GitHub API documentation, it seems that getting the name of a GitHub user involves an extra API call per user, since the name is not returned by the query that fetches an issue. Therefore, I suggest that when a new entry is inserted in the people table, a call is made (e.g. for user jgbarah) to:

https://api.github.com/users/jgbarah

which returns a JSON document with the name:

{
  "login": "jgbarah",
  "id": 1039693,
  ...  
  "name": "Jesus M. Gonzalez-Barahona",
  ...
}

So, it is just a matter of getting the name field in that JSON document...
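A minimal sketch of that extraction, assuming the response body is already available as a string. `extract_name` is a hypothetical helper, not part of Bicho. The `name` field can be null or absent for users who have not set a display name, so the sketch returns None in that case instead of raising.

```python
import json


def extract_name(user_json):
    """Return the 'name' field of a GitHub user document, or None."""
    user = json.loads(user_json)
    # 'name' may be missing or null when the user has no display name set.
    return user.get("name")


# Trimmed-down sample document with only the fields shown in this thread.
sample = '{"login": "jgbarah", "id": 1039693, "name": "Jesus M. Gonzalez-Barahona"}'
print(extract_name(sample))
```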

jgbarah commented 10 years ago

I'm working on this, but I'm running into some problems.

I started by writing some code to get the GitHub API user JSON document and extract its fields (all of the code below is in backends/github.py):

import base64
import json
import urllib2

class GithubBackend(Backend):
    ...
    def __get_user(self, username):
        """Fetch the GitHub API user document for username as a dict."""
        url = "https://api.github.com/users/" + username
        # HTTP Basic auth; b64encode adds no newlines, unlike encodestring.
        credentials = base64.b64encode(
            '%s:%s' % (self.backend_user, self.backend_password))
        request = urllib2.Request(url)
        request.add_header("Authorization", "Basic %s" % credentials)
        result = urllib2.urlopen(request)
        content = result.read()
        user = json.loads(content)
        return user

And the code for storing the resulting name field in the people table:

        submitted_by = People(bug['user']['login'])
+        user = self.__get_user(submitted_by.user_id)
+        submitted_by.set_name(user['name'])

The same is done wherever People objects are instantiated: for assignee, for by (in comments) and for by (in activities).

But the problem is that I shouldn't be calling __get_user every time I find a person, because in most cases that person is already in the database. I should only call __get_user the first time a person is found and, in that case, assign the proper name to it and store it in the database (that is, only one call per person). And I can't find a condition that matches this first occurrence.
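Within a single run, "one call per person" could be approximated with an in-memory cache keyed by login. This is only a sketch under assumptions: `UserNameCache` and `fake_fetch` are hypothetical names, not Bicho code, and the cache does not survive across runs, so it does not by itself answer whether the person is already stored in the database.

```python
class UserNameCache(object):
    """Remember names already fetched, so each login triggers one lookup."""

    def __init__(self, fetch_user):
        # fetch_user: callable taking a login and returning the user dict
        # (for instance, a wrapper around the __get_user method above).
        self._fetch_user = fetch_user
        self._names = {}

    def name_for(self, login):
        # Only call the fetcher the first time a login is seen.
        if login not in self._names:
            self._names[login] = self._fetch_user(login).get("name")
        return self._names[login]


# Stand-in fetcher that records calls instead of hitting the network.
calls = []


def fake_fetch(login):
    calls.append(login)
    return {"login": login, "name": "Name of " + login}


cache = UserNameCache(fake_fetch)
cache.name_for("jgbarah")
cache.name_for("jgbarah")
print(len(calls))  # 1: the second request is served from the cache
```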

I've tried with

if submitted_by.name is None:

But the People object is always initialized "from scratch", without considering what is already in the database. That means that submitted_by.name is always None...

So, I'm wondering how I can check whether the people table already has an entry for a specific people.user_id, or at least whether the people.name for it is not None ("None" in the database). Another idea would be to initialize the People object with the actual data for that person from the people table, if present, instead of always starting from scratch, but I don't know whether that would be more complex.
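The first option, checking the table before calling the API, might look roughly like the sketch below. Everything about the storage here is an assumption: Bicho's actual database layer and schema are not shown in this thread, so `sqlite3`, the two-column `people` table, and the helpers `stored_name` and `needs_lookup` are stand-ins for illustration only.

```python
import sqlite3


def stored_name(conn, user_id):
    """Return (found, name) for user_id in the hypothetical people table."""
    row = conn.execute(
        "SELECT name FROM people WHERE user_id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return (False, None)
    return (True, row[0])


def needs_lookup(conn, user_id):
    """True when the API should be queried for this person's name."""
    found, name = stored_name(conn, user_id)
    # Treat NULL and the literal string "None" as "no name stored yet".
    return (not found) or name in (None, "None")


# Assumed minimal schema, held in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (user_id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO people VALUES (?, ?)", ("jgbarah", None))

if needs_lookup(conn, "jgbarah"):
    # Here the real code would call the API; the name is hard-coded instead.
    conn.execute(
        "UPDATE people SET name = ? WHERE user_id = ?",
        ("Jesus M. Gonzalez-Barahona", "jgbarah"),
    )
```

With a check like this, the API call happens only for people who are missing from the table or whose stored name is empty.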