tenex / opensourcecontributors

Find all contributions for a user through the GitHub Archive

Provide data in json via very simple API, suitable for further analysis by the users #87

Open nealmcb opened 7 years ago

nealmcb commented 7 years ago

As noted in #5, #6, #83, #86, etc, users are often interested in further analysis of the results of their searches.

One simple way to help some of them would be for opensourcecontributors to offer a very simple API that returns the search results as JSON. Ideally the data would be structured so that users can easily run further queries and analysis with the GitHub API and their own language of choice. A few examples would go a long way.

This would probably also result in code that was useful for implementing the feature requests listed above.

hut8 commented 7 years ago

There already is a simple API, actually. The front end uses it directly. No authentication is necessary. Would you be interested in documenting it?

These are the only endpoints: https://github.com/tenex/opensourcecontributors/blob/master/ghc-app/controller.go#L34

nealmcb commented 7 years ago

Hmm. I tried all those endpoints, and the /user... ones return html, not json. E.g. https://opensourcecontributo.rs/user/nealmcb

The others were just 404 for me. Am I missing something?

joshjordan commented 7 years ago

Yeah, the missing piece of the puzzle is that nginx routes to the API only under the /api/ path. Check it out:

https://opensourcecontributo.rs/api/user/nealmcb
https://opensourcecontributo.rs/api/user/nealmcb/events
etc.
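A minimal sketch of that path mapping, in Python. The helper name api_url is hypothetical and not part of the project; it only prefixes a route with /api/ so the request reaches the API instead of the HTML front end. The base URL and the two routes are the ones given above.

```python
from urllib.parse import quote

BASE = "https://opensourcecontributo.rs"

def api_url(route):
    """Return the public URL for an API route such as '/user/nealmcb'."""
    # Only paths under /api/ are routed to the API, so add that prefix here.
    # quote() leaves '/' alone and escapes anything unsafe in the route.
    return BASE + "/api/" + quote(route.lstrip("/"))

print(api_url("/user/nealmcb"))
print(api_url("/user/nealmcb/events"))
```

The resulting URLs can be passed to urllib.request.urlopen or to any other HTTP client.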

nealmcb commented 7 years ago

Indeed - thank you!

I hope to find time to come back to provide proper documentation, but here is an API usage example, in Python, for how to retrieve all events for a user, and hints on sorting them out:

import json
import urllib.request
import codecs

def getevents(userid):
    """Retrieve and return all events for the given userid, page by page."""

    # The response body is bytes; wrap it so json.load reads UTF-8 text
    reader = codecs.getreader("utf-8")

    events = []
    pagenum = 1
    while True:
        url = "https://opensourcecontributo.rs/api/user/{}/events/{}".format(userid, pagenum)
        page = json.load(reader(urllib.request.urlopen(url)))
        # A page reporting size 0 marks the end of the results
        if page["size"] == 0:
            break
        events += page["events"]
        pagenum += 1

    return events

events = getevents("myuserid")  # put userid you want in here

events is now a list of event dicts, and each event's type field indicates whether it is an IssueCommentEvent, PushEvent, IssuesEvent, GollumEvent, CommitCommentEvent, etc.
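As a rough sketch of sorting the events out, the snippet below groups a list like the one returned by getevents by the type field and counts each kind. The helper name group_by_type and the sample events are made up for illustration; only the type field comes from the description above, and real events carry more fields.

```python
from collections import Counter, defaultdict

def group_by_type(events):
    """Group event dicts into lists keyed by their "type" field."""
    groups = defaultdict(list)
    for event in events:
        # .get() files an event without a type under "Unknown"
        # (assumed not to happen in practice)
        groups[event.get("type", "Unknown")].append(event)
    return dict(groups)

# Made-up sample data standing in for the result of getevents()
sample = [
    {"type": "PushEvent"},
    {"type": "IssuesEvent"},
    {"type": "PushEvent"},
]

groups = group_by_type(sample)
counts = Counter(event["type"] for event in sample)

# Print each event type with its count, most frequent first
for event_type, n in counts.most_common():
    print(event_type, n)
```

With real data, replace sample with the list returned by getevents; each group can then be inspected on its own, for example groups["PushEvent"].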

nealmcb commented 7 years ago

See also: