Populate `requirements.yml` from GitHub API

ReeceStevens commented 6 years ago

Use the GitHub API to construct the requirements.yml and problem-reports.yml files.

Both of these files should have the same format:

1: "description of the requirement or problem report in GitHub flavored markdown"
3: "description of the requirement or problem report in GitHub flavored markdown"
4: "description of the requirement or problem report in GitHub flavored markdown"
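As a rough illustration of the format above, here is a minimal sketch that renders a mapping of issue numbers to Markdown descriptions. The function name `render_issue_yaml` and the sample descriptions are hypothetical. It leans on the fact that a JSON string literal is also a valid YAML double-quoted scalar, so the standard library can handle the quoting.

```python
import json


def render_issue_yaml(descriptions):
    """Render {issue_number: markdown_text} as one `number: "text"` line each.

    Hypothetical helper, not part of rdm.
    """
    lines = []
    for number in sorted(descriptions):
        # json.dumps escapes quotes and newlines in a way YAML also accepts
        # inside double-quoted scalars.
        lines.append(f"{number}: {json.dumps(descriptions[number])}")
    return "\n".join(lines) + "\n"


# Made-up descriptions; keys need not be sequential.
example = render_issue_yaml({3: "Export *PDF* reports", 1: 'Show a "login" page'})
```

Sorting by key keeps the output stable between runs even when the numbers have gaps.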

We should generate both YAML files using a new Makefile rule and a new rdm subcommand.

This will require:

[ ] Developing a thin interface to the GitHub API (authentication, depagination, issues)
[ ] Processing API responses and producing the .yml file

We should use an existing GitHub API library, e.g. http://pygithub.readthedocs.io/en/latest/introduction.html
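The "processing API responses" step could look something like the sketch below. It assumes issues arrive as dictionaries shaped like the GitHub issues API JSON (`number`, `body`, `labels`, and a `pull_request` key on pull requests). The label names `requirement` and `problem-report`, and the choice of the issue body as the description, are assumptions made for illustration only.

```python
def split_issues(issues, requirement_label="requirement",
                 problem_label="problem-report"):
    """Split issue dicts into (requirements, problem_reports) keyed by number.

    The label names are hypothetical defaults, not an rdm convention.
    """
    requirements, problem_reports = {}, {}
    for issue in issues:
        # The issues endpoint also lists pull requests; those carry a
        # "pull_request" key and are skipped here.
        if "pull_request" in issue:
            continue
        names = {label["name"] for label in issue.get("labels", [])}
        body = issue.get("body") or ""  # body may be null for empty issues
        if requirement_label in names:
            requirements[issue["number"]] = body
        if problem_label in names:
            problem_reports[issue["number"]] = body
    return requirements, problem_reports
```

Each returned mapping can then be written out as its own .yml file.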

ReeceStevens commented 6 years ago

Relevant API documentation: https://developer.github.com/v3/issues/
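The issues endpoint is paginated, and GitHub advertises the next page in the `Link` response header with `rel="next"`. Below is a minimal sketch of depagination under that assumption. The `fetch_page` callable is a hypothetical stand-in for whatever HTTP client or library ends up being used; it is expected to return the decoded items plus the raw `Link` header (or `None`).

```python
import re

# Matches the URL of the rel="next" entry in a Link header.
_NEXT_LINK = re.compile(r'<([^>]+)>\s*;\s*rel="next"')


def next_page_url(link_header):
    """Return the rel="next" URL from a Link header, or None if absent."""
    if not link_header:
        return None
    match = _NEXT_LINK.search(link_header)
    return match.group(1) if match else None


def fetch_all(first_url, fetch_page):
    """Follow rel="next" links until exhausted and return all items.

    fetch_page(url) -> (items, link_header) is a hypothetical callable.
    """
    items = []
    url = first_url
    while url:
        page_items, link_header = fetch_page(url)
        items.extend(page_items)
        url = next_page_url(link_header)
    return items
```

A library such as PyGithub already handles this internally, so a loop like this would only be needed if the API were called directly.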

johndgiese commented 6 years ago

@ReeceStevens I added a few more details to the GitHub issue.

ReeceStevens commented 6 years ago

Some points of clarification as I work through this ticket:

1. Where do we want the user's GitHub information to be stored? init/data/system.yml? If that is the case, we cannot auto-populate requirements.yml and problem-reports.yml during the first call to rdm init. I don't think that's a big issue, but it's worth noting. We will also need the user to have a GitHub access token in an environment variable or some other secure location. I assume we could also have them store a limited-access token or an application-specific token in the system.yml file.
2. The index for the requirements is currently the order in which they are returned by the GitHub API. This seems brittle, so perhaps a better way to assign an "ID" to a requirement is to have a specific tag (similar to using the r-number) that we can identify as a requirement ID. What do you think about this approach?
3. I don't want to design for features that don't exist yet, but I was considering moving the GitHub-specific code into a backends directory. That way, depending on the project management software specified in system.yml, we can just import the correct backend and start syncing. It also emphasizes that the core functionality isn't tied to GitHub as a platform, which might be a nice thing to show off.
4. I am trying to come up with a term that encompasses both requirements and problem reports (for the rdm <subcommand>), and am struggling to find a good fit. The candidates so far:

project_metadata
project_docs
docs (the term seems overloaded in our context so I don't feel great about it)
requirements (doesn't cover problem reports but is the closest to concisely conveying the idea)

johndgiese commented 6 years ago

Great questions:

1. Ideally, it would be nice to piggyback on the user's SSH key or username/password when they run the rdm command. At some point in the future we would probably also need a way to set it using an environment variable, so that it could be run securely from a CI server, but let's wait until we need this functionality before we worry too much about it. If piggybacking off the user's GitHub credentials isn't possible, then let's jump on a call and brainstorm other ideas.
2. I was thinking we would use the GitHub issue numbers as our unique IDs, since we know these will be uniquely assigned when you create GitHub issues. What do you think?
3. I think for now let's just get it working with GitHub; it seems to me that, as long as the intermediate YAML format is general enough, we can figure out everything else later on down the road as it becomes necessary. I.e., it doesn't seem like splitting stuff up now will save us any effort in the long run, since we probably won't split things up properly anyway. For example, with our GitHub scheme, we are storing problem reports and change requests (and possibly requirements) in the same place, so we can grab all three objects from GitHub at once. With other project management tools, we may need to do things completely separately, so our Makefile recipes will probably be different for GitHub vs Pivotal vs Jira. I think we will probably need to set up the rdm init tool to generate different Makefiles for different setups. Switching from GitHub to Jira would then be doable, but not trivial. I think this is fine for now; I suspect switching between project managers will be a very uncommon task.
4. In light of this, perhaps we can make the rdm command completely GitHub-specific? E.g. we could have rdm github_issues, which would save requirements.yml and problem-reports.yml? Just an idea.
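To make the `rdm github_issues` idea concrete, here is a minimal `argparse` sketch of such a subcommand. Only the subcommand name comes from the discussion; the positional `repository` argument and the `--output-dir` option are hypothetical and not part of rdm.

```python
import argparse


def build_parser():
    """Build a parser with a GitHub-specific subcommand (illustrative only)."""
    parser = argparse.ArgumentParser(prog="rdm")
    subparsers = parser.add_subparsers(dest="command")

    gh = subparsers.add_parser(
        "github_issues",
        help="write requirements.yml and problem-reports.yml from GitHub issues",
    )
    # Hypothetical arguments, chosen only for illustration.
    gh.add_argument("repository", help="repository in owner/name form")
    gh.add_argument("--output-dir", default=".", help="where to write the .yml files")
    return parser


args = build_parser().parse_args(["github_issues", "innolitics/rdm"])
```

A Makefile rule could then invoke the subcommand and treat the two .yml files as its targets.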

I would say, overall, let's not worry much about making this generic for now.

ReeceStevens commented 6 years ago

Okay, sounds good. I will look into whether we can piggyback off the SSH key. I am not sure we can, but it's worth a shot. Worst case, setting up an access token for the GitHub API took me less than 5 minutes, so we could point people to the token generation instructions if no better way is immediately available.

GitHub issue numbers would probably work just fine as unique IDs. However, the YAML file may be a bit confusing, because the requirement IDs will just be numbers that may or may not be sequential. That's not a huge issue, and we can always change it later with relatively little friction if we decide we need something more.

And good call on sticking with GitHub for now. I like rdm github_issues as a subcommand.

ReeceStevens commented 6 years ago

After doing a bit more digging, I am pretty sure that we can't authenticate users for the GitHub API via SSH. You need an access token, or you pass in a username and password.

ReeceStevens commented 6 years ago

@johndgiese this is my solution to the login issue for now: 59397c0

It checks whether an API key is set and, if not, falls back to username and password.
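The fallback described above might be sketched roughly as follows. This is not the code from the referenced commit. The environment variable name `GH_API_TOKEN` and the returned dictionary shape are assumptions, and the prompt functions are injectable so the logic can be exercised without a terminal.

```python
import os
from getpass import getpass


def resolve_credentials(environ=os.environ, ask=input, ask_secret=getpass):
    """Prefer an API token from the environment; otherwise prompt the user.

    GH_API_TOKEN is a hypothetical variable name.
    """
    token = environ.get("GH_API_TOKEN")
    if token:
        return {"token": token}
    # No token set: fall back to interactive username/password entry.
    return {
        "username": ask("GitHub username: "),
        "password": ask_secret("GitHub password: "),
    }
```

GitHub has since removed password authentication from its REST API, so the second branch is shown only to mirror the behavior described in this thread.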
