Code-dot-mil / code.mil

An experiment in open source at the Department of Defense.
https://www.code.mil
MIT License

Refactor code.json file generation #232

Closed: jgarber623-gov closed this 6 years ago

jgarber623-gov commented 6 years ago

This PR is tangentially related to #231.

TL;DR: managing code.json just got a bit easier!

Details and whatnot…

This pull request breaks apart the increasingly large code.json file into smaller, release-specific files and adds a Jekyll plugin to build a valid code.json from these files. The release-specific files live in src/_releases/ and follow the pattern <repo domain>/<repo org/username>/<repo name>.json.

For example, the code.mil repo's JSON file is located at src/_releases/github.com/Code-dot-mil/code.mil.json. Capitalization isn't required, but proper casing is encouraged to match the organization name (or username) and repository name.

Adding a new project/release to the code.json inventory file is now a matter of creating the appropriate folder structure and JSON file in src/_releases, populated with the relevant release-specific values from version 2.0.0 of the code.json schema.
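To illustrate the "many small files in, one code.json out" idea, here is a minimal Ruby sketch of the assembly step. It is not the plugin from this PR: build_inventory is a hypothetical helper, the top-level keys (version, agency, releases), the repositoryURL key, and the "DOD" agency value are assumptions that should be checked against the 2.0.0 schema, and a real inventory would need whatever other fields the schema requires.

```ruby
require "json"

# Hypothetical helper: wrap individual release hashes in one inventory document.
# The top-level key names are assumptions, not taken from this PR.
def build_inventory(releases, agency:, version: "2.0.0")
  {
    "version" => version,
    "agency" => agency,
    # Sort by name (case-insensitive) so the generated file is stable between builds.
    "releases" => releases.sort_by { |release| release["name"].to_s.downcase }
  }
end

# Illustrative release data; a real file carries more fields than this.
release = {
  "name" => "code.mil",
  "repositoryURL" => "https://github.com/Code-dot-mil/code.mil"
}

puts JSON.pretty_generate(build_inventory([release], agency: "DOD"))
```

Sorting is a design choice of this sketch only; it keeps diffs of the generated file small when release files are added in arbitrary order.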

jordangov commented 6 years ago

I'm wavering on the file structure... seems heavy. I talked to @jgarber623-gov about using a flat-file structure, but then we have to get into file name parsing pretty heavily. I guess my question would be: if someone is not on GitHub, and their chosen platform doesn't do good URL path management, then what would we expect the outcome to be? That is... let's say I use ddsgit.com and my repo is at: https://www.ddsgit.com/orgs/team-america/jordan/awesome-project/view/index.html ... how would you expect that entry to look?

jgarber623-gov commented 6 years ago

@jordangov My goal for the folder structure, slightly heavy though it may be, is to use it as a base for automated generation of new release inventory files when we get further along with #231. (There's also, to my knowledge, no requirement in the schema that the name property be unique.)

So for example…

  1. User fills out web form with project details, including a unique repository URL,
  2. A future automated submission handling script strips that URL of the scheme (e.g. https://) and splits on slashes (/) to generate a unique path and file name,
  3. The new file is populated with the appropriate JSON from the form submission, and
  4. A PR is created on-the-fly.
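Step 2 above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: release_path_for is a hypothetical name, the src/_releases default mirrors the layout described in this PR, and no such submission script exists yet.

```ruby
require "uri"

# Hypothetical helper: map a repository URL to a release file path by
# dropping the scheme and splitting the remainder on slashes.
def release_path_for(repository_url, root: "src/_releases")
  uri = URI.parse(repository_url)
  segments = uri.path.to_s.split("/").reject(&:empty?)

  # A usable path needs a host plus at least one path segment.
  if uri.host.nil? || segments.empty?
    raise ArgumentError, "cannot derive a path from #{repository_url}"
  end

  File.join(root, uri.host, *segments) + ".json"
end

puts release_path_for("https://github.com/Code-dot-mil/code.mil")
# prints: src/_releases/github.com/Code-dot-mil/code.mil.json
```

Applied mechanically to the ddsgit.com URL from the question above, the same rule would yield src/_releases/www.ddsgit.com/orgs/team-america/jordan/awesome-project/view/index.html.json. That path is unique but deep, and whether it is the desired outcome is not settled in this thread.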

Barring any real-world examples to the contrary, using the repository URL is the easiest way to guarantee we don't create conflicts with similarly-named repositories.

The Jekyll Generator doesn't use the folder structure, either (JSON files are globbed with src/_releases/**/*.json). It's simply a convenience for us when adding new projects.
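A small Ruby sketch shows why the folder depth is irrelevant to a recursive glob. It is not the generator from this PR: load_releases is a hypothetical helper, and the demo files and their contents are made up for illustration.

```ruby
require "json"
require "tmpdir"
require "fileutils"

# Hypothetical helper: parse every .json file under root, at any nesting depth.
def load_releases(root)
  Dir.glob(File.join(root, "**", "*.json")).sort.map do |file|
    JSON.parse(File.read(file))
  end
end

# Demo with throwaway files: one nested three levels deep, one at the top level.
Dir.mktmpdir do |dir|
  nested = File.join(dir, "github.com", "Code-dot-mil")
  FileUtils.mkdir_p(nested)
  File.write(File.join(nested, "code.mil.json"), JSON.generate({ "name" => "code.mil" }))
  File.write(File.join(dir, "flat-example.json"), JSON.generate({ "name" => "flat-example" }))

  puts load_releases(dir).map { |release| release["name"] }.sort.inspect
  # prints: ["code.mil", "flat-example"]
end
```

Both files are picked up, so a flat layout and a nested layout would build the same inventory; the nesting only helps people (and a future submission script) find and name files.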

All that's to say: I think we're okay with this pattern for now and we should adjust down the line if the need arises.