mozilla-releng / firefox-infra-changelog

Automated tool which builds a changelog of commits happening on git and hg that could affect Firefox CI Infra.

[Core] Save commits based on repository local configuration file. #430

Closed danlabici closed 5 years ago

danlabici commented 5 years ago

Overview of problem:

Currently, FIC's logic is tied to one important file, repositories.json, which tells us:

  1. What repositories we need to track
  2. What files/folders we need to track (or ignore).
  3. What "type" of repository it is (so we know the logic to apply when filtering).

This file always needs to be up to date, and honestly it will be bothersome to always keep in mind: "Hey, did we add a new folder/file that will affect Infra? Okay, let's also track it in FIC."

Possible implementation:

I brainstormed the problem and figured that FIC could act more like "a service" that looks in each repository for a configuration file, just like TravisCI (.travis.yml), Taskcluster (.taskcluster.yml) and PyUp (.pyup.yml) currently do. FIC could look for a file named fic.yml (open to name suggestions) in each repository.

This opens up some amazing functionality that will automate FIC up to 100%! How so?

Here are 2 possible ways we could implement this:

1. fic.yml becomes a repo requirement:

This would be "the dream" where both Git and HG repos are required to have a fic.yml file, if that specific repository affects infrastructure.

We already have the information, so CiDuty could make the initial PRs containing the file + data, for all the repositories we currently track.

Once the file has landed, all we need to do (as an example) is take all repositories in a GitHub team (e.g. https://github.com/mozilla-releng), iterate through them, and only track the ones that have fic.yml. If the file is missing, we ignore the repo (possible reasons: it doesn't affect infra, or the repo is frozen/decommissioned).
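As a rough illustration of that discovery step, here is a minimal Python sketch. It assumes mozilla-releng is treated as a GitHub organization and uses two GitHub REST endpoints (`/orgs/{org}/repos` and `/repos/{owner}/{repo}/contents/{path}`). The function names and the `get_json` parameter are hypothetical and are not part of FIC. Unauthenticated requests are heavily rate-limited, so a real version would likely need a token.

```python
import json
import urllib.error
import urllib.request

API = "https://api.github.com"
CONFIG_NAME = "fic.yml"  # name proposed in this issue, still open for suggestions


def _get_json(url):
    """Fetch a URL from the GitHub REST API and decode the JSON body."""
    req = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def list_org_repos(org, get_json=_get_json):
    """Return the names of all repositories in an organization, page by page."""
    names, page = [], 1
    while True:
        batch = get_json(f"{API}/orgs/{org}/repos?per_page=100&page={page}")
        if not batch:
            return names
        names.extend(repo["name"] for repo in batch)
        page += 1


def has_config(org, repo, get_json=_get_json):
    """True if the repo has the config file at its root, False on a 404."""
    try:
        get_json(f"{API}/repos/{org}/{repo}/contents/{CONFIG_NAME}")
        return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


def repos_to_track(org, get_json=_get_json):
    """Keep only the repositories that opted in by adding the config file."""
    return [name for name in list_org_repos(org, get_json) if has_config(org, name, get_json)]
```

Passing `get_json` as a parameter is only a convenience so the logic can be exercised without network access.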

2. fic.yml doesn't become a requirement, but some repos will use it:

If fic.yml doesn't become a requirement, we can still use the repositories.json file that we currently have. But if fic.yml is present in a given repo (e.g. build-puppet), we will use the data from fic.yml instead of the "files/folders we care about" from repositories.json.
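A minimal sketch of that fallback, in Python. The layout of repositories.json is not shown in this issue, so both inputs are assumed to be plain dicts keyed by repository name, and `resolve_rules` is a hypothetical helper, not an existing FIC function.

```python
def resolve_rules(repo_name, central_rules, local_rules):
    """Pick the rules for one repository.

    central_rules: rules loaded from repositories.json (assumed shape: name -> rules).
    local_rules:   rules loaded from each repo's own fic.yml, for repos that have one.
    Returns a (source, rules) tuple so callers can see which file won.
    """
    if repo_name in local_rules:
        # The repository ships its own fic.yml, so it overrides the central file.
        return "fic.yml", local_rules[repo_name]
    if repo_name in central_rules:
        return "repositories.json", central_rules[repo_name]
    # Not known anywhere: nothing to track.
    return None, None
```

Returning the source alongside the rules makes it easy to log which repositories have already migrated to fic.yml.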

Benefits:

Possible structure of the yml file:

Track ALL files in this repository:

firefox-infra-changelog:
  track:
    - ALL

Track ALL files in a folder:

firefox-infra-changelog:
  track:
    - some/path/here/

Track only the following files:

firefox-infra-changelog:
  track:
    - some/file/in/a/dir/config.txt
    - secrets.py

Ignore ALL files, except:

firefox-infra-changelog:
  ignore:
    - ALL
  track:
    - some/file/in/a/dir/config.txt
    - some/path/here/

Possible combination of all the examples above:

firefox-infra-changelog:
  ignore:
    - README.md
    - travis.yml
    - taskcluster.yml
    - .git/
    - docs/
    - tests/
  track: 
    - some/file/in/a/dir/config.txt
    - some/path/here/
    - config.py
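To make the intended matching concrete, here is a small Python sketch of how the track/ignore lists could be evaluated against a changed file path. The proposal does not say which list wins when both match, so this sketch makes its own assumptions: entries ending in `/` match everything under that folder, other entries match one exact path, an explicit `ignore` entry beats `track`, and `ALL` under `ignore` only means "ignore by default". The rules are passed in as an already-parsed dict (the content under the `firefox-infra-changelog` key), and `is_tracked` is a hypothetical name.

```python
ALL = "ALL"


def matches(path, entry):
    """Check one file path against one track/ignore entry."""
    if entry == ALL:
        return True
    if entry.endswith("/"):
        # Folder entry: match any file below that folder.
        return path.startswith(entry)
    # File entry: exact match only.
    return path == entry


def is_tracked(path, rules):
    """Decide whether a changed file should appear in the changelog.

    Assumed precedence: explicit ignore entries win over track entries,
    and anything not matched by a track entry is not tracked.
    """
    explicit_ignores = [e for e in rules.get("ignore", []) if e != ALL]
    if any(matches(path, e) for e in explicit_ignores):
        return False
    return any(matches(path, e) for e in rules.get("track", []))
```

Under these assumptions, the "Ignore ALL files, except" form and a plain `track` list behave the same way. If the project prefers track entries to override ignore entries, only the order of the two checks in `is_tracked` changes.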

For this to move further, @klibby, @escapewindow, @davehouse, @JohanLorenzo, @lundjordan, what do you guys think? In my opinion, it will be far less "involved" to just drop a fic.yml file in a repository than to manually change FIC files, make a PR and land the changes.

klibby commented 5 years ago

My two cents: you won't be able to get away from either manually keeping repositories.json up to date or chasing after people to keep fic.yml up to date.

In both cases you have to "get the word out" that this exists, and once you've determined what repos should be tracked you still need to ensure that the file in question is maintained. If you assume that people will update fic.yml as required, things WILL get missed or forgotten and you'll still have to go back and chase people or update things yourself.

IMO, I'd look at just tracking repositories as a whole and not worrying about which files/folders to track, and trying to programmatically determine the repository type (or doing away with it; it's not clear from the json file what it's used for). It's probably worth separating out the Firefox trees from others, since those will be really noisy if you track all files, and it looks like they're all the same except for the repo name.