meltano / hub

The single source of truth for all Meltano plugins, including all available Singer Taps and Targets: https://hub.meltano.com
https://hub.meltano.com
50 stars 67 forks source link

Yaml key ordering rules in CI are pretty aggressive and not conducive to readability #1167

Open aaronsteers opened 1 year ago

aaronsteers commented 1 year ago

The yaml CI rules regarding key ordering enforce an alpha-sort, which (1) creates a lot of noise on PRs like this one, and (2) they don't help readability because keys like commands and env: float the top while name and variant are pushed to the middle and bottom.

Sample Lint Errors From: - https://github.com/meltano/hub/pull/1164 > ![image](https://user-images.githubusercontent.com/18150651/218587431-1d57f80a-70a9-4336-b626-06f7e62f41f2.png) > ![image](https://user-images.githubusercontent.com/18150651/218587681-fbfde0bf-9a7d-43cc-b869-59f8c550f0d5.png) https://github.com/meltano/hub/actions/runs/4157317609/jobs/7191733845

Proposal options

I'd propose we do one or more of these three options:

Option 1: Remove the sort requirement on human-contributed yaml files.

The point of using yaml was to make it easy and streamlined for anyone to contribute. We slow this process down by requiring an alpha sort of yaml keys.

For machine-compiled yaml artifacts and data resources, we can still enforce sort order - while relaxing this requirement for the contributed files and those being maintained 'by hand'.

Option 2: Create a CI-driven way to auto-format yaml files contained within the PR.

This wouldn't change anything about readability of alpha-sorting, but it would introduce a quick fix option for contributors and maintainers. This option would try to introduce a slash command like /pre-commit fix that users could type into the GitHub PR in order to auto-format the yaml files with whatever standard formatting is expected.

The challenge in implementing this (with pre-commit.ci or another CI-based commit flow) would be in getting it to play nice with branches on forks as well as with branches on the main repo.

Option 3: Create a natural sort of yaml keys that is more readable than alpha-sort.

This is potentially compatible with one or both of the above options. In this option, we make name and variant the first two keys, and we come up with a natural sort order that more closely respects the priority order in which keys are listed on our docs page: https://docs.meltano.com/reference/plugin-definition-syntax

This option pairs well with the CI-driven cleanup above (if a CI bot can also perform sorting in this way). It also could be implemented in combination of Option 1, as a 'default sort' that is not strictly enforced.

pnadolny13 commented 1 year ago

@aaronsteers I agree with this - it was intended to add consistency but I do see how its creating unneeded overhead for community contributors. I agree that alpha sort isn't the most natural (i.e. expecting name or variant up top) but I personally think no order was worse, at least with alpha sort I know where to scroll to when I'm looking for something especially in large files vs the random order we had before.

I'd vote to keep some sort of consistency so personally I dont love option 1 and option 3 sounds good but I dont know how we'd implement those rules. Alpha sort is a simple yamllint rule we enable right now. Option 2 sounds like an easy first step! We also get a lot of formatting errors related to src files that yarn format:write and yarn lint:fix resolve for you, so those could be helpful in a slash command too.

Tangent alert: I think if we invest in tools to make adding/updating hub definitions easier this might not matter anymore because all yaml files become machine compiled. Like when I add plugins to the hub I use my hub-utils script (thats not clean enough to share yet) but it auto sorts for me so I never have to deal with this. Its hard for a community member to manually build these files based on PR/issue templates with no tools, let alone comply with these strict linting rule 😅 .

tayloramurphy commented 1 year ago

@pnadolny13 @aaronsteers It's arguable if alpha sort is really harder to read. I've been quite happy with it for the past few months. Let's stay with the alpha sort and implement some amount of incremental automation with it. We had a "natural" sort previously and it's too easy to forget items if there's some automated checks.

We could also just update an issue template or even put in the docs an "empty" YAML file with all of the possible keys. Folks can copy and paste that as a good starting point.

aaronsteers commented 1 year ago

@tayloramurphy - I think a good focusing question here is whether or not there's an available CLI-based, linter-based, or CI-based flow that we can apply (or recommend the contributor apply) to quickly repair PRs like this one:

Do you know if the linter can auto-format with the given rule?

If not, I still would prefer to remove the CI rule until we have the auto-fix/auto-format capability.

pnadolny13 commented 1 year ago

@aaronsteers theres this script in the repo that updates the files to comply with our yamlint settings. I put a little section in the contributing guide. Yaml lint doesnt have a fix command so I had to write a script manually, which isnt ideal. Changes to our rules will require updates to the script but its worked pretty well for the last few months and I dont see us changing our rules very often.

I think it would be pretty straightforward to run that script as part of a slash command.

aaronsteers commented 1 year ago

I think it would be pretty straightforward to run that script as part of a slash command.

I'm in favor of this approach, but I don't know if that works in forks, which is the primary use case here for community contributions.

aaronsteers commented 1 year ago

Worth noting that the PR I called attention to above wasn't able to be resolved by the contributor, and it was merged without the test passing. This broke CI on main:

Unfortunately, GitHub doesn't have the simple option to require "all" CI tests to succeed. I can name this one as required prior to merge, and I'll do so now.