errata-ai / vale

:pencil: A markup-aware linter for prose built with speed and extensibility in mind.
https://vale.sh
MIT License
4.3k stars 144 forks source link

FEATURE: Make Vale easier to be used in a CI/CD environment #560

Closed Jarmos-san closed 1 year ago

Jarmos-san commented 1 year ago

What's the Issue?

We started using Vale in a pre-commit hook for writing the Starlite thanks to #558. But the hook fails to work as we expect it to in our CI. This happens because the hook's environment is missing the "style" documents that Vale expects.

About the Feature to Resolve Said Issue

We are wondering if its possible to add a --sync (or perhaps a --ci?) flag to the vale command. The idea is to download the necessary style files each time the pre-commit environment is initialised in the CI?

In other words, the said feature would enable us to run:

pre-commit run vale --all-files

...and it should:

  1. Download the styles
  2. Then run the linter on the necessary files.

Possible Workarounds

  1. Pass the sync command as an argument to the hook. But then vale wouldn't lint the necessary files.
  2. Create a step in our CI/CD configuration to download the styles by running vale sync before the pre-commit hooks are invoked. But it introduces redundancy & other inconsistencies.
jdkato commented 1 year ago

I think a more reasonable solution here might be to switch to language: script and perform the set up in a bash script. All you really need is vale sync && vale ..., right? I imagine this would be noticeably better performance-wise than running go install ./... every time, too.

You could probably pretty easily do this in your own repo with repo: local.

Jarmos-san commented 1 year ago

Well sure that's one workaround but it would be better to have a more fool-proof native solution for this concern.

As for the performance concerns while running go install ... shouldn't be a problem since the CI environment is supposed to be clean & its environment built up from scratch each time.

AleksaC commented 1 year ago

As a contributor of the pre-commit hooks for this project I thought I'd chime in here.

Go install shouldn't be a problem for exactly the opposite reason @Jarmos-san mentioned. pre-commit caches it's environment by default and works best if you set up a cache in CI as well.

As for your problem @Jarmos-san there are many ways to solve it by adding a sync step before running pre-commit. You can use something like this:

repos:
  - repo: https://github.com/errata-ai/vale
    rev: 16d3a7f
    hooks:
      - id: vale
        name: vale sync
        pass_filenames: false
        args: [sync]
      - id: vale

However the way I'd solve this (and probably will if I adopt any of the packages in my project) is by adding the package to the version control. I don't know how syncing in vale works but if there isn't a way to pin versions of packages you could get a different version of a package for your runs which could break the workflow (it's probably unlikely to happen in practice but could still happen). Another potential problem is that users could have a different version of packages locally leading to inconsistencies with CI. From what I've seen Google package is ~140Kb of mostly yaml files which IMO is fine to store in the repo.

Jarmos-san commented 1 year ago

@AleksaC great! Thanks for the suggestions, it works as we expect it to (on our local dev environments). But it fails in CI because rst2html doesn't exists in that environment.

AleksaC commented 1 year ago

From what I can see in the docs seems like you need to install either sphinx or docutils globally.

jdkato commented 1 year ago

However the way I'd solve this (and probably will if I adopt any of the packages in my project) is by adding the package to the version control. I don't know how syncing in vale works but if there isn't a way to pin versions of packages you could get a different version of a package for your runs which could break the workflow (it's probably unlikely to happen in practice but could still happen). Another potential problem is that users could have a different version of packages locally leading to inconsistencies with CI. From what I've seen Google package is ~140Kb of mostly yaml files which IMO is fine to store in the repo.

You can choose to pin to a specific release version or use the latest which (in theory) only changes with the upstream style guide you've chosen to follow.

In any case, I think your example of running a sync step is the solution here. Here's what I'd use:

repos:
  - repo: https://github.com/errata-ai/vale
    rev: 16d3a7f
    hooks:
      - id: vale
        name: vale sync
        pass_filenames: false
        args: [sync]
      - id: vale
        args: [--output=line, --minAlertLevel=error]