nielsbasjes / codeowners

A library to use and verify the CODEOWNERS (both the Github and Gitlab variants) and .gitignore files
Apache License 2.0

Consider exposing a CLI #124

Open vincentjames501 opened 4 months ago

vincentjames501 commented 4 months ago

We are planning on using this library for some CODEOWNER approval validation. More specifically, we'd like to do what other CLIs do such as https://github.com/hmarr/codeowners and be able to show unowned files (though we need GitLab syntax support).

This is one of the few parsers that works with GitLab's syntax which is awesome!

I was reading a few articles that got me thinking this would be super awesome to expose as a CLI tool:

https://engineering.peerislands.io/how-to-build-native-cli-apps-using-java-maven-graalvm-picocli-jreleaser-and-github-actions-1407693d99ff

Example repo: https://github.com/rrajesh1979/ref-java-jwt

Basically, we could create a Maven module here that uses Picocli, and then use JReleaser to build a GraalVM native image and publish it to a nielsbasjes/tap Homebrew tap.

nielsbasjes commented 4 months ago

The current way I do this exact same thing is by putting it as an enforcer rule in my pom.xml during the build (see documentation). That gives me the list of unowned files and fails the build if there are any files that are not owned.

What would a separate cli edition help with?

vincentjames501 commented 4 months ago

@nielsbasjes, I think if you're in the Maven ecosystem I mostly agree with you. We mostly use Clojure though, and leverage things like https://pre-commit.com/ for git hooks as well as CI linting rules.

We generally want this to be super fast on folks' machines so it doesn't interrupt their workflow. Additionally, I think by providing a CLI we could allow completely different ecosystems (think Python, Ruby, PHP, etc.) to leverage this as well, since a GraalVM native binary needs no JVM on the user's machine and is super fast (see below). It's sort of surprising how few solutions to this are out there, so this lib is great!

I started playing with this last night (it's not done):

https://github.com/vincentjames501/codeowners-cli

To install:

brew tap vincentjames501/tap
brew install codeowners-cli

Usage:

$ codeowners-cli help

Usage: codeowners-cli [-hV] [COMMAND]
Process CODEOWNER files
  -h, --help      Show this help message and exit.
  -V, --version   Print version information and exit.
Commands:
  help    Display help information about the specified command.
  list    Lists all files with the corresponding approvers
  verify  Verifies the format of the CODEOWNERS file

Graal makes it fast:

$ time codeowners-cli
Usage: codeowners-cli [-hV] [COMMAND]
Process CODEOWNER files
  -h, --help      Show this help message and exit.
  -V, --version   Print version information and exit.
Commands:
  help    Display help information about the specified command.
  list    Lists all files with the corresponding approvers
  verify  Verifies the format of the CODEOWNERS file

real    0m0.011s
user    0m0.003s
sys     0m0.004s

More usage examples:

$ codeowners-cli list help

Usage: codeowners-cli list [-fu] [-idl] [-ngi] [-cf=<codeownersFile>]
                           [-gi=<gitignoreFile>] [-o=<owners>]... [<files>...]
                           [COMMAND]
Lists all files with the corresponding approvers
      [<files>...]           Specifies the files to scan
      -cf, --codeowners-file=<codeownersFile>
                             Specify the path to the CODEOWNERS file.
  -f, --fail-on-output       Whether to exit non-zero if there are any matches.
      -gi, --gitignore-file=<gitignoreFile>
                             Specify the path to the .gitignore file.
      -idl, --ignore-dot-files
                             Whether to ignore the dot files.
      -ngi, --no-gitignore   Whether to ignore the .gitignore file.
  -o, --owners=<owners>      Filters the results by owner
  -u, --unowned-files        Whether to only show unowned files (can be
                               combined with -o).
Commands:
  help  Display help information about the specified command.

(This is formatted more nicely in the terminal).

$ codeowners-cli list

                                                                       File |               Approvers
                                                       ./CODE_OF_CONDUCT.md |           @default-team
                                                          ./CONTRIBUTING.md |           @default-team
                                                                  ./LICENSE |           @default-team
                                                                ./README.md |           @default-team
                                               ./dependency-reduced-pom.xml |           @default-team
                                         ./etc/eclipse-formatter-config.xml |           @default-team
                                                          ./etc/license.txt |           @default-team
                                                            ./jreleaser.yml |           @default-team
                                                                     ./mvnw |           @default-team
                                                                 ./mvnw.cmd |           @default-team
                                                                  ./pom.xml |           @default-team
                                           ./src/main/assembly/assembly.xml |    @devs, @default-team
          ./src/main/java/org/vincentjames501/codeowners/CodeOwnersCLI.java |    @devs, @default-team
./src/main/java/org/vincentjames501/codeowners/commands/ListCodeOwners.java |    @devs, @default-team
        ./src/main/java/org/vincentjames501/codeowners/commands/Verify.java |    @devs, @default-team
         ./src/main/resources/META-INF/native-image/native-image.properties |    @devs, @default-team
             ./src/main/resources/META-INF/native-image/reflect-config.json |    @devs, @default-team
      ./src/test/java/org/vincentjames501/codeowners/CodeOwnersCLITest.java | @testers, @default-team
                                            ./src/test/resources/CODEOWNERS | @testers, @default-team

@nielsbasjes , if you buy into this, it'd be great to merge into your project here. If not, I'll just maintain this separate project for now. Thanks for removing commons-io! It should make our binary a bit smaller!

vincentjames501 commented 4 months ago

Also, @nielsbasjes, I'm an idiot: I completely missed the code in the CodeOwners Enforcer rule :)

vincentjames501 commented 4 months ago

@nielsbasjes, ok. More work over there is done. You can pull version 0.0.3 for a demo; it even shows the tool running as a pre-commit hook, so it can be used with any technology (such as Python projects).

Again, just let me know if this is something you want to make first class and I can archive my repo.

nielsbasjes commented 4 months ago

I'm going to check this out because I like this idea.

There are some things I haven't touched before (like the automated releasing and publishing of binaries), and some things I possibly want done differently (like the calculation of the next version, for which I normally use https://github.com/nielsbasjes/conventional-commits-maven-release).

I'm going to try things out to learn and understand this before I take it in.

nielsbasjes commented 4 months ago

Question: Seems like quite a few of the files (like .github/workflows/release.yml) have been generated or come from a template or different project. I see very very specific things like git config --global user.email "41898282+github-actions[bot]@users.noreply.github.com"

Where did you get these from?

vincentjames501 commented 4 months ago

@nielsbasjes , I got them from the project referenced in this article:

https://engineering.peerislands.io/how-to-build-native-cli-apps-using-java-maven-graalvm-picocli-jreleaser-and-github-actions-1407693d99ff

I noticed that some things were out of date and copied a few more things from https://github.com/kcctl/kcctl that the project referenced.

I admit I'm not too sure about 41898282+github-actions[bot]@users.noreply.github.com but I do see it mentioned in several "official" places: https://github.com/actions/checkout/pull/1184

One more thing you may want to peek at. When trying the GraalVM executables, I noticed that on some of our larger Java projects, and especially on large webapps/node projects, the tool was taking forever. It turns out the cause is Files.find: it naively walks every folder, even gitignored ones, so when it hits a massive folder like node_modules/ or a large target/ directory it can take forever. Just to test some things out, I found I could speed things up by orders of magnitude by using Files.walkFileTree(path, new FileVisitor<>() {...}) instead (see https://github.com/vincentjames501/codeowners-cli/blob/main/src/main/java/org/vincentjames501/codeowners/commands/ListCodeOwners.java#L82), as this gives the ability to skip traversing ignored directories entirely. I don't like all this mutability, and I'm sure it could be rewritten as some recursive/lazy stream, but I wanted to get something working first.
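For anyone following along, here is a minimal sketch of the pruning idea described above. It is not the code from the linked ListCodeOwners.java; the class name PruningWalk and the isIgnored predicate are hypothetical stand-ins for whatever .gitignore matcher is actually used. The point is only that returning SKIP_SUBTREE from preVisitDirectory means an ignored directory such as node_modules/ is never descended into.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper, for illustration only.
final class PruningWalk {

    /**
     * Collects the regular files under root, without ever descending into
     * a directory for which isIgnored returns true.
     * isIgnored is an assumed stand-in for a real .gitignore matcher.
     */
    static List<Path> listFiles(Path root, Predicate<Path> isIgnored) throws IOException {
        List<Path> result = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                // Prune the whole subtree instead of visiting and filtering every file in it.
                if (!dir.equals(root) && isIgnored.test(dir)) {
                    return FileVisitResult.SKIP_SUBTREE;
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                // Individual files can be ignored too, so check them as well.
                if (attrs.isRegularFile() && !isIgnored.test(file)) {
                    result.add(file);
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return result;
    }
}
```

Calling PruningWalk.listFiles(root, p -> p.getFileName().toString().equals("node_modules")) would list everything except the contents of any node_modules directory. The result list is still mutable state filled in by the visitor, which is the same trade-off mentioned above.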

Additionally, I noticed that you were actually finding all .gitignore files (which is correct; my original was wrong). Since I wanted to be able to use this tool to analyze one or more individual files (i.e. for use with pre-commit), it was FAR too costly to recursively scan a directory for all of its .gitignore files. It turns out git already caches/knows about the .gitignore files, so I'm shelling out to git to find them all before we start doing anything. That way, if I run something like codeowners-cli src/main/foo.java, it gets the .gitignore files from git (super fast) and then literally just checks this one file, whereas the current Files.find approach builds a stream of every file in the directory.
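A rough sketch of what shelling out to git for this could look like. The comment above does not say which git command is used, so the git ls-files invocation below is an assumption, as are the class and method names. It asks git for tracked and untracked-but-not-ignored files matching .gitignore at any depth, using NUL-separated output (-z) so unusual file names are not quoted.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class GitIgnoreLocator {

    /** Asks git (assumed to be on the PATH) for all .gitignore files in the repository at repoDir. */
    static List<Path> findGitIgnoreFiles(Path repoDir) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(
                "git", "ls-files", "--cached", "--others", "--exclude-standard", "-z",
                "--", ".gitignore", "*/.gitignore")
                .directory(repoDir.toFile())
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start();
        // Read all of stdout before waiting, so the child cannot block on a full pipe.
        String output = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        if (process.waitFor() != 0) {
            throw new IOException("git ls-files failed in " + repoDir);
        }
        return parseNulSeparated(repoDir, output);
    }

    /** Splits NUL-separated git output into paths resolved against repoDir. */
    static List<Path> parseNulSeparated(Path repoDir, String output) {
        List<Path> paths = new ArrayList<>();
        for (String entry : output.split("\0")) {
            if (!entry.isEmpty()) {
                paths.add(repoDir.resolve(entry));
            }
        }
        return paths;
    }
}
```

With the list of .gitignore files in hand, checking a single path only requires loading those rules, not walking the working tree.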

nielsbasjes commented 2 months ago

First of all I have not forgotten about your proposal.

In general I like the idea of a commandline tool.

There are however several things needed to do this that I currently do not yet have any experience with. My current knowledge level of brew and doing binary releases is not yet at the level I'm willing to release code with. I also do not have any Mac system so I won't even be able to test everything.

In addition to that I already have too many things I want to spend time on.

It is totally fine if you do all of this and release a commandline tool in the way you have done using my library.

I have put your remarks about searching through many files in a separate issue: https://github.com/nielsbasjes/codeowners/issues/148

nielsbasjes commented 1 month ago

I have created a utility function that walks through the tree of files and ONLY returns the ones that are not ignored by any .gitignore rule. This is not a lazy stream but simply a smart scan that gives a list with all of the files. It is in version 1.9.0, which I just released.