jorisroovers / gitlint

Linting for your git commit messages
http://jorisroovers.github.io/gitlint
MIT License
806 stars 99 forks source link

Parsing git message trailers #377

Open jorisroovers opened 1 year ago

jorisroovers commented 1 year ago

Talking about Co-Authored-By:, it could be useful for M1: author-valid-email to also check those trailers and the committer? You can extract these via something like git show --no-patch --pretty='format:%(trailers:key=Co-Authored-By,separator=%x0A)%x0A%an <%ae>%x0A%cn <%ce>' HEAD.

Originally posted by @webknjaz in https://github.com/jorisroovers/gitlint/issues/373#issuecomment-1322153613

jorisroovers commented 1 year ago

Currently we only have 1 meta rule (denoted by rule ids starting withM) which is M1:author-valid-email, the idea has always been to add more meta rules at some point, most obviously M2: author-valid-name. We can create a separate issue for this, happy to take a PR for it as well :-)

If I understand correctly, @webknjaz I think what you’re proposing here is different though. I believe you’re suggesting has 2 distinct parts:

  1. Parse out the commit message trailers and exposes those in the commit object so they can be used by rules (including user defined rules).
  2. Add a new meta rule that allows for validating those trailers, something like:

    [trailer-match-regex]
    key=Co-Authored-By
    regex=^Dependabot
    
    # Named Rules can be used to have multiple of these trailer-match-regex rules
    [trailer-match-regex:SignedOffBy]
    key=Signed-off-by
    regex=^Dependabot
    
    # Perhaps we need an extra option 'trailer-required' to indicate the behavior when the trailer is not present
    [trailer-match-regex:SignedOffByRequired]
    trailer-required=true # hard fail when no Signed-off-by trailer is present
    trailer-required=false # skip rule if no Signed-off-by trailer is present
    key=Signed-off-by
    regex=^Dependabot
webknjaz commented 1 year ago

Sounds about right, except I was specifically thinking about the metadata that can be interpreted as contributors. For example, GitHub and other Git hostings show everybody listed in the built-in Author and Committer fields, along with all the entries of Co-Authored-By as authors of that commit. So there's basically 3 sources of authorship information. I think it only makes sense to validate all of them in the existing rule, instead of just one field — any of them can be broken, not just one field only.

Note that there may be an unlimited amount of Co-Authored-By entries in a single commit.

jorisroovers commented 1 year ago

Interesting use-case, I can definitely see the usefulness of this. I'm not sure whether it would be better to implement this as a new rule though, rather than overloading the existing one with a new meaning/behavior.

In order for this to happen, gitlint would also need capture the Committer information, only the Author information (as it does today).

To summarize for when we'd get around to implement this, there's a bunch of work suggested here:

  1. Extend commit object: Parse committer info (%cN and %cE) in addition to the existing author information
  2. Extend commit object: Parse trailers
  3. Implement a meta rule for validating trailers
  4. Implement the author-valid-name rule
  5. Implement new authors-all-valid-email, authors-all-valid-name rules (rule name subject to change)

I think it makes sense to split these out into separate issues once we start work on this.