jorisroovers / gitlint

Linting for your git commit messages
http://jorisroovers.github.io/gitlint
MIT License
803 stars 99 forks source link

Support for per-user configuration files and configuration inheritance #422

Open gschulze opened 1 year ago

gschulze commented 1 year ago

It would be nice to have a per-user configuration file (e.g. ~/.gitlint) to have a way to enforce specific rules across all projects. The per-git-repository configuration should extend/override the settings of the per-user-configuration file. For instance, one might want to enforce rules for title length across all repositories, but might want to add additional rules on a per-repository basis.

Ideally, this could be done in a similar fashion to npm: https://docs.npmjs.com/cli/v8/configuring-npm/npmrc

Would be willing to contribute.

sigmavirus24 commented 1 year ago

@jorisroovers you're the final decider here but my experience is that it is horribly confusing to have user config extend/override/blend with project configurations. If a user config is desirable to you (I don't think the value is there for adding and maintaining it), make it the fallback that's used if and only if the project config is missing

gschulze commented 1 year ago

To give a bit of context, we are currently in the process of selecting and configuring various tools to enforce consistent coding conventions across our company. Some tools have the option to configure things globally, some require to have their configuration files included in the repository. However, this makes it especially hard if you want to encourage adoption of new conventions, as it requires all configuration files to be updated in every repository. The feature I am proposing can be seen in other tools within the same domain as well:

In the case of gitlint, I would like to be able to enable conventional commits for all repositories, and specify the valid types (e.g. feat, fix, ci, ...) globally. However, the acceptable scopes (e.g. feat(frontend), fix(backend)) are specific to a particular repository, and should be configured there.

Please take this as a suggestion only, if you feel this behavior doesn't fit the project I'm perfectly fine with that. However I'd be happy to help if you consider this a useful feature.

jorisroovers commented 1 year ago

The request in #395 (adding --extend-extra-path) is similar in nature.

I understand the usefulness of this (using it myself frequently). I’d need to think a bit more on what @sigmavirus24 said - I agree it adds more complexity for both users and gitlint maintenance. Arguably, there’s already quite a bit of complexity with gitlint’s existing configuration precedence: https://jorisroovers.com/gitlint/configuration/#configuration-precedence

If we were to implement this, I’d definitely want to figure out the code design up front to ensure we do it right.

Long story short, I’m inclined to do this, but would like to punt on this until I have more time so I can be involved in the actual implementation.

sigmavirus24 commented 1 year ago

So for what it's worth, Flake8 had user config. It's not dissimilar on it's behavior to gitlint. The user file had lowest precedence. The project config. Then CLI options. The problem we ran into was users expecting to see errors reported but not seeing that and then seeing them elsewhere (e.g., CI). The problem was always the user config file. It caused so much trouble for users that we accepted the pain of removing it. We will never add it back. It's not worth the headache.

Like I said, the only thing that could have made that marginally less confusing would have been to ignore the user file if a project file was found.

jorisroovers commented 1 year ago

So for what it's worth, Flake8 had user config. It's not dissimilar on it's behavior to gitlint. The user file had lowest precedence. The project config. Then CLI options. The problem we ran into was users expecting to see errors reported but not seeing that and then seeing them elsewhere (e.g., CI). The problem was always the user config file. It caused so much trouble for users that we accepted the pain of removing it. We will never add it back. It's not worth the headache.

Ok, that’s definitely an interesting datapoint @sigmavirus24 - I think I’m undecided now.


Thinking out loud (and documenting other solutions).

Git solves this using [include] sections.

[include]
    path = subdir/gitconfig

Note that its include logic is reversed in that it won’t read repository level .gitconfig files by default, only the user’s ~/.gitconfig and then the user has to include repo-level .gitconfig.

Applied for gitlint, the repo-level .gitlint could include user-level config:


# This is just regular gitlint [general] section, nothing special here
[general]
ignore=body-is-missing,T3

# This is the relevant section #############
# Include user config, overrides anything that came before it
[include]
    path = ~/.gitlint

# Includes could be named to allow multiple of them
[include:myinclude]
    path = ~/.gitlint2

# Includes could point to a local .gitlint-extra that is gitignored and/or symlinked by the user to their own .gitlint config
[include:extras]
    path = .gitlint-extra

# One upside is that this could allow the user to specify whether to override of extend 
[include:myextends]
    path = .gitlint-extends
    mode = extend # default = override

############################################

# Rest of gitlint file resumes here, overrides any includes before it

[title-max-length]
line-length=80

Of course, there’s a few major downsides/shortcomings:

  1. The repo-level .gitlint is now dictating where the user-level .gitlint is stored. Symlinks can bring some alleviation, but not great.
  2. The user is dependent on the repo owner to add this section. Can potentially be solved by reading GITLINT_INCLUDE_PATH and GITLINT_INCLUDE_MODE env vars that serves the same purpose.
  3. Different repos might specify conflicting user-level paths.

From an implementation perspective this also is also non-trivial (not hard, just some work) as this is not something configparser supports out of the box. We’d also have to ensure precedence is correct (includes overwrite things before it, not after it).

While I can see the usefulness of includes in general (this would also solve #395), I’m not sure this is the path forward (it also doesn’t fully solve the original question). I’m interested to get your feedback.

The problem we ran into was users expecting to see errors reported but not seeing that and then seeing them elsewhere (e.g., CI)

And of course, this problem always remains to some extend. I do think the explicit use of [include] might be better than doing so implicitly.

gschulze commented 1 year ago

Hmm, using [include] to reference files outside the repository doesn't seem right to me. What about just having a flag root = [true|false] like in .editorconfig controlling whether the configuration file is the outermost one? This way you could just traverse up the parent directory hierarchy until a file with root = true is found and stop there, or have root = true in the file residing in the repository itself to have a completely isolated configuration.

sigmavirus24 commented 1 year ago

Hmm, using [include] to reference files outside the repository doesn't seem right to me. What about just having a flag root = [true|false] like in .editorconfig controlling whether the configuration file is the outermost one? This way you could just traverse up the parent directory hierarchy until a file with root = true is found and stop there, or have root = true in the file residing in the repository itself to have a completely isolated configuration.

So traverse to endlessly even to /? That kind of path traversal is a classic CVE waiting to be exploited. Add in a custom user level rule and I'm sure you could use that to exploit many things since custom rules are just python it could do anything.

Besides the horrible security posture, what happens then with submodules. If I'm in a submodule with a current gitlint config ostensibly root would default to false. That would apply the submodule config on top of the parent repo config. That would be confusing and surprising.

gschulze commented 1 year ago

@sigmavirus24 I am not sure I understand your line of reasoning regarding the attack vector. This would imply my system had been severely compromised, since normally I am in control of which files I put in what location. In that case I'd assume there are plenty of other, more straightforward ways of compromising the contents of locally checked-out git repositories. However, I am not a security researcher and hence do not claim to have any expertise in that domain.

sigmavirus24 commented 1 year ago

I am not sure I understand your line of reasoning regarding the attack vector. This would imply my system had been severely compromised, since normally I am in control of which files I put in what location.

You don't install plugins for tools you use without reading in detail the source code? Do you read the full source every time you upgrade?

There have been numerous documented issues where someone has taken over a maintenance of a package, bought the package name, or hijacked it and uploaded malicious code. If you use one, or you contribute to another project using one, you're opening your system up to arbitrary traversals as soon as that malicious version reaches you.

Just because you think you know what you're using, doesn't mean you always will. You need to program defensively at every layer, that means tools like gitlint need to consider these scenarios and users should as well