numtide / treefmt

one CLI to format your repo
https://treefmt.com
MIT License
534 stars 32 forks source link

Excluded files reported as unmatched #317

Open jaen opened 3 weeks ago

jaen commented 3 weeks ago

Describe the bug

When a file is explicitly excluded from linting, be it with a global or formatter-specific exclude, it gets reported when running treefmt with a no formatter for path. While it's certainly useful to know if you didn't cover some file by accident, you also might not want to format certain file (say, patches) and getting reminded about them in the output is superfluous noise.

To Reproduce

Steps to reproduce the behavior:

  1. set up formatting,
  2. exclude a file,
  3. observe WARN format: no formatter for path: some/excluded/file in the output.

Expected behavior

I would expect not being informed about files I don't plan to format. While it might make sense to warn by default for files that fall through the includes/excludes filter — say, you format/lint different files of the same type differently for whatever reason (I dunno, typo checking for different languages in your md-based docs), but you don't want any to be uncovered by accident — it also makes sense to never want to format certain files and not get nagged about them.

I'm not sure what's the best solution here — maybe don't want about excludes coming from global.excludes or maybe have an global.ignores config value — but anything would be welcome.

System information

I used nixpkgs-unstable#treefmt2 which at this point in time points to 2.0.0-rc4.

brianmcgee commented 3 weeks ago

@jaen you can control the log level:

  -u, --on-unmatched=warn            Log paths that did not match any formatters at the specified log level, with fatal exiting the process with an error. Possible values are <debug|info|warn|error|fatal>.

I've set the default to WARN as I think it's useful. It should only output when the cache is cold. Once a file has been evaluated it shouldn't output again unless the formatters change or the config changes.

You can demote it to info level with -u info so you'll only see it with -v or debug with -u debug to only see it with -vv.

jaen commented 3 weeks ago

@brianmcgee I don't disagree that it can be useful, it's just that you might really want to ignore some files and then it becomes (at least lines relating to those files) noise.

I have noticed the --on-unmatched=level parameter, but it doesn't quite solve the issue IMO — you might still want to get informed about files you have not covered by accident, while not getting noise about files you know will never be linted. So this option is not granular enough.

I think cache also doesn't quite solve the issue, unless you are expected to actually store the cache between CI runs? Because unless you do, the cache will be cold all the time in CI and you'll get the noise for those files all the time.

Also, regarding CI I'd expect you might want CI to fail if some file was not covered with formatting/linting by accident, so you'd want to set --on-unmatched=fail. But if you can't ignore certain files from formatting altogether, you won't be able to do that as even excluded files will make treefmt fail unless you start resorting to unsavory things such as using true as a formatter for files that should be always ignored.

Does it make any sense to you?

brianmcgee commented 3 weeks ago

I just re-read your description above, and I realise I completely missed the first point:

When a file is explicitly excluded from linting, be it with a global or formatter-specific exclude, it gets reported when running treefmt with a no formatter for path

If it has been excluded, then it should not report it, but we don't properly account for this.

jaen commented 3 weeks ago

Right, but writing my response above I realised it might not be as simple as just "if it has been excluded, then it should not report it". For example see this somewhat contrived example:

[formatter]
[formatter.language-lint-en]
command  = "language-lint"
excludes = [ "docs/**" ]
includes = [ "**/*" ]
options  = [ "--language", "en" ]

[formatter.language-lint-docs-en]
command  = "language-lint"
excludes = [ "docs/en/**" ]
includes = [ "*.md" ]
options  = [ "--language", "en" ]

[formatter.language-lint-docs-pl]
command  = "language-lint"
excludes = [ "docs/pl/**" ]
includes = [ "*.md" ]
options  = [ "--language", "pl" ]

where you want to lint every file in your project in English (let's assume the tool is smart enough to lint code comments or whatever), but for docs you want to of course lint them in the language they were written in.

If we just not report excluded files then — in case of this config — you won't get warned if you added German docs, but forgot to set up language linting for this.

That's why I think it must be a bit more complex than just ignoring excludes. Two solutions (as mentioned in the OP) that come to my mind is either global excludes being special or ignoring files being separate from includes/excludes and they should only decide which formatter (if any) applies.

I'm not sure that global excludes being special would work well, as then you might start running into specificity issues — say, you exclude something, but you want to lint some subdirectory of that, except this one specific file; if you add another new file should you get get a warning (because you included this subdirectory) or not (because it's in global excludes).

As such, separating ignoring from inclusion/exclusions seems like a better option, but maybe I'm missing something?

brianmcgee commented 3 weeks ago

Thanks for the detailed breakdown, I agree it's not so straightforward :thinking:

I'll try and find some time in the next day or so to point my brain at this properly.

brianmcgee commented 2 weeks ago

I'm undecided how to resolve this 'completely' but in the meantime I've gone ahead and separated the global exclude processing https://github.com/numtide/treefmt/pull/318

This should improve things in the short-term @jaen

jaen commented 2 weeks ago

Thanks! Adding a dummy true linter wasn't really a big deal with treefmt-nix, but at least the config is cleaner now.