Closed adamehirsch closed 4 years ago
From a prior art perspective, flake8
has this same issue (marked: WONTFIX) -- that said, I personally think the decision in flake8
is incorrect/inconsistent and that implementing this is a good idea :)
I'll wait for the black maintainers to indicate whether they think it's a good idea; no sense writing a PR for something that'll be rejected on concept.
--include=
and --exclude=
are only consulted for recursive search, not for files passed on the command line.
How is the decision of flake8 inconsistent here? The rationale is rather simple: by default, we exclude some paths and only include some file extensions in recursive search. But if you specifically give us a file path on the command line which doesn't match the file extension or would otherwise belong to the exclusion list, you probably know what you're doing.
I do agree that interfacing with CI and editors is surprising in this case so I'm not downright rejecting changing this. But I need to carefully consider whether there are backwards compatibility disasters hiding in changing this. And even if we agree to do this, what does it mean for defaults in --include=
? Should I reject non- .py/.pyi files from now on unless somebody clears --include=
on their call?
This is not going to be straight-forward to change but let's try to figure something out. Anthony, what would you suggest?
ah let me clarify -- black
and flake8
currently have the same behaviour here (they are consistent).
Maybe it's a bit snowflakey but intuitively it makes a lot of sense for me to apply --exclude
even if passed on the commandline. This simplifies a lot of editor configurations, pre-commit, and even just black foo/*.py
. I don't think --include
should be applied except in the recursive case however.
That said, there's certainly arguments in both directions -- it may very well be simpler to only apply it during the recursive routines.
the one thing I usually point at here is "pre-commit is better at running your linter than your linter is" because it can take advantage of a few things:
pre-commit
knows which files are part of your version controlThough for tools that often means you have to configure both pre-commit exclusion and tool exclusion and keep them in sync (or a superset / subset of each other). It would be nice to only configure this in one place and for most that usually means "configure it using the tool's configuration". But then you run into OP's issue :)
My confusion stemmed from the following line from black's own readme on pre-commit:
Avoid using args in the hook. Instead, store necessary configuration in pyproject.toml so that editors and command-line usage of Black all behave consistently for your project.
I know this specifically says "args", but I took this to mean "pyproject.toml is preferable to pre-commit's yaml when both have a similar config." It made sense at the moment (without having fully grasped the concept of how everything worked), that since black .
worked with the exclude in black's config, that it should work with pre-commit run black
as well.
If the answer is "exclude will only ever exclude on recursive searches" that's fine, but the readme should note that in the pre-commit section ("pre-commit passes specific files, not a recursive search, so you should exclude files in pre-commit as well"). But I would fully support some sort of "--hard-exclude" that is a definitive "black will not run on these files, no matter what"
We'll definitely hard-exclude things that are .gitignore'd (see #475). Other than that, I don't think there's anything actionable here. If I pass a file explicitly to the tool, I expect it to be acted on.
What I think we could do instead is to convince @asottile to let pre-commit run black
on the entire Git repo on every commit. The main reason it's not doing so is performance as far as I can tell. But Black does cache its executions so it won't even touch files that are already well-formed. Plus, then it will correctly follow exclusion and inclusion lists. This would minimize configuration for users, too.
Would a PR with an update to the readme be warranted then?
Avoid using args in the hook. Instead, store necessary configuration in pyproject.toml so that editors and command-line usage of Black all behave consistently for your project. See Black's own pyproject.toml for an example.
This heavily implies that pre-commit should not have any config (although it admittedly just calls out args
specifically, as a first time user this was confusing). Some clarification that pre-commit will call files explicitly, thus will ignore black's exclude
list would be nice.
I'd like to wait for what Anthony thinks about it. If he agrees that we should change the hook setup so that Black is ran on the entire repo every commit, then we will do just that and the README then remains fine.
Otherwise, yeah, we will have to explicitly call out in the README that --exclude= in pyproject.toml is not used by pre-commit. I would like to avoid this.
first, it's possible to enable this behaviour but I don't think it's desirable (as it sidesteps the benefits of the framework):
- id: black
args: [.]
pass_filenames: false
always_run: true
pre-commit is (generally) better at running linters than linters themselves are, here's a couple of reasons why:
pre-commit
knows about what files are in version control and will never run against files which aren.t There's no need to parse / worry about .gitignore
/ exclude .git|.tox|venv|...
, etc. / recurse / etc.pre-commit
knows how to lint extensionless executables with a conventional shebang (#!
)pre-commit
knows when or when not to run the linter (only passing filenames which change, not even executing at all when there are zero files which change)An example of a case where always running black on all the files is (very) wrong is during a merge conflict resolution. pre-commit will only execute a linter on files which conflict or are manually changed avoiding the headache of waiting for black
(or other linters) to run across every file that changed in the upstream (and potentially dealing with other people's mistakes as a punishment for merging).
(gotta run, hope this succinct reply is enough, if not I can elaborate / link some more prose on this -- hope it helps!)
When using an editor integration that calls black with a changed file's path (which it could do automatically, such as on a post save action), this behaviour also means that it would reformat the file even if it's excluded by the config.
So I would be in favor of either changing the default to always consider the ignore/exclude rules, or to include an option to do so even when a full path is provided.
Hi everyone,
I also run into this problem with multiple editors and this behavior was surprising to me, as a user.
IMHO you don't want to have multiple configurations stating the same files to exclude: pyproject.toml, pre-commit (not everyone using the tool named pre-commit
), VIM, PyDev, IntelliJ etc.
We have many tools in the team so it's essential one configuration will be used by everyone.
I think that in terms of usability, surprising the user is never a good idea. We can add two flags (tentative naming) to help black behave nicely when an excluded file is given to black as an argument:
-f/--force
to force formatting an excluded file, return 0 (or whatever we return from the formatting)--ignore-excluded
which means black silently ignore the excluded files, formatting the other files (if any) and return 0 (assuming the rest was fine)If this change of behavior is agreed upon, I can try and create the PR.
Any progress/decisions on this?
I agree with Anthony that pre-commit is in a better position to figure out when to run Black on what, so if we could somehow get away with only ever running Black through pre-commit, that would provide a way to solve this problem.
I don't think it's OK to assume Black is always going to be called through pre-commit though, so we do need a separate include/exclude mechanism (and unfortunately we do need to worry about parsing .gitignore
et al) for when it's called directly from the command line or through editors.
As for the performance concerns around running black .
on the entire repo every time you commit: after the first run, it should be O(changed files) instead of O(repo) because of Black's cache.
So I think all in all we should encourage people to configure Black with the following in their .pre-commit-config.yaml
- id: black
args: [.]
pass_filenames: false
always_run: true
please don't suggest that per above, thanks
even with black's cache it's going to be the wrong thing during merge conflicts and you're back to linting files that aren't checked in and you have to do filesystem traversals which themselves are pretty slow
given how frequent this comes up I'm considering changing flake8
's behaviour to honor flake8 foo.py --exclude foo.py
to be a noop (as silly as such a command is) and perhaps if that goes well black should do the same
plus, even a trivial invocation of black
(with no files) takes ~250ms which would be 0ms in the case when there's no files to run
https://github.com/psf/black/issues/438#issuecomment-495238640 is the closest to a proposal here, but I'm not sure that we need extra parameters. I think all we need to do is change the behavior of --exclude
:
Desired Behavior:
--include
is used when recursively searching a path (EXISTING)--exclude
is always used on each filepath, no matter whether the filepath was explicitly passed in or discovered (NEW)Use cases:
black --exclude '' excluded_file.py
pre-commit
can run black
with every changed file, and black
will exclude what should be excluded: black modified_included_file.py modified_excluded_file.py
.The latter use case is important since during merge conflicts, we will only run on the files that the user edited, rather than on every file that anyone has edited.
weird I thought I mentioned this but I guess not, if you're only invoking through pre-commit
it's usually better to use pre-commit
's exclude: ...
pattern:
repos:
- repo: ...
rev: ...
hooks:
- id: black
exclude: ^testing/test_data/
or if you're globally excluding
repos:
exclude: ^vendor/
- repo: ...
rev: ...
hooks:
- id: black
This works for me (in my limited use cases):
diff --git a/black.py b/black.py
index 05edf1a..05ba5af 100644
--- a/black.py
+++ b/black.py
@@ -441,7 +441,23 @@ def main(
)
elif p.is_file() or s == "-":
# if a file was explicitly given, we don't care about its extension
- sources.add(p)
+ try:
+ normalized_path = "/" + p.resolve().relative_to(root).as_posix()
+ except ValueError:
+ if p.is_symlink():
+ report.path_ignored(
+ p, f"is a symbolic link that points outside {root}"
+ )
+ continue
+
+ raise
+
+ exclude_match = exclude_regex.search(normalized_path)
+ if exclude_match and exclude_match.group(0):
+ report.path_ignored(p, f"matches the --exclude regular expression")
+ continue
+ else:
+ sources.add(p)
else:
err(f"invalid path: {s}")
if len(sources) == 0:
I have added a PR following @chebee7i proposal, which I think is the most sensical https://github.com/psf/black/pull/1032. I think that black needs to work well when integrating with other tools (editors, pre-commit hooks) for larger adoption. this concerns probably were less of a thing some years ago when flake8 was developed, but if we look at other modern tools and ecosystems (node: prettier, eslint, husky) this is the expected behavior.
this is the 4th top commented issue in the repo, it would help a lot of people if some attention could be spared on this. There is an open PR, so my question is why this is not moving forward? a) simply no time to look into this? b) won't fix? c) something else?
They are all acceptable answers :) but I'd like to set some expectations for myself
@itajaja well for one your open PR has no tests and has a merge conflict -- I also don't think there's been any concrete proposal / agreement on the way to move forward yet
it has a merge conflict because is a month old, and I wouldn't keep rebasing if there was no interest. For tests, as I mention in the PR:
I Have not added or amended tests yet, but if we agree on the approach, I can work on them
so, yes, it was a way to throw a more concrete proposal and discussion. Sometimes is better to actually do a PR than discuss on the issue. you say that there hasn't been a concrete proposal, I'd say it doesn't get more concrete than a PR 😄
so, yes, what's missing is an agreement, and for that we need core maintainers and contributors to chime in. I am just trying to bring attention on an issue that seems fairly popular.
Thanks!
asking again. I think the ball is in the core contributor's court to decide what to do, or at least give an update. Any answer is legitimate (won't fix, no time for this, alternative proposal) but at least an answer will help people set expectations :v: This is something that needs to come from core contributors. What we can do is give awareness that this is a problem for many people, but we need guidance in order to solve this.
Hello all, I'm using VSCode and would also like to see this option come to Black for a more correct format-on-save behaviour. Maybe there could be an optional boolean flag like --force-hard-exclude
(similar to isort's --filter-files
) which flips the behaviour to unconditionally honoring the exclude list regardless of the command line arguments.
And thanks for the fantastic tool! :slightly_smiling_face::+1:
it's really sad to see no answer at all from maintainers. we simply forked with the least intrusive modification and moved on with our lives...
To echo a previous comment, blacks PyCharm editor integration suggests setting up with the FileWatchers plugin to run black whenever a file changes. Because --exclude
does not work when passing a file directly, every time you install something into a .venv
in your project directory, black goes through every file and formats it 😮 !
FileWatchers doesn't provide a way to exclude directories from being watched (an issue in its own right), but it leaves anyone integrating with PyCharm and Black at an impasse. Let black format every installed package, clocking your CPU on pip installs, or don't use the watcher integration and use pre-commit instead.
I understand the hesitancy to introduce breaking changes by modifying this behavior. It seems the only work around is to add a CLI feature flag that opts into the new behavior, and warns people relying on the existing "format direct file paths even if excluded behavior" that it is deprecated and the default will change in a future release.
@itajaja could you provide a link to the fork?
@audiolion https://github.com/psf/black/pull/1032
Ah ok so you didnt publish the fork to PyPI but are doing an install from github directly?
yep
FileWatchers doesn't provide a way to exclude directories from being watched (an issue in its own right)
I think you can configure a scope on your file watcher configuration that excludes the .venv dir (and any others that shouldn't be touched).
@c0state for the life of me I couldn't find way to do that, or docs from jetbrains on it.
@c0state for the life of me I couldn't find way to do that, or docs from jetbrains on it.
See https://www.jetbrains.com/help/idea/settings-scopes.html. All file watchers have a scope. The default is "Project Files" I believe which will probably format stuff you don't want. I usually set one up that includes my project files and then explicitly excludes venv/.venv
, build, etc. dirs and such. If all your projects look the same (structurally), you can make this scope global and use it in all your projects, otherwise you can set this up manually for each project.
This is not a thing to discuss, but it is simply a bug of black. If I have a repo containing other file types than Python, I can't add a pre commit hook executing black on all files to be committed. That's because black can't format e.g. markdown files.
Usually, a pre-commit hook is done with something like:
git diff --name-only --diff-filter=ACM
Now I could add additional greps to only include py files, etc., but I already specified all of that in my pyproject.toml. I think in our project we must now maintain a fork :-(
@steinsag if you use something like pre-commit it's easy to configure black to run only on python files.
For those in need of getting a fixed version, my colleague created a patched version https://github.com/axiros/axblack, which is also available on pypi https://pypi.org/project/axblack/. Please note, axblack uses single quotes instead of double quotes.
I'm considering supporting the idea of applying --exclude
and --include
for files explicitly passed on the command line.
There's one big concern with this approach I've seen above: black file
doesn't do anything because file
is either not included (in either pyproject.toml
or the default \.pyi?$
pattern) or it's excluded. I see this as a problem in these two cases:
The issue is that this file-filtering configuration is implicit to some degree. It would be OK if everything is explicitly on the command line:
black --include '\.py$' file.pyi
not working is understandableblack file.pyi
with a pyproject.toml
containing the above include not working is mildly annoyingblack bin/myscript
relying on the defaults not working is badbin/myscript
with the default config, and that not working is really badI think this is a big problem because of --include
: the defaults for it cannot capture (what I think is) a fairly common use case. --exclude
is probably fine IMO.
So how about being pragmatic about this, swallowing the inconsistency and saying: --include
applies only to traversing directories (like today) and --exclude
is applied to all command line arguments? @ambv have you changed your opinion?
I echo your concern -- it's probably less surprising if only --exclude
is applied
Not to muddy the waters any. I see it looks like a consensus is forming around having the exclusion rule also apply to file passed in via the command line (and I hope this, too, will go for flake8
).
But thought I'd point out isort
"fixed" this by adding another flag filter_files
, which says
Tells isort to filter files even when they are explicitly passed in as part of the command. This is especially useful to get skip and skip_glob to work when running isort through pre-commit.
I don't think @ambv has weighed in on his opinion of another command-line flag (which I see has been brought up), which solves this problem in a backwards-compatible way (since @ambv did mention backwards-compatibility is something to not take lightly).
I like the idea of --filter-files
. Let's name it --force-exclude
.
@itajaja, thanks for persevering with this issue. If you made your pull request to use --force-exclude
, I will gladly merge it.
🙏 I'll take a stab!
@ambv and here's the stab 🔪 https://github.com/psf/black/pull/1032 All tests pass, so we are pretty confident it's backward compat. If this looks good, I'll add tests, but let me know what tests you think would be useful
uhm...
Can we get the above PR reviewed? The change seems pretty small and it seems like it would benefit a fair number of folks.
@ambv Hey Łukasz :) Sorry to bother you - we actually hit that very specific issue and taking a stab at this PR would really help us out!
@vishalkuo Seems it's time to fix the pylint build job probably with a pylint single line exception first :wink:
black.py:274:1: C901 'main' is too complex (20)
Would be great to have some solution to this issue soon. It's quite annoying.
black --check --exclude=.migrations/ --diff app-code > black.diff This seems to work for me.
We'll definitely hard-exclude things that are .gitignore'd
https://github.com/psf/black/issues/438#issuecomment-424803691
~Currently due to a bug (#1572), this works as described 🙃 . Even if you explicitly pass a file to Black, it will still ignore the file if it's gitignore'd. Unfortunately, this bug also breaks the behaviour of --exclude
, causing it to act like --force-exclude
so it has to be fixed.~ This is no longer the case as #1591 fixes this bug, but the rest of this comment still applies.
@ambv are you still in favor of having Black always refuse to format files that are gitignore'd, or should we stay with current ~go back to stable
~ behaviour where .gitignore only applies to files found recursively? That idea of yours goes back almost two years ago, so you could be no longer in favor of it.
Operating system: OSX Python version: 3.6.2 Black version: black, version 18.6b4
The problem: certain directories in our repo contain generated python code that we don't want black to change. We've configure our repo to run black via pre-commit. Pre-commit invokes black with a list of changed files on the command line, and black's exclude regex does not work against those files and paths.
i.e.
This makes us sad, since we've carefully put exclusion regexes into our pyproject.toml and black doesn't honor them when pre-commit calls it. Instead, we're having to workaround by configuring pre-commit to skip that path:
The behavior we'd like to see is that black's exclude regex would apply even when full file paths are listed on the commandline. I'd be happy to try for a PR if this seems like desirable behavior to anyone else...