Open richardebeling opened 2 years ago
Yeah, this is highly related to https://github.com/PyCQA/isort/issues/1895 and probably resolved by https://github.com/PyCQA/isort/pull/1900.
It sounds like your specific issue is:
- `node_modules`
- With `--skip-gitignore`, it actually runs slower because it reads every file then filters afterwards
- Without `--skip-gitignore`, it just takes too long
Just encountered this same problem with isort 5.10.1. I have a `data` directory with millions of files. My `.gitignore` includes `/data/` as one of the directories to skip. I see the following performance metrics:
$ isort --skip-gitignore . # 23 min
$ isort --extend-skip data . # 43 sec
If I delete the `data` directory, `isort .` takes < 2 seconds. Both 23 min and 43 sec are prohibitively long, so I usually end up just not using isort and manually sorting my imports 😢
Description
When starting isort with `--skip_gitignore`, directories that should not be traversed because they are configured in `skip` will be traversed.
From what I've seen, this is purely a performance problem: all files that should be skipped are still filtered out at some later point.
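To illustrate the difference described above, here is a minimal sketch of the two traversal strategies. It is not isort's actual code: the function names `walk_filter_late` and `walk_prune_early` are made up for this example, and matching on bare directory names is a simplifying assumption. The first variant lists every directory and discards files afterwards; the second removes skipped directories from the walk so they are never opened.

```python
import os
from typing import Iterable, List, Set, Tuple


def walk_filter_late(root: str, skip: Iterable[str]) -> Tuple[List[str], int]:
    """Visit every directory, then drop files that live under a skipped name.

    Returns the kept .py files and the number of directories visited.
    """
    skip_names: Set[str] = set(skip)
    found: List[str] = []
    visited = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        visited += 1
        rel_parts = os.path.relpath(dirpath, root).split(os.sep)
        if skip_names.intersection(rel_parts):
            # The directory has already been listed; only its files are dropped.
            continue
        found.extend(
            os.path.join(dirpath, name) for name in filenames if name.endswith(".py")
        )
    return sorted(found), visited


def walk_prune_early(root: str, skip: Iterable[str]) -> Tuple[List[str], int]:
    """Remove skipped directories from the walk so they are never entered."""
    skip_names: Set[str] = set(skip)
    found: List[str] = []
    visited = 0
    for dirpath, dirnames, filenames in os.walk(root):
        visited += 1
        # Editing dirnames in place tells os.walk (top-down mode) not to descend.
        dirnames[:] = [d for d in dirnames if d not in skip_names]
        found.extend(
            os.path.join(dirpath, name) for name in filenames if name.endswith(".py")
        )
    return sorted(found), visited
```

Both functions return the same file list, but the second one visits far fewer directories when a skipped directory such as `node_modules` is large, which matches the "purely a performance problem" observation.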
Instructions to reproduce
Minimal setup (note: "node_modules" is in the default skip values.)
On a side note: Add more files, and the node_modules folder will be queried more often:
Adding `node_modules` to a `.gitignore` doesn't help, even though the directory should then be double-ignored by isort (this is #1895).
Environment & Versions
Run on Ubuntu 21.10 with Python 3.9.7. The actual syscalls probably depend on the Python interpreter. If you can't reproduce this, I'll probably be able to figure out which Python code causes the access.
Related Issues
Closely related to #1895, but if I understand that issue correctly, it does not talk about `skip` at all, but just files from .gitignore. ~It seems to me that the current PR for #1895, #1900, would fix this.~ Possibly related: #1762 and #1777, both of which read as if they were fixed.
Background
Yesterday, I realized that `isort` takes >90 seconds to run on one project where it processes 107 files with approximately 10k lines of code. Turns out, it doesn't need 90 seconds to process, but rather 89s to traverse what feels like the entire file system, only to then do its actual work in ~1s, which is absurd considering that `black` also finds the code files and finishes in 3 seconds. I wasn't able to get `isort` to skip reading directories it really doesn't need to access (venv, node_modules, web server static files, ...)
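For anyone trying to find out which directories dominate the traversal in their own project, a rough diagnostic sketch follows. It is independent of isort, and the helper names `count_entries` and `entries_per_top_level_dir` are hypothetical. It counts the files and directories below each top-level directory, assuming that raw entry count is a reasonable proxy for traversal cost.

```python
import os
from typing import Dict


def count_entries(path: str) -> int:
    """Count files and directories below path, without following symlinks."""
    total = 0
    stack = [path]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    total += 1
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            # Unreadable directories are ignored in this sketch.
            continue
    return total


def entries_per_top_level_dir(root: str) -> Dict[str, int]:
    """Map each top-level directory name under root to its entry count."""
    counts: Dict[str, int] = {}
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                counts[entry.name] = count_entries(entry.path)
    return counts


if __name__ == "__main__":
    # Print the largest directories first.
    ranked = sorted(entries_per_top_level_dir(".").items(), key=lambda kv: -kv[1])
    for name, count in ranked:
        print(f"{count:>10}  {name}")
```

Directories near the top of the output (for example venv or node_modules) are the ones worth checking against the skip configuration.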