jish / pre-commit

A slightly improved pre-commit hook for git
https://jish.github.io/pre-commit/
Other
796 stars 96 forks source link

Do not run pre-commit checks on images #262

Closed jish closed 7 years ago

jish commented 7 years ago

cc: @jfly

We have a mechanism to determine if we should run our list of checks on the files being committed. Generally these checks are pretty simplistic. For example, does the file exist, or is it a directory?

Also, in order to avoid running checks on large binary files we run a heuristic against every file currently being committed. Unfortunately, this binary check is prone to errors and is now causing performance issues.

In order to avoid this binary file test, we do some quick filtering based on file extension. If the file is an image, we definitely do not want to run checks on it. If the file is a Ruby source code file, we definitely do want to run checks on it.

This will allow us to avoid running the binary test in most cases.

Fixes #255

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.04%) to 97.326% when pulling 237c97299b5c28879390c0927e1e2df84783b699 on josh/quick-extension-filter into 96b3b435885e55964738052df476ec771641536a on master.

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.04%) to 97.326% when pulling 1ea9ba2701fd8cdff6e64bf6c962a3e1029c0e49 on josh/quick-extension-filter into 96b3b435885e55964738052df476ec771641536a on master.

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.04%) to 97.324% when pulling dd0e7066302c27123a8527f2c8ae96e125a1e768 on josh/quick-extension-filter into 96b3b435885e55964738052df476ec771641536a on master.

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.03%) to 97.317% when pulling 08f7ad3f148b278e6e7f3c67d25ce6383e077ff3 on josh/quick-extension-filter into 96b3b435885e55964738052df476ec771641536a on master.

mpapis commented 7 years ago

@jish I have experimented with the benchmark and no matter of the project size the difference was almost nonexistent

Other problem might be that on my current work project it takes ~31.5s to calculate the list, this is way to long, with:

      BINARIES = [".img", ".iso", ".dmg", ".xz", ".tar", ".rar", ".zip", ".gz"].freeze
      IMAGES = [".jpg", ".jpeg", ".png", ".gif", ".ico", ".pdf"].freeze
      IGNORED_EXTENSIONS = BINARIES + IMAGES

      SOURCE_FILES = [".md", ".rb", ".js", ".html", ".css", ".json", ".yml", ".erb", ".coffee", ".slim", ".liquid", ".sass", ".scss", ".feature"].freeze

it dropped to ~6.35s, still to long, but a lot better

mpapis commented 7 years ago

I guess we could add a lot more to the ignored list and use plugins as a information which extensions need to be included in source_files (and dropped from ignored list)

jish commented 7 years ago

"This looks good to me! One simplification I would consider: get rid of all the source_file? and binary? logic in favor of just having a list of file extensions you don't check."

In concept, I totally agree with you. Ideally, we would never have to run the binary test. Unfortunately, in the wild there are files that simply do not have extensions, and it is a much better user experience to approximate the correct behavior rather than accidentally blocking commits or throwing exceptions.

I am all for extending the list of known file extensions giving more consistent behavior and performance. But I think we should keep the binary test in place as a fallback in case new file extensions emerge that we had not anticipated.

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.03%) to 97.317% when pulling 1bacb6ee28439a8f980cec32edd21dee7795b009 on josh/quick-extension-filter into 96b3b435885e55964738052df476ec771641536a on master.