pantsbuild / pants

The Pants Build System
https://www.pantsbuild.org
Apache License 2.0

`regex-lint` (formerly `validate`) should allow path blocklisting #7942

Open jsirois opened 5 years ago

jsirois commented 5 years ago

Right now, content checks are restricted to path sets built up from path allowlists. Often it would be more robust to apply a content check to all files of a certain type except for a small handful. In those cases, adding to a blocklist is both more concise and more robust than enumerating allowlisted directories.

Concretely, this sort of config today:

x-path-pattern: &path-pattern
  content_encoding: utf8

path_patterns:
  # We use several python path patterns here to avoid src/python/kythe which contains 3rdparty code
  # that we don't have license over (python proto stubs for kythe-licensed protobuf files).
  python_prod:
    <<: *path-pattern
    pattern: prod/.+(?<!__init__)\.py$
  python_source:
    <<: *path-pattern
    pattern: src/python/toolchain/.+(?<!__init__)\.py$
  python_test:
    <<: *path-pattern
    pattern: test/.+(?<!__init__)\.py$

content_patterns:
  python_header:
    # NB: We match an optional shebang line here.
    pattern: |-
      ^(#![^\n]+
      )?# Copyright © 20\d\d Toolchain Labs, Inc\. All rights reserved\.
      #
      # Toolchain Labs, Inc\. CONFIDENTIAL
      #
      # This file includes unpublished proprietary source code of Toolchain Labs, Inc\.
      # The copyright notice above does not evidence any actual or intended publication of such source code\.
      # Disclosure of this source code or any related proprietary information is strictly prohibited without
      # the express written permission of Toolchain Labs, Inc\.

required_matches:
  python_prod:
    - python_header
  python_source:
    - python_header
  python_test:
    - python_header

Could be written as follows with a blocklist:

x-path-pattern: &path-pattern
  content_encoding: utf8

path_patterns:
  # We don't have license over the python files under src/python/kythe (python proto stubs for
  # kythe-licensed protobuf files).
  python_source:
    <<: *path-pattern
    pattern: (?<!__init__)\.py$
    exclude: src/python/kythe/.+\.py$

content_patterns:
  python_header:
    # NB: We match an optional shebang line here.
    pattern: |-
      ^(#![^\n]+
      )?# Copyright © 20\d\d Toolchain Labs, Inc\. All rights reserved\.
      #
      # Toolchain Labs, Inc\. CONFIDENTIAL
      #
      # This file includes unpublished proprietary source code of Toolchain Labs, Inc\.
      # The copyright notice above does not evidence any actual or intended publication of such source code\.
      # Disclosure of this source code or any related proprietary information is strictly prohibited without
      # the express written permission of Toolchain Labs, Inc\.

required_matches:
  python_source:
    - python_header
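
To make the intended semantics concrete, here is a minimal Python sketch of how a path pattern with an `exclude` key might be evaluated. It is not the existing Pants implementation: the `PathPattern` class, its `matches` method, and the sample paths are hypothetical, and it assumes both regexes are applied with search (unanchored) semantics, which the proposed `(?<!__init__)\.py$` pattern requires.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PathPattern:
    """Hypothetical model of one `path_patterns` entry with an optional blocklist."""

    pattern: str
    exclude: Optional[str] = None  # proposed key; assumed to be a regex like `pattern`

    def matches(self, path: str) -> bool:
        # A path is selected only if it matches the allow pattern...
        if re.search(self.pattern, path) is None:
            return False
        # ...and does not match the exclude pattern (when one is configured).
        return self.exclude is None or re.search(self.exclude, path) is None


# Mirrors the `python_source` entry from the proposed config above.
python_source = PathPattern(
    pattern=r"(?<!__init__)\.py$",
    exclude=r"src/python/kythe/.+\.py$",
)

# Illustrative paths only; they are not taken from a real repo listing.
for candidate in (
    "src/python/toolchain/util.py",
    "src/python/toolchain/__init__.py",
    "src/python/kythe/stub_pb2.py",
):
    print(candidate, python_source.matches(candidate))
```

Under these assumptions, every non-`__init__` Python file is checked for the header unless it lives under `src/python/kythe`, so new source roots are covered without touching the config.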

jsirois commented 5 years ago

Note that we should be able to delete `build-support/bin/check_header.py` and replace its invocations with `./pants --no-v1 --v2 validate ::` once this enhancement allows us to fully configure `build-support/regexes/config.yaml` for our codebase.