dorny / paths-filter

Conditionally run actions based on files modified by PR, feature branch or pushed commits

Files changed in a PR are too many to be fully detected by dorny/paths-filter #227

Open hwhsu1231 opened 4 months ago

hwhsu1231 commented 4 months ago

Problem Description

Recently, I found that when using dorny/paths-filter@v3 (currently https://github.com/dorny/paths-filter/commit/de90cc6fb38fc0963ad72b210f1f284cd68cea36) on a PR with a very large number of changed files, the action seems to miss some of those files during filtering, which leads to incorrect filter results.

What happened?

Over the past few months, I have been trying to create an automation project that localizes CMake documentation.

Workflow to create a PR

First, I wrote a workflow file named ci-sphinx-update.yml, which essentially executes the following steps in order:

  1. Generate/update .pot files by running the sphinx-build command with the gettext builder
  2. Generate/update .po files from the .pot files by running the msgcat or msgmerge command
  3. Create a PR from a feature branch to the master branch with peter-evans/create-pull-request@v6

Thus, theoretically, if this workflow runs from scratch to generate .pot/.po files, the generated PR should include both .pot and .po files. This is indeed what appears from the output logs of peter-evans/create-pull-request@v6. Below is a part of the log extracted from my output. From it, we can see that this PR had a total of 4182 files changed.

Log of 'peter-evans/create-pull-request@v6':

```
[343874be-8420-4a7d-aea3-c36361be72f7 3a5b420f6] pot(3.1): Update pot from Sphinx
 Author: docs-l10n[bot] <157310748+docs-l10n[bot]@users.noreply.github.com>
 4182 files changed, 245931 insertions(+)
 create mode 100644 l10n/3.1/crowdin.yml
 create mode 100644 l10n/3.1/po/ja/LC_MESSAGES/command/add_compile_options.po
 create mode 100644 l10n/3.1/po/ja/LC_MESSAGES/command/add_custom_command.po
 create mode 100644 l10n/3.1/po/ja/LC_MESSAGES/command/add_custom_target.po
 create mode 100644 l10n/3.1/po/ja/LC_MESSAGES/command/add_definitions.po
 ...
 ...
 ...
 create mode 100644 l10n/3.1/pot/variable/PROJECT_VERSION_TWEAK.pot
 create mode 100644 l10n/3.1/pot/variable/UNIX.pot
 create mode 100644 l10n/3.1/pot/variable/WIN32.pot
 create mode 100644 l10n/3.1/pot/variable/WINCE.pot
 create mode 100644 l10n/3.1/pot/variable/WINDOWS_PHONE.pot
 create mode 100644 l10n/3.1/pot/variable/WINDOWS_STORE.pot
 create mode 100644 l10n/3.1/pot/variable/XCODE_VERSION.pot
 create mode 100644 l10n/3.1/version.json
```

Workflow to check status

Next, I also wrote a workflow file named ci-check-status.yml, which uses dorny/paths-filter@v3 to filter .pot files as follows:

- name: Check for *.pot files changed
  id: filter
  if: ${{ steps.evprt.outputs.VERSION != '' }}
  uses: dorny/paths-filter@v3
  with:
    filters: |
      pot:
        - 'l10n/${{ steps.evprt.outputs.VERSION }}/pot/**'

However, when ci-check-status.yml was triggered and attempted to filter the .pot files changed in the PR, I found that it missed nearly all of the .pot files and therefore returned a false result. Below is a part of the log extracted from my output. From it, we can see that dorny/paths-filter@v3 only detected 3000 changed files.

Log of 'dorny/paths-filter@v3':

```
Run dorny/paths-filter@v3
  with:
    filters: pot:
      - 'l10n/3.1/pot/**'
    token: ***
    list-files: none
    initial-fetch-depth: 100
Fetching list of changed files for PR#446 from Github API
  Invoking listFiles(pull_number: 446, per_page: 100)
  Received 100 items
  [added] l10n/3.1/crowdin.yml
  [added] l10n/3.1/po/ja/LC_MESSAGES/command/add_compile_options.po
  [added] l10n/3.1/po/ja/LC_MESSAGES/command/add_custom_command.po
  [added] l10n/3.1/po/ja/LC_MESSAGES/command/add_custom_target.po
  ...
  ...
  ...
  [added] l10n/3.1/po/zh_TW/LC_MESSAGES/variable/CMAKE_SHARED_LINKER_FLAGS.po
  [added] l10n/3.1/po/zh_TW/LC_MESSAGES/variable/CMAKE_SHARED_LINKER_FLAGS_CONFIG.po
  [added] l10n/3.1/po/zh_TW/LC_MESSAGES/variable/CMAKE_SHARED_MODULE_PREFIX.po
  [added] l10n/3.1/po/zh_TW/LC_MESSAGES/variable/CMAKE_SHARED_MODULE_SUFFIX.po
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
Detected 3000 changed files
Results:
Filter pot = false
  Matching files: none
Changes output set to []
```
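One detail in this log may explain why the filter matched nothing at all rather than just a subset: the listed paths appear to be in lexicographic order, and `po/` sorts before `pot/`. If the listing is cut off partway through, everything under `pot/` could fall past the cut. The Python sketch below illustrates that idea with a few paths copied from the logs. The sorted ordering, the scaled-down cap of 3 and the prefix check (a simplified stand-in for the glob `l10n/3.1/pot/**`) are all assumptions made for illustration, not the action's actual matching code.

```python
# Paths copied from the logs above (a tiny sample, not the full PR).
changed = [
    "l10n/3.1/pot/variable/UNIX.pot",
    "l10n/3.1/po/ja/LC_MESSAGES/command/add_compile_options.po",
    "l10n/3.1/crowdin.yml",
    "l10n/3.1/po/zh_TW/LC_MESSAGES/variable/CMAKE_SHARED_MODULE_SUFFIX.po",
    "l10n/3.1/pot/variable/WIN32.pot",
]

# Assumption: a plain prefix test stands in for the glob 'l10n/3.1/pot/**'.
POT_PREFIX = "l10n/3.1/pot/"

# Assumption: a scaled-down cap; the real run stopped at 3000 files.
CAP = 3


def matches_pot(path: str) -> bool:
    """Return True when the path lies under the pot/ directory."""
    return path.startswith(POT_PREFIX)


# Assumption: the API lists files in lexicographic order, as the log suggests.
listing = sorted(changed)
visible = listing[:CAP]  # what the filter would see after truncation

full_result = any(matches_pot(p) for p in listing)
truncated_result = any(matches_pot(p) for p in visible)

print("full listing matches pot:", full_result)
print("truncated listing matches pot:", truncated_result)
```

With the full listing the filter would be true, while the truncated listing contains only `crowdin.yml` and `po/` entries, so the filter evaluates to false.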

Conclusion

From the output logs of the two workflows, I infer that because the PR contains too many changed files (4182 in total), dorny/paths-filter@v3 is unable to load all of them (it detected at most 3000 changed files).
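As a rough sanity check of this inference, the numbers in the two logs can be put through some simple page arithmetic. The Python sketch below assumes that the changed-files listing is capped at 3000 entries (as far as I know, the GitHub REST documentation for listing pull request files states this maximum) and that the action requests one page per 100 files of the PR's total. The second point is a guess about the action's behavior, not something confirmed from its source.

```python
import math

PER_PAGE = 100        # per_page value shown in the paths-filter log
TOTAL_CHANGED = 4182  # "4182 files changed" in the create-pull-request log
API_FILE_CAP = 3000   # assumption: maximum number of files the API will list


def page_plan(total_changed: int, per_page: int, cap: int) -> dict:
    """Estimate how many requested pages carry data when the listing is capped."""
    # Assumption: one request per page needed to cover the PR's total file count.
    pages_requested = math.ceil(total_changed / per_page)
    files_returned = min(total_changed, cap)
    pages_with_data = math.ceil(files_returned / per_page)
    return {
        "pages_requested": pages_requested,
        "pages_with_data": pages_with_data,
        "empty_pages": pages_requested - pages_with_data,
        "files_returned": files_returned,
        "files_missing": total_changed - files_returned,
    }


plan = page_plan(TOTAL_CHANGED, PER_PAGE, API_FILE_CAP)
for key, value in plan.items():
    print(f"{key}: {value}")
```

Under these assumptions the sketch predicts 42 requested pages, 30 of them with data, 12 empty ones and 1182 files never seen by the filter. That would be consistent with the "Detected 3000 changed files" line and with the trailing run of "Received 0 items" entries in the log.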

My questions are as follows:

  1. Is my inference correct?
  2. If so, how can I solve this problem?
  3. If it cannot be solved, is it considered a bug?
  4. If it's indeed a bug, I hope it can be fixed as soon as possible.

hwhsu1231 commented 4 months ago

Might be related to: https://github.com/orgs/community/discussions/57830

kelchm commented 3 months ago

I ran into this as well -- I've not had time to dig into it very much yet, but I did observe that falling back to the git-based change detection sidesteps the issue.

To do that, simply set the token param to an empty string:

      - uses: dorny/paths-filter@v3
        id: filter
        with:
          token: ''