humanmade / linter-bot

Automatically run the HM Coding Standards on any repository.
https://github.com/apps/hm-linter
16 stars 3 forks source link

Bot failing with an error Could not run: Error: EMFILE: too many open files, open #144

Open Nikschavan opened 4 years ago

Nikschavan commented 4 years ago

Example error message -

Could not run: Error: EMFILE: too many open files, open '/tmp/repos/<path-to-project-file>'

A few request details for the failed events -

GitHub Event ID: e7cbb380-198a-11eb-8a5f-d35bd8318f74
API Gateway ID: 0ca6a609-fca4-4a1e-b8b8-a203b28eae48
Lambda ID: cbdcf4a0-b436-48f0-a988-fcbd3edcc33b
Log Stream: 2020/10/29/[$LATEST]a8772e2e0a5540499c07a2ecd124b1cd
GitHub Event ID: bf977280-1900-11eb-80a4-1cf41f98d348
API Gateway ID: 2dd7582d-fda8-4d63-8617-2716c8e72259
Lambda ID: e3a8ac34-1484-49f3-933a-8438f96d6cee
Log Stream: 2020/10/28/[$LATEST]84c118eac9f44c1e982f993ec1c3a0f1
GitHub Event ID: e1533100-19b6-11eb-978c-2e43580e319b
API Gateway ID: 359265e1-62bb-4e0e-8f59-73df7984d1cb
Lambda ID: a653b867-e36c-4c57-b05a-64cee2a11270
Log Stream: 2020/10/29/[$LATEST]d22dab4f77984e51a486adf41f059328
rmccue commented 4 years ago

Could be that the project is too big 😬

ntwb commented 4 years ago

Heard it might be from some node_modules paths, would it be possible to ignore any instances of this path?

rmccue commented 4 years ago

We download the repo as a zip, so would only affect you if you had node_modules committed into the repo; is that the case?

Nikschavan commented 4 years ago

node_modules is not committed to the project, It may just be that the project has too many files.

Is there any solution to this problem?

ntwb commented 4 years ago

git ls-files | wc -l results in 1904 files

I download the repo zip and extracting reveals 2395 files

Nikschavan commented 3 years ago

One question I have is - This does pass on the next run if I select "Re-run", The error is intermittent. If it's related only because of the project being large then it should fail in all cases?

Are there any other ways we can improve to fix it?

ntwb commented 3 years ago

This does pass on the next run if I select "Re-run"

I've just restarted the linter-bot ~15 times on a single PR and it fails to complete with that same error

ntwb commented 3 years ago

After somewhere between 30 & 60 triggers of re-run for 3 PRs I've not been able to get the bot to pass these PRs, as such we're now overriding the linter-bot when necessary to merge these PRs

tareiking commented 3 years ago

The failures seem inconsistent in their cause. As mentioned if project file count was higher than the limit, we would have HM-Linter fail on all instances.

I don't know much about Lambda (or maybe a related cache) but does it persist data across instances/runs, which could lead to the count being higher as previous runs still have files laying about?

Is it possible to get the output of the log events mentioned in the description and dig further?

tfrommen commented 3 years ago

@rmccue it seems HM Linter is using the native fs, right? Did anyone try and use something like graceful-fs? Should be fairly quick to run a test—if you have it set up locally, already—but I don't have to the time right now to set up local testing for HM Linter... 😶

ntwb commented 3 years ago

Copying in the related log file entry from cloudwatch:

2021-01-07T17:43:17.455Z 2cbbacc1-10d9-4839-a963-eb39416bbc0a INFO [Error: EMFILE: too many open files, open '/tmp/repos/humanmade-repo-name-87cca255fdc7aef56a6c399031b33b6fef5a3e89/content/themes/theme-name/page.php'] { errno: -24, code: 'EMFILE', syscall: 'open', path: '/tmp/repos/humanmade-repo-name-87cca255fdc7aef56a6c399031b33b6fef5a3e89/content/themes/theme-name/page.php'}

rmccue commented 3 years ago

@rmccue it seems HM Linter is using the native fs, right? Did anyone try and use something like graceful-fs?

It doesn't, but I'm not sure if it'll fix it. graceful-fs' approach is to wait to retry opening when it receives an EMFILE, but not all files are opened by Node, so I suspect this might just end up waiting forever.

We can try it, but I wouldn't be betting on it personally.

ntwb commented 3 years ago

Yeah, I quickly whipped up #148 to try graceful-fs, I've not tried testing locally yet, the instructions for testing locally here seem straight forward but replicating this issue locally I think will be difficult.

tfrommen commented 3 years ago

@ntwb des that mean that running HM Linter locally on your project does not run into this issue? 🤔

ntwb commented 3 years ago

@tfrommen Correct, I've not been able to reproduce the issue locally, and it's only intermittent on the project

rmccue commented 3 years ago

Your ulimit -n is likely different locally to the live Lambda environment.

tareiking commented 3 years ago

Naive suggestion incoming: Is it possible to only pass the files within the PR to be linted in an attempt to reduce the file count?

rmccue commented 3 years ago

@tareiking It might be; see https://github.com/humanmade/linter-bot/issues/149 - not explicitly a goal of that though.

ntwb commented 3 years ago

This appears to be stylelint crashing in hm-linter rather than that specific error

To workaround the issue, set stylelint to enabled: false in the .github/hmlinter.yml file

stylelint:
    enabled: false
    version: inherit

So, I’m leaning toward this is the root cause of this issue as I've not seen this error occur since disabling stylelint, the stylelint issue may be an issue in hmlinter code base, or upstream in stylelint, I've not yet taken a closer look to debug this further