aws-solutions / quota-monitor-for-aws

This solution leverages AWS Trusted Advisor and Service Quotas to monitor AWS resource usage and raise alerts.
Apache License 2.0

SystemError [ERR_SYSTEM_ERROR]: A system error occurred: uv_os_homedir returned EMFILE (too many open files) #153

Closed kmohanarangam closed 11 months ago

kmohanarangam commented 1 year ago

Describe the bug

The spoke account is not pushing events to the hub account. It looks like the Lambda function is failing when it sends messages to the event bus.

To Reproduce

  1. Set up an AWS organization with 3 accounts, viz. Management, Delegated_Admin, and Workload.
  2. Install the latest quota-monitor-prerequisite.template on the Management account and delegate admin to the Delegated_Admin account.
  3. Install the latest quota-monitor-sq-spoke.template on the Workload account.
  4. Provide 60% as the alert threshold.
  5. Provide an email address for SNS notifications and confirm the subscription.
  6. After the stack is set up, create 4 VPCs in the Workload account.

Expected behavior

An email alert stating that VPC usage has exceeded 60% of the service quota (default: 5 VPCs).
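The arithmetic behind this expectation can be sketched as follows. This is only an illustration of the numbers in the report (4 VPCs against a default quota of 5, with a 60% threshold); the function names are hypothetical, and whether the solution itself compares with a strict or inclusive inequality is an assumption here.

```python
def utilization_percent(usage: int, quota: int) -> float:
    """Return usage as a percentage of the quota."""
    if quota <= 0:
        raise ValueError("quota must be positive")
    return 100.0 * usage / quota


def exceeds_threshold(usage: int, quota: int, threshold_percent: float) -> bool:
    """True when utilization is strictly above the threshold (assumed comparison)."""
    return utilization_percent(usage, quota) > threshold_percent


# Numbers from the reproduction steps: 4 VPCs, default quota 5, threshold 60%.
print(utilization_percent(4, 5))    # 80.0
print(exceeds_threshold(4, 5, 60))  # True
```

With 3 VPCs the utilization would sit exactly at 60%, so the outcome there would depend on how the boundary is treated.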

Please complete the following information about the solution:

2023-08-23T16:01:29.630Z d2fcf924-47ca-40bd-a8c4-45cbbeda2640 ERROR Unhandled Promise Rejection
{
  "errorType": "Runtime.UnhandledPromiseRejection",
  "errorMessage": "SystemError [ERR_SYSTEM_ERROR]: A system error occurred: uv_os_homedir returned EMFILE (too many open files)",
  "reason": {
    "errorType": "SystemError",
    "errorMessage": "A system error occurred: uv_os_homedir returned EMFILE (too many open files)",
    "code": "ERR_SYSTEM_ERROR",
    "info": {
      "errno": -24,
      "code": "EMFILE",
      "message": "too many open files",
      "syscall": "uv_os_homedir"
    },
    "errno": -24,
    "syscall": "uv_os_homedir",
    "stack": [
      "SystemError [ERR_SYSTEM_ERROR]: A system error occurred: uv_os_homedir returned EMFILE (too many open files)",
      " at getHomeDir (/opt/nodejs/node_modules/@smithy/shared-ini-file-loader/dist-cjs/getHomeDir.js:14:29)",
      " at getCredentialsFilepath (/opt/nodejs/node_modules/@smithy/shared-ini-file-loader/dist-cjs/getCredentialsFilepath.js:7:128)",
      " at loadSharedConfigFiles (/opt/nodejs/node_modules/@smithy/shared-ini-file-loader/dist-cjs/loadSharedConfigFiles.js:11:76)",
      " at /opt/nodejs/node_modules/@smithy/node-config-provider/dist-cjs/fromSharedConfigFiles.js:8:102",
      " at /opt/nodejs/node_modules/@smithy/property-provider/dist-cjs/chain.js:11:28",
      " at process.processTicksAndRejections (node:internal/process/task_queues:95:5)",
      " at async coalesceProvider (/opt/nodejs/node_modules/@smithy/property-provider/dist-cjs/memoize.js:14:24)",
      " at async /opt/nodejs/node_modules/@smithy/property-provider/dist-cjs/memoize.js:26:28"
    ]
  },
  "promise": {},
  "stack": [
    "Runtime.UnhandledPromiseRejection: SystemError [ERR_SYSTEM_ERROR]: A system error occurred: uv_os_homedir returned EMFILE (too many open files)",
    " at process.<anonymous> (file:///var/runtime/index.mjs:1186:17)",
    " at process.emit (node:events:513:28)",
    " at emit (node:internal/process/promises:149:20)",
    " at processPromiseRejections (node:internal/process/promises:283:27)",
    " at process.processTicksAndRejections (node:internal/process/task_queues:96:32)"
  ]
}


abewub commented 1 year ago

Step 3 says: "Install latest quota-monitor-sq-spoke.template to Workload account."

If you set up an organization and are using the default Organizations mode, you don't need to deploy the spoke stacks manually.

If you are using hybrid mode in Organizations or the no-Organizations mode, check that you have entered the correct event bus ARN and the correct account number in the SSM Parameter Store parameter '/QuotaMonitor/Accounts' (https://docs.aws.amazon.com/solutions/latest/quota-monitor-for-aws/step-5.-launch-the-spoke-stacks-optional.html).

I couldn't reproduce the error in either the Organizations or the no-Organizations deployment mode.

PS: aws service-quotas list-service-quotas --service-code vpc doesn't return any quota that can be monitored through a usage metric in CloudWatch (i.e., one with UsageMetric.MetricNamespace = AWS/Usage). So even without the errors above, this specific quota can't be monitored through Service Quotas. However, Trusted Advisor does check this limit (https://docs.aws.amazon.com/solutions/latest/quota-monitor-for-aws/architecture-overview.html).
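A rough way to run the same check programmatically is sketched below. It assumes boto3 is installed and credentials are configured for the live call; the helper names and the sample records are made up for illustration and are not real VPC quota codes. The filter keeps only quotas whose UsageMetric.MetricNamespace is AWS/Usage, which is the condition described above.

```python
from typing import Any, Dict, Iterable, List


def quotas_with_usage_metric(
    quotas: Iterable[Dict[str, Any]], namespace: str = "AWS/Usage"
) -> List[Dict[str, Any]]:
    """Keep only quotas that expose a CloudWatch usage metric in the given namespace."""
    return [
        q for q in quotas
        if q.get("UsageMetric", {}).get("MetricNamespace") == namespace
    ]


def fetch_quotas(service_code: str) -> List[Dict[str, Any]]:
    """Fetch all quotas for a service via the Service Quotas API (needs AWS credentials)."""
    import boto3  # imported lazily so the filter above works without boto3

    client = boto3.client("service-quotas")
    paginator = client.get_paginator("list_service_quotas")
    quotas: List[Dict[str, Any]] = []
    for page in paginator.paginate(ServiceCode=service_code):
        quotas.extend(page["Quotas"])
    return quotas


# Made-up records shaped like list-service-quotas output (not real quota codes).
SAMPLE_QUOTAS = [
    {"QuotaCode": "L-EXAMPLE1", "QuotaName": "Example quota without a usage metric"},
    {
        "QuotaCode": "L-EXAMPLE2",
        "QuotaName": "Example quota with a usage metric",
        "UsageMetric": {"MetricNamespace": "AWS/Usage", "MetricName": "ResourceCount"},
    },
]

monitorable = quotas_with_usage_metric(SAMPLE_QUOTAS)
print([q["QuotaCode"] for q in monitorable])  # ['L-EXAMPLE2']
```

Against a real account, `quotas_with_usage_metric(fetch_quotas("vpc"))` would return an empty list if the observation above still holds.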

kmohanarangam commented 1 year ago

I am using the default Organizations mode. I've deleted the stack from the member account. The guide reads: "After you choose the deployment mode, the resources needed for that mode are provisioned. The deployment workflow is invoked when you update the deployed Systems Manager Parameter Store." Is there anything I need to do to update the deployed Systems Manager Parameter Store?

abewub commented 1 year ago

Yes, enter the organization ID or the organizational unit ID (or a comma-separated list of them) in the parameter /QuotaMonitor/OUs of the SSM Parameter Store. The spoke instances will be deployed (with the needed configurations) in all member accounts within a few minutes. The first Service Quotas scan will run after 6 or 12 hours (and then every 6 or 12 hours), depending on the selected parameters.
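As a sketch of that update, the snippet below builds the comma-separated value and writes it with boto3. The helper names and the example IDs are hypothetical, the ID patterns are taken from the AWS Organizations ID formats as I understand them, and the write assumes the parameter already exists in the account and Region where the hub stack created it.

```python
import re
from typing import Iterable

# ID formats assumed from the AWS Organizations API reference.
ORG_ID = re.compile(r"o-[a-z0-9]{10,32}")
OU_ID = re.compile(r"ou-[a-z0-9]{4,32}-[a-z0-9]{8,32}")


def build_ou_parameter_value(ids: Iterable[str]) -> str:
    """Validate organization/OU IDs and join them into a comma-separated string."""
    cleaned = [i.strip() for i in ids if i.strip()]
    if not cleaned:
        raise ValueError("at least one organization or OU ID is required")
    for item in cleaned:
        if not (ORG_ID.fullmatch(item) or OU_ID.fullmatch(item)):
            raise ValueError(f"not an organization or OU ID: {item!r}")
    return ",".join(cleaned)


def update_ou_parameter(value: str, name: str = "/QuotaMonitor/OUs") -> None:
    """Overwrite the existing SSM parameter (needs AWS credentials and boto3)."""
    import boto3  # imported lazily so the validation above works without boto3

    boto3.client("ssm").put_parameter(Name=name, Value=value, Overwrite=True)


# Hypothetical IDs for illustration only.
value = build_ou_parameter_value(["ou-ab12-cdefgh34", " o-exampleorg1 "])
print(value)  # ou-ab12-cdefgh34,o-exampleorg1
```

Calling `update_ou_parameter(value)` would then perform the write; validating first avoids triggering the deployment workflow with a malformed entry.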