sassoftware / viya4-iac-aws

This project contains Terraform configuration files to provision the infrastructure components required to deploy SAS Viya platform products on Amazon Web Services (AWS).
Apache License 2.0

Fix autoscaling policy to correctly handle eks:DescribeNodegroup permission #292

Status: Closed (bkoprivica closed this 1 week ago)

bkoprivica commented 3 months ago

Summary

This PR fixes the IAM policy for the Cluster Autoscaler to correctly handle the eks:DescribeNodegroup permission.

Rationale

The Cluster Autoscaler encounters the following error, indicating an AccessDeniedException for "eks:DescribeNodegroup":

E0604 20:35:15.324713 1 aws_manager.go:308] Failed to get labels from EKS DescribeNodegroup API for nodegroup cas-202401... in cluster viya-... because AccessDeniedException: User: arn:aws:sts::7... /viya-...-cluster-autoscaler/17... is not authorized to perform eks:DescribeNodegroup on resource: arn:aws:eks:ca-central-1:7...:nodegroup/viya.../cas-202401...-dea0-52....

The condition in the existing policy is written for Auto Scaling groups (ASGs), but eks:DescribeNodegroup operates on EKS managed node groups. IAM might be checking for these tags on the node group resource, not just on the underlying ASG. As a result, tags set on the ASGs might not propagate or apply in the way the policy expects.
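To make the rationale above concrete, here is a minimal Python sketch of a policy shape in which eks:DescribeNodegroup sits in a statement with no condition, while the ASG tag condition stays on the scaling actions only. The statement IDs, the tag key, and the exact action list are assumptions for illustration, not a copy of the policy in this PR or in the IaC project. The sketch relies on one IAM behavior: a StringEquals condition does not match when its key is absent from the request context, so an `autoscaling:ResourceTag/...` condition on a statement would presumably never match an EKS API call.

```python
import json

# Hypothetical tag key for illustration; the real cluster name is elided in the log above.
ASG_TAG_KEY = "autoscaling:ResourceTag/k8s.io/cluster-autoscaler/<cluster-name>"


def build_autoscaler_policy():
    """Return an illustrative IAM policy document for the Cluster Autoscaler."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read-only calls, including the EKS one, carry no condition.
                "Sid": "DescribeUnconditional",  # hypothetical Sid
                "Effect": "Allow",
                "Action": [
                    "autoscaling:DescribeAutoScalingGroups",
                    "autoscaling:DescribeAutoScalingInstances",
                    "ec2:DescribeLaunchTemplateVersions",
                    "eks:DescribeNodegroup",
                ],
                "Resource": "*",
            },
            {
                # Mutating calls stay restricted to ASGs that carry the tag.
                "Sid": "ScaleTaggedGroups",  # hypothetical Sid
                "Effect": "Allow",
                "Action": [
                    "autoscaling:SetDesiredCapacity",
                    "autoscaling:TerminateInstanceInAutoScalingGroup",
                ],
                "Resource": "*",
                "Condition": {"StringEquals": {ASG_TAG_KEY: "owned"}},
            },
        ],
    }


def unconditional_actions(policy):
    """Collect the actions allowed by statements that have no Condition block."""
    actions = set()
    for statement in policy["Statement"]:
        if statement.get("Effect") == "Allow" and "Condition" not in statement:
            listed = statement["Action"]
            actions.update([listed] if isinstance(listed, str) else listed)
    return actions


if __name__ == "__main__":
    print(json.dumps(build_autoscaler_policy(), indent=2))
```

Calling `unconditional_actions(build_autoscaler_policy())` is a quick way to confirm that eks:DescribeNodegroup is not gated behind the ASG tag condition before the JSON is handed to Terraform.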

dhoucgitter commented 3 months ago

@bkoprivica, thanks for submitting this PR and raising this as an issue.

sbralg commented 1 week ago

I was able to replicate the error messages reported in this PR.

It seems the suggested IAM policy documented in the autoscaler project was updated 8 months ago. The previous example looks similar to the policy the IaC project currently creates, which is what causes the error message.

As such, I believe the policy created by the IaC project should be updated to reflect the current policy requirement as documented in the autoscaler project.

I was able to provision an environment using the following file, which includes the conditions also used by the IaC and doesn't cause the errors that were reported: viya4-iac-aws.zip

dhoucgitter commented 1 week ago

Hi @bkoprivica, @sbralg, we appreciate your efforts to inform us about the cluster-autoscaler policy change needed to eliminate the unwanted error in the cluster-autoscaler pod log.

You can expect the update to be present in the upcoming release of viya4-iac-aws through #302.

bkoprivica commented 1 week ago

Subject: Request to Reopen PR #292 for Proper Review and Attribution

Hi @maintainers,

I am writing to formally request the reopening of PR #292, which was closed prematurely without merging the fix I provided. The purpose of PR #292 was to resolve the IAM policy issue for the autoscaler, and I claim that it provided the core structure and logic of the fix that is now being incorporated in PR #302.

Here are the reasons why I believe this request is necessary:

Failure to Attribute Properly:
    While both @sbralg and I were acknowledged, the lines of attribution have been blurred. It is unclear who is receiving credit for the core solution, even though I originally submitted the fix three months ago in PR #292.

Closed PR with Unmerged Fix:
    My PR was closed without merging, even though it contained a working solution that directly addressed the IAM policy issue. The solution in PR #302 reiterates the same fix, essentially restating what was provided in my original PR. This seems to sidestep proper attribution and may result in confusion regarding the origin of the fix.

Possible Copyright Infringement:
    The fix in PR #302 appears to incorporate the core structure and logic of my original solution. If PR #302 is using my work without proper attribution, this could be a violation of copyright and the terms of the Apache 2.0 license.

Attempt to Divert Credit:
    The situation surrounding PR #302 and the way my PR was closed gives the appearance of an attempt to divert credit for the original solution I provided in PR #292. This is especially concerning given that my fix addressed a significant issue with SAS Viya autoscaling, which has substantial commercial and technical implications.

Given the above points, I request that you reopen PR #292 for proper review and to ensure that appropriate attribution is given to my contribution.

Thank you for your time and attention to this matter.

Best regards, @bkoprivica