We've followed the docs to establish a spot data feed integration, but are facing a 403 error as displayed on the /diagnostics page.
For context, our setup is as follows:
CUR in the payer account
CUR bucket is replicated to a bucket in account A
Athena integration is in account A
EKS cluster is in account B and accesses data in account A via IRSA ---> CUR integration is healthy
Spot Data Feed is in account B and writes to a bucket in account A ---> Spot data feed integration yields 403 error
Helm config:
kubecostProductConfigs:
  athenaBucketName: s3://redacted
  athenaDatabase: athenacurcfn_kubecost
  athenaProjectID: 'Account A ID'
  athenaRegion: eu-central-1
  athenaTable: kubecost
  athenaWorkgroup: Kubecost
  awsSpotDataBucket: redacted  # no s3 prefix
  awsSpotDataPrefix: ''  # there is no prefix configured for the spot data feed
  awsSpotDataRegion: eu-central-1
  clusterName: redacted
  masterPayerARN: 'redacted'  # role in account A
  projectID: 'account B'  # where the cluster lives
kubecostToken: 'redacted'
prometheus:
  nodeExporter:
    enabled: false
  serviceAccounts:
    nodeExporter:
      create: false
  server:
    resources:
      limits:
        memory: 4096Mi
      requests:
        cpu: 500m
        memory: 2048Mi
serviceAccount:
  create: false
  name: kubecost
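The two bucket settings in this config follow different conventions: athenaBucketName carries the s3:// scheme, while awsSpotDataBucket is a bare bucket name. A minimal sketch of a sanity check on those two values, assuming the conventions shown in this config hold; the function name check_bucket_settings and the sample values are hypothetical and not part of Kubecost:

```python
def check_bucket_settings(product_configs):
    """Return a list of warnings for the two bucket-related settings.

    Assumes the conventions used in the config above: athenaBucketName
    includes the s3:// scheme, awsSpotDataBucket is a bare bucket name.
    """
    warnings = []
    athena_bucket = product_configs.get("athenaBucketName", "")
    spot_bucket = product_configs.get("awsSpotDataBucket", "")

    if not athena_bucket.startswith("s3://"):
        warnings.append("athenaBucketName should start with s3://")
    if not spot_bucket:
        warnings.append("awsSpotDataBucket is empty")
    elif spot_bucket.startswith("s3://") or "/" in spot_bucket:
        warnings.append("awsSpotDataBucket should be a bare bucket name")
    return warnings


# Hypothetical values standing in for the redacted ones.
example = {
    "athenaBucketName": "s3://example-athena-results",
    "awsSpotDataBucket": "example-spot-feed",
    "awsSpotDataPrefix": "",
}
print(check_bucket_settings(example))
```

An empty list means neither setting tripped a check; it says nothing about whether the bucket is actually reachable.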
My understanding is that the spot data feed feature assumes the same role as the CUR integration, as specified by the masterPayerARN property.
We have tested the relevant bucket and IAM policies and can confirm that the service account's role is able to call the ListObjects API on the spot data feed bucket via the AWS CLI.
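For anyone reproducing this setup, the access described above generally comes down to a policy of the following shape. This is a minimal sketch, not the policy from this issue: the bucket name, account ID, role name, and the helper spot_feed_read_policy are all hypothetical. The one fixed point is that ListObjects is authorized by s3:ListBucket on the bucket ARN, while s3:GetObject applies to the objects under it.

```python
import json


def spot_feed_read_policy(bucket, role_arn):
    """Build a bucket policy granting list and read access to one role.

    ListObjects is authorized by s3:ListBucket on the bucket ARN itself,
    while s3:GetObject applies to the objects (bucket ARN plus /*).
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListSpotFeedBucket",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:ListBucket",
                "Resource": bucket_arn,
            },
            {
                "Sid": "ReadSpotFeedObjects",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:GetObject",
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


# Hypothetical bucket and role, standing in for the redacted values.
policy = spot_feed_read_policy(
    "example-spot-feed",
    "arn:aws:iam::111111111111:role/example-kubecost-role",
)
print(json.dumps(policy, indent=2))
```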
Steps to reproduce
N/A - we don't know why this is happening. Configuration is outlined above.
Expected behavior
The ListObjects API call should succeed with a 200 status code, since the masterPayerARN role has permission to perform it.
Impact
High. The majority of our workloads run on spot nodes. Without this feature we will need to wait for CUR reconciliation, which is not a practical time frame for our business needs.
Screenshots
No response
Logs
WRN Skipping AWS spot data download: operation error S3: ListObjects, https response error StatusCode: 403, RequestID: 10XE9D9CFAYGT01P, HostID: redacted, api error AccessDenied: Access Denied
WRN got error 9 error(s) retrieving volumes: [operation error EC2: DescribeVolumes, failed to sign request: failed to retrieve credentials: failed to refresh cached credentials, failed to retrieve credentials, operation error STS: AssumeRoleWithWebIdentity, exceeded maximum number of attempts, 3, https response error StatusCode: 400, RequestID: 10443fba-78f6-49f3-bf28-a599bbd44b04, InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.eu-central-1.amazonaws.com/id/redacted
ERR savings: cluster sizing: failed to get monthly cluster rates: error getting valid asset set in MonthlyNodeClusterRates: could not obtain latest valid asset set (an AssetSet where all Assets (i.e. Nodes) have NodeType != "" and TotalCost > 0.0
ERR error creating spot-ready workload distributions: error fetching monthly cluster rates: error getting valid asset set in MonthlyNodeClusterRates: could not obtain latest valid asset set (an AssetSet where all Assets (i.e. Nodes) have NodeType != "" and TotalCost > 0.0
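The two warnings above come from different AWS calls: the first is an S3 ListObjects call returning 403, the second an STS AssumeRoleWithWebIdentity call returning 400 while signing an EC2 DescribeVolumes request. A minimal sketch for pulling the service, operation, and status code out of such lines, assuming the error format seen in these logs; the helper summarize_aws_errors is hypothetical and the sample lines are shortened copies of the ones above:

```python
import re

# Matches the AWS SDK error shape seen in the log lines above:
# "operation error <Service>: <Operation>, ... StatusCode: <code>, ..."
OPERATION_RE = re.compile(r"operation error (\w+): (\w+)")
STATUS_RE = re.compile(r"StatusCode: (\d{3})")


def summarize_aws_errors(line):
    """Return the (service, operation) pairs and HTTP status codes in a line."""
    operations = OPERATION_RE.findall(line)
    statuses = [int(code) for code in STATUS_RE.findall(line)]
    return operations, statuses


# Shortened copies of the two warnings from the logs above.
spot_line = (
    "WRN Skipping AWS spot data download: operation error S3: ListObjects, "
    "https response error StatusCode: 403, api error AccessDenied: Access Denied"
)
volume_line = (
    "WRN got error 9 error(s) retrieving volumes: [operation error EC2: "
    "DescribeVolumes, failed to sign request: failed to retrieve credentials, "
    "operation error STS: AssumeRoleWithWebIdentity, exceeded maximum number "
    "of attempts, 3, https response error StatusCode: 400, InvalidIdentityToken"
)

for line in (spot_line, volume_line):
    print(summarize_aws_errors(line))
```

Separating the lines this way makes it easier to see that the 403 and the 400 are raised by different calls and may need to be investigated separately.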
Slack discussion
No response
Troubleshooting
[X] I have read and followed the issue guidelines and this is a bug impacting only the Kubecost application.
[X] I have searched other issues in this repository and mine is not recorded.
Kubecost Version
1.108.0
Kubernetes Version
v1.27.9
Kubernetes Platform
EKS