awslabs / crossplane-on-eks

Crossplane bespoke composition blueprints for AWS resources
Apache License 2.0

EMR on EKS jobs composition fails with IAMPolicy empty namespace error #107

Closed gastoncan closed 1 year ago

gastoncan commented 1 year ago

Please describe your question here

I am trying the emr-on-eks composition example from the crossplane-on-eks library. My EMRContainer (for EMR job-run) resource status is synced but not ready.

The question is, why is my EMR job resource not being created?

Taking a look at the XEMRContainer resource, I see the following events, which describe two different kinds of errors:

cannot use dry-run create to name composed resource: an empty namespace may not be set during creation
cannot apply the patch at index 9: status: no such field

Events:
  Type  Reason  Age  From  Message
  Normal  SelectComposition  5m6s (x3 over 5m6s)  defined/compositeresourcedefinition.apiextensions.crossplane.io  Successfully selected composition
  Warning  ComposeResources  5m6s  defined/compositeresourcedefinition.apiextensions.crossplane.io  composed resource "irsa-role-only": cannot apply the patch at index 1: status: no such field
  Warning  ComposeResources  5m6s (x2 over 5m6s)  defined/compositeresourcedefinition.apiextensions.crossplane.io  composed resource "read-policy-entrypoint": cannot use dry-run create to name composed resource: an empty namespace may not be set during creation
  Warning  ComposeResources  5m6s (x2 over 5m6s)  defined/compositeresourcedefinition.apiextensions.crossplane.io  composed resource "read-policy": cannot use dry-run create to name composed resource: an empty namespace may not be set during creation
  Warning  ComposeResources  5m6s (x2 over 5m6s)  defined/compositeresourcedefinition.apiextensions.crossplane.io  composed resource "write-policy": cannot use dry-run create to name composed resource: an empty namespace may not be set during creation
  Warning  ComposeResources  5m6s  defined/compositeresourcedefinition.apiextensions.crossplane.io  composed resource "job-run": cannot apply the patch at index 9: status: no such field
  Normal  ComposeResources  5m6s (x2 over 5m6s)  defined/compositeresourcedefinition.apiextensions.crossplane.io  Composed resource "irsa-role-only" is not yet ready
  Normal  ComposeResources  5m6s (x2 over 5m6s)  defined/compositeresourcedefinition.apiextensions.crossplane.io  Composed resource "read-policy-entrypoint" is not yet ready
  Normal  ComposeResources  5m6s (x2 over 5m6s)  defined/compositeresourcedefinition.apiextensions.crossplane.io  Composed resource "read-policy" is not yet ready
  Normal  ComposeResources  5m6s (x2 over 5m6s)  defined/compositeresourcedefinition.apiextensions.crossplane.io  Composed resource "write-policy" is not yet ready
  Normal  ComposeResources  5m6s (x2 over 5m6s)  defined/compositeresourcedefinition.apiextensions.crossplane.io  Composed resource "s3-bucket" is not yet ready
  Normal  ComposeResources  5m6s (x2 over 5m6s)  defined/compositeresourcedefinition.apiextensions.crossplane.io  Composed resource "job-run" is not yet ready
  Warning  ComposeResources  5m6s  defined/compositeresourcedefinition.apiextensions.crossplane.io  composed resource "irsa-role-only": cannot apply the patch at index 1: status.accountId: no such field
  Warning  ComposeResources  5m6s  defined/compositeresourcedefinition.apiextensions.crossplane.io  composed resource "job-run": cannot apply the patch at index 9: status.roleArn: no such field

I created the EMRContainer for the virtual cluster successfully, but the job-run resource fails with the errors above.

This is the XR for the job creation:

apiVersion: awsblueprints.io/v1alpha1
kind: EMRContainer
metadata:
  name: test-job-run
  namespace: dev
spec:
  compositionSelector:
    matchLabels:
      awsblueprints.io/environment: dev
      awsblueprints.io/type: job-run
  resourceConfig:
    providerConfigName: aws-provider
    region: eu-central-1
  eksOIDC: oidc.eks.eu-central-1.amazonaws.com/id/XXXXXXXXXXXXXX # Changed with suitable OIDC
  jobParams:
    sparkEntryPoint: s3://us-west-2.elasticmapreduce/emr-containers/samples/wordcount/scripts/wordcount.py
    sparkSubmitParameters: "--conf spark.executor.instances=2 --conf spark.executor.memory=1G --conf spark.executor.cores=1 --conf spark.driver.cores=1"
    virtualClusterId: "XXXXXXXXXXXXX" # Changed with suitable virtual cluster id

Provide link to the example related to the question

https://github.com/awslabs/crossplane-on-eks/blob/main/examples/aws-provider/composite-resources/emr-on-eks/job-run.yaml

Additional context

I installed the latest Crossplane with the AWS provider package xpkg.upbound.io/crossplane-contrib/provider-aws:v0.37.1.

Is it possible that the latest commits in the compositions are affecting the stability of the examples? https://github.com/awslabs/crossplane-on-eks/commit/9719c583232b5da4373ab98ee0dabfb0a2a8e090


nabuskey commented 1 year ago

Thanks for opening this issue. I think this is a bug introduced with the naming standardization, where we decided to name the cluster-scoped resource XIAMPolicy and the namespace-scoped one just IAMPolicy. Because of that, the composition is trying to create claims (which are namespaced) rather than composite resources, which matches the empty namespace error.

If you change the references from IAMPolicy to XIAMPolicy in the composition, it should get created correctly. That is, in this composition, replace every occurrence of IAMPolicy with XIAMPolicy.
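The rename described above can be sketched as a small text rewrite. This is only an illustration, not a script from the repository: the function name `rename_policy_kind` is hypothetical, the sample fragment is made up for the example (only the resource names and the `awsblueprints.io/v1alpha1` API version come from this thread), and it assumes each kind appears on its own `kind:` line.

```python
import re

# Matches a YAML line of the form "kind: IAMPolicy" at any indentation.
# "kind: XIAMPolicy" is not matched, because IAMPolicy must follow "kind:" directly.
_KIND_LINE = re.compile(r"^([ \t]*kind:[ \t]*)IAMPolicy[ \t]*$", re.MULTILINE)


def rename_policy_kind(composition_text: str) -> str:
    """Return the text with every 'kind: IAMPolicy' line changed to XIAMPolicy."""
    return _KIND_LINE.sub(r"\1XIAMPolicy", composition_text)


# Illustrative fragment only; the real composition has more fields.
sample = """\
resources:
  - name: read-policy
    base:
      apiVersion: awsblueprints.io/v1alpha1
      kind: IAMPolicy
  - name: write-policy
    base:
      apiVersion: awsblueprints.io/v1alpha1
      kind: XIAMPolicy
"""

print(rename_policy_kind(sample))
```

Running it twice gives the same result as running it once, so it is safe to apply to a file that has already been partly edited by hand.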

I will have to fix this.

gastoncan commented 1 year ago

Hi @nabuskey. Thanks for the fast feedback, I really appreciate your input. The namespace warnings are gone, yay!

  1. Note that I also added the X prefix to IAMPolicy in the iam-policy/s3-read.yaml and iam-policy/s3-write.yaml files:
    • s3-read.yaml: name: read-s3.iampolicy.awsblueprints.io --> name: read-s3.xiampolicy.awsblueprints.io, and kind: IAMPolicy --> kind: XIAMPolicy
  1. I continued trying to run my jobs and found an additional issue related to the IAM permissions boundary. I found the following error on the Role resource (apiVersion: iam.aws.crossplane.io/v1beta1, kind: Role):

    create failed: failed to create the Role resource: api error ValidationError: 1 validation error detected: Value '''' at ''permissionsBoundary'' failed to satisfy constraint: Member must have length greater than or equal to 20.

To work around it, I just commented out this patch:

- type: FromCompositeFieldPath
  fromFieldPath: spec.permissionsBoundaryArn
  toFieldPath: spec.forProvider.permissionsBoundary

From the example EMRContainer, it looks like permissionsBoundaryArn is optional, but I did not manage to run the example without it.
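For anyone who prefers to script the workaround instead of commenting the patch out by hand, here is a minimal sketch. It assumes the composition's patch list has already been loaded into plain Python dictionaries; the helper name `drop_patches_to` and the second patch in the sample list are hypothetical and only there for illustration.

```python
def drop_patches_to(patches, to_field_path):
    """Return a new list without the patches that write to `to_field_path`."""
    return [p for p in patches if p.get("toFieldPath") != to_field_path]


patches = [
    {
        "type": "FromCompositeFieldPath",
        "fromFieldPath": "spec.permissionsBoundaryArn",
        "toFieldPath": "spec.forProvider.permissionsBoundary",
    },
    {
        # Hypothetical extra patch, included only to show that it is kept.
        "type": "FromCompositeFieldPath",
        "fromFieldPath": "spec.resourceConfig.region",
        "toFieldPath": "spec.forProvider.region",
    },
]

remaining = drop_patches_to(patches, "spec.forProvider.permissionsBoundary")
print(remaining)
```

This has the same effect as the commented-out patch: the Role is created without a permissions boundary, so it is only appropriate when no boundary is required.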

  1. Additional note about the jobParams: the region in jobParams.sparkEntryPoint must be the same as the region where you are running the example. Otherwise, the pods fail with an "S3 list permission denied" error, as mine did. It would be worth adding a short note to the example so people don't struggle with it :)
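The region mismatch from the last point can be caught before applying the manifest. The sketch below is an assumption-laden illustration: it relies on the sample bucket following the `<region>.elasticmapreduce` naming seen in the example entry point, and the function names are hypothetical.

```python
from urllib.parse import urlparse

# Assumed naming convention of the sample bucket, e.g. "us-west-2.elasticmapreduce".
SAMPLE_BUCKET_SUFFIX = ".elasticmapreduce"


def entry_point_region(spark_entry_point):
    """Return the region encoded in the sample bucket name, or None if absent."""
    bucket = urlparse(spark_entry_point).netloc
    if bucket.endswith(SAMPLE_BUCKET_SUFFIX):
        return bucket[: -len(SAMPLE_BUCKET_SUFFIX)]
    return None


def regions_match(spark_entry_point, region):
    """True when the entry point is not a sample bucket or its region matches."""
    found = entry_point_region(spark_entry_point)
    return found is None or found == region


entry_point = (
    "s3://us-west-2.elasticmapreduce/emr-containers/samples/"
    "wordcount/scripts/wordcount.py"
)
if not regions_match(entry_point, "eu-central-1"):
    print("sparkEntryPoint region differs from resourceConfig.region")
```

With the values from the manifest in this issue (entry point in us-west-2, region eu-central-1), the check reports a mismatch. Buckets that do not follow the sample naming are not checked at all.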

Again, thank you very much for your support and great job!

nabuskey commented 1 year ago

@gastoncan Thank you very much for sharing a detailed explanation of what you did to fix it. I've opened a PR to fix it.