crossplane-contrib / provider-ansible

Crossplane provider to execute Ansible contents remotely inside a Kubernetes cluster.
Apache License 2.0

AnsibleRun Support in Composition/XR #172

Open ride808 opened 1 year ago

ride808 commented 1 year ago

What happened?

I can create an AnsibleRun resource without issue and run an inline ansible playbook. However, I'm unable to add the same AnsibleRun resource as a part of a larger crossplane composition/XR. Should it be possible to use AnsibleRun within an XR?

How can we reproduce it?

Creating an AnsibleRun resource like the following works without issue:

apiVersion: ansible.crossplane.io/v1alpha1
kind: AnsibleRun
metadata:
  name: ansible-example
spec:
  forProvider:
    playbookInline: |
      ---
      - hosts: localhost
        tasks:
          - name: ansibleplaybook-example
            debug:
              msg: You are running 'ansibleplaybook-example' example
  providerConfigRef:
    name: provider-ansible

When adding the same resource to an XR like below, the other resources (EC2 and SecurityGroup) in the composition are created, but the ansiblerun resource is not created:

apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: admin-server
  labels:
    crossplane.io/xrd: xadmininstances.aws.hades.org
    provider: provider-aws
spec:
  writeConnectionSecretsToNamespace: crossplane-system
  compositeTypeRef:
    apiVersion: hades.org/v1alpha1
    kind: XAdminInstance
  resources:
  - name: securitygroup
    base:
      apiVersion:  ec2.aws.crossplane.io/v1beta1
      kind: SecurityGroup
      spec:
        forProvider:
          region: us-east-1
          vpcId: vpc-0186b862b83f5cd71
          description: Admin server for Environment
          ingress:
            - fromPort: 0
              toPort: 65535
              ipProtocol: tcp
              ipRanges:
                - cidrIp: 10.77.77.20/32
            - fromPort: 22
              toPort: 22
              ipProtocol: tcp
              ipRanges:
                - cidrIp: 10.77.77.10/32
            - fromPort: 22
              toPort: 22
              ipProtocol: tcp
              ipRanges:
                - cidrIp: 166.28.0.0/16
            - fromPort: 443
              toPort: 443
              ipProtocol: tcp
              ipRanges:
                - cidrIp: 0.0.0.0/32
            - fromPort: 8080
              toPort: 8084
              ipProtocol: tcp
              ipRanges:
                - cidrIp: 0.0.0.0/32
            - fromPort: 80
              toPort: 80
              ipProtocol: tcp
              ipRanges:
                - cidrIp: 0.0.0.0/0
        providerConfigRef:
          name: provider-aws
    patches:
      - type: FromCompositeFieldPath
        fromFieldPath: "metadata.name"
        toFieldPath: "spec.forProvider.groupName"
  - name: admin-instance
    base:
      apiVersion:  ec2.aws.crossplane.io/v1alpha1
      kind: Instance
      spec:
        forProvider:
          region: us-east-1
          imageId: ami-02ae903c0b1d9fd12
          instanceType: t3.medium
          keyName: hades-key
          blockDeviceMappings:
          - deviceName: /dev/sdx
            ebs:
              volumeType: gp3
          subnetId: subnet-08d3a539398176845
          securityGroupSelector:
            matchControllerRef: true
          tags:
            - key: Name
              value: somogyi-admin
        providerConfigRef:
          name: provider-aws
    patches:
      - type: FromCompositeFieldPath
        fromFieldPath: "spec.parameters.storageGB"
        toFieldPath: "spec.forProvider.blockDeviceMappings[0].ebs.volumeSize"
  - name: ansibleconfig
    base:
      apiVersion: ansible.crossplane.io/v1alpha1
      kind: AnsibleRun
      spec:
        forProvider:
          playbookInline: |
            ---
            - hosts: localhost
              tasks:
                - name: ansibleplaybook-example
                  debug:
                    msg: Hello world!
        providerConfigRef:
          name: provider-ansible

What environment did it happen in?

Crossplane version: 1.10.1
provider-ansible:v0.4.0
provider-aws:v0.33.0

AshleyDumaine commented 1 year ago

FWIW, I'm also using AnsibleRun in a Composition and after some troubleshooting (just started learning / using Crossplane this week), I was able to get AnsibleRun to work as long as I don't specify a name for this resource or any others in the Composition and patch in the Claim namespace to AnsibleRun:

<snip>
    - base:
        apiVersion: ansible.crossplane.io/v1alpha1
        kind: AnsibleRun
        spec:
          forProvider:
            vars:
              ansible_ssh_user: root
              ansible_ssh_private_key_file: ./ssh_id
              ansible_ssh_common_args: '-o StrictHostKeyChecking=no'
              rke2_download_kubeconf: True
              rke2_download_kubeconf_path: /tmp/
              rke2_cni: cilium
              rke2_token: <redacted>
            roles:
              - name: lablabs.rke2
                src: lablabs.rke2
          providerConfigRef:
            name: default
          writeConnectionSecretToRef:
            name: rke2-install
            namespace: upbound-system
      patches:
        - fromFieldPath: spec.claimRef.namespace
          toFieldPath: metadata.namespace
        - fromFieldPath: spec.claimRef.name
          toFieldPath: metadata.name
        - type: CombineFromComposite
          combine:
            variables:
              - fromFieldPath: status.master0IPAddress
              - fromFieldPath: status.master1IPAddress
              - fromFieldPath: status.master2IPAddress
            strategy: string
            string:
              fmt: '{"all":{"children":{"k8s_cluster":{"children":{"masters":{"hosts":{"%s":null,"%s":null,"%s":null}}}}}}}'
          toFieldPath: spec.forProvider.inventoryInline

I'm using the image built from the main branch.
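
To illustrate what the CombineFromComposite patch above produces: assuming the three status fields hold the hypothetical addresses 10.0.0.1, 10.0.0.2 and 10.0.0.3, the fmt string renders a JSON inventory into spec.forProvider.inventoryInline. Written out as a YAML inventory, that JSON corresponds to roughly the following (a sketch of the rendered value, not output captured from a cluster):

```yaml
# Rendered inventory for the hypothetical IPs 10.0.0.1-10.0.0.3.
# Group names (k8s_cluster, masters) come from the fmt string above.
all:
  children:
    k8s_cluster:
      children:
        masters:
          hosts:
            10.0.0.1:
            10.0.0.2:
            10.0.0.3:
```

Each host maps to null because the fmt string sets no per-host variables.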

ride808 commented 1 year ago

Thanks @AshleyDumaine. That got it working for me!! Really appreciate the response.

ron1 commented 1 year ago

I was able to get AnsibleRun to work as long as I don't specify a name for this resource or any others in the Composition and patch in the Claim namespace to AnsibleRun

@ride808 @fahedouch Do you agree that AnsibleRun should be usable within Compositions w/out the restrictions described by @AshleyDumaine above?

If so, should this issue be formally re-opened to request removal of the restrictions described above? Or, should this issue remain closed and the following issues be opened instead?

fahedouch commented 1 year ago

@ron1 Would you please create an issue with this information and then re-close this one? Thanks.

ride808 commented 1 year ago

@AshleyDumaine @fahedouch One last question. In Ashley's snippet above, the var ansible_ssh_private_key_file: ./ssh_id is set in the forProvider section of the AnsibleRun resource. I assume this tells the provider to use that private key when executing the role/playbook, etc. in the provider pod. However, how do you get a private key of your choosing into the provider pod? I'm getting SSH unreachable errors, and I'm guessing that it's because there is no SSH key in the provider container:

[ec2-user@ip-166-28-20-77 ~]$ kubectl exec -it provider-ansible-325ec633e2d4-67db7b66fc-5tr66 -n crossplane-system sh

$ ps auxx
PID   USER     TIME  COMMAND
    1 ansible  21:50 crossplane-ansible-provider
  593 ansible   1:02 [ansible-playboo]
 1283 ansible   0:15 {ansible-playboo} /usr/bin/python3 /usr/bin/ansible-playbook -e @/ansibleDir/19ecb472-e7bf-417f-80e0-9a2d6632da0e/env/extravars playbook.yml
 1344 ansible   0:00 {ansible-playboo} /usr/bin/python3 /usr/bin/ansible-playbook -e @/ansibleDir/19ecb472-e7bf-417f-80e0-9a2d6632da0e/env/extravars playbook.yml
 1492 ansible   0:00 sh
 1498 ansible   0:00 ps auxx

$ cat ansibleDir/19ecb472-e7bf-417f-80e0-9a2d6632da0e/env/extravars
{"ansible_provider_meta":{"somogyi-admin":{"state":"present"}},"ansible_ssh_common_args":"-o StrictHostKeyChecking=no","ansible_ssh_private_key_file":"./ssh_id","ansible_ssh_user":"root"}/ $

It seems I'm definitely misunderstanding something here and how a provider is configured to allow connections to your provisioned managed resources. Where should ./ssh_id be coming from? Please let me know if you'd like me to open this as a question/new issue.

ron1 commented 1 year ago

@ride808 See this comment which might help.

AshleyDumaine commented 1 year ago

@ride808 I was using a ProviderConfig with a Secret holding the private key. This ProviderConfig is referenced in my snippet's providerConfigRef.

Example (that puts the private key at ./ssh_id):

apiVersion: ansible.crossplane.io/v1alpha1
kind: ProviderConfig
metadata:
  name: default
  namespace: upbound-system
spec:
  credentials:
    - filename: ssh_id
      source: Secret
      secretRef:
        key: <key-name>
        name: <secret-name>
        namespace: upbound-system
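
For completeness, the Secret referenced by the ProviderConfig above might look like the following. This is a minimal sketch: the Secret name ansible-ssh-key and the key ssh_id are hypothetical stand-ins for <secret-name> and <key-name>, and the key material is a placeholder.

```yaml
# Hypothetical Secret holding the SSH private key for the ProviderConfig.
apiVersion: v1
kind: Secret
metadata:
  name: ansible-ssh-key       # assumed; use this as <secret-name>
  namespace: upbound-system
type: Opaque
stringData:
  # assumed key name; use this as <key-name>
  ssh_id: |
    -----BEGIN OPENSSH PRIVATE KEY-----
    <private key contents>
    -----END OPENSSH PRIVATE KEY-----
```
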

fahedouch commented 1 year ago

We also support inventories, if that helps.

ride808 commented 1 year ago

@AshleyDumaine @fahedouch How do you manage dependencies? In my composition, my AnsibleRun resource executes against the EC2 instance I provisioned in the same composition, using an inline inventory. But the AnsibleRun is created in parallel with the Instance resource, so I get Unreachable errors that kill my playbook. After the instance reaches a running state, I can delete the AnsibleRun resource in my cluster and let it be re-created, and then the playbook executes and completes. I've been unsuccessful at using wait_for_connection in the playbook, as it just hangs and never exits. Did you encounter this? Any tips? Also, the AnsibleRun resource didn't seem to retry after the failed playbook (should it?). The only way I could get the playbook to rerun was by deleting the AnsibleRun resource and letting Crossplane re-create it.

AshleyDumaine commented 1 year ago

@ride808 I believe this issue is relevant: https://github.com/crossplane/crossplane/issues/2072

Specifically for AnsibleRun operating against newly provisioned Crossplane-managed VMs in the Composition, I've had to add the following to get it to work more reliably:

ansible_ssh_common_args: '-o StrictHostKeyChecking=no -o ConnectionAttempts=10 -o ConnectTimeout=60'
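
In context, that setting sits with the other connection variables under spec.forProvider.vars of the AnsibleRun, as in the earlier snippet. A sketch of just that fragment (ansible_ssh_user and the key path are carried over from the earlier example and may differ in your setup):

```yaml
# Fragment of an AnsibleRun spec; only the vars relevant to SSH retries are shown.
spec:
  forProvider:
    vars:
      ansible_ssh_user: root
      ansible_ssh_private_key_file: ./ssh_id
      # Retry the TCP connection up to 10 times, waiting up to 60s per attempt.
      ansible_ssh_common_args: '-o StrictHostKeyChecking=no -o ConnectionAttempts=10 -o ConnectTimeout=60'
```
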

ride808 commented 1 year ago

Hmm. That didn't seem to do the trick either. I can see the options getting passed in to the ssh call:

  712 ansible   0:00 {ansible-runner} /usr/bin/python3.8 /usr/bin/ansible-runner run ansibleDir/71cdbead-151d-458f-b5be-c4d8c0562d6c -p playbook.yml
  714 ansible   0:05 {ansible-playboo} /usr/bin/python3 /usr/bin/ansible-playbook -e @/ansibleDir/71cdbead-151d-458f-b5be-c4d8c0562d6c/env/extravars playbook.yml
  723 ansible   0:00 ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o IdentityFile="./sshkey" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User="ec2-user" -o ConnectTimeout=10 -o StrictHostKeyChecking=no -o ConnectionAttempts=10 -o ConnectTimeout=60 -o ControlPath=/home/ansible/.ansible/cp/144eb13078 166.28.17.129 /bin/sh -c 'echo ~ec2-user && sleep 0'

But after 30 seconds the ansible-playbook and ansible-runner processes in the provider pod just stop and my playbook never finishes. Seems to be the provider prematurely killing the playbook.

Could it be that this fix isn't in a release yet, and the provider is killing my Ansible playbook too quickly? https://github.com/crossplane-contrib/provider-ansible/pull/177/

fahedouch commented 1 year ago

@ride808 This is fixed by https://github.com/crossplane-contrib/provider-ansible/pull/177. Would you please retry with the Docker image built from the main branch?

ron1 commented 1 year ago

@fahedouch It seems a new 0.4.1 release that contains all the fixes/improvements sitting on the main branch would be very welcome here.

fahedouch commented 1 year ago

@fahedouch It seems a new 0.4.1 release that contains all the fixes/improvements sitting on the main branch would be very welcome here.

I am planning to release v0.4.1 by the end of the week.

ron1 commented 1 year ago

@ron1 Would you please create an issue with this information and then re-close this one? Thanks.

Unfortunately I don't have an environment at the moment in which to reproduce these issues. @ride808 would you consider creating two new replacement issues and closing this one?

ride808 commented 1 year ago

@fahedouch @ron1 the main image containing #177 did fix my issue. Is there a way to set that timeout? I didn't see any docs with the pull request on how to configure the provider and can see my playbooks taking longer than the default 20m.

ride808 commented 1 year ago

@ron1 Would you please create an issue with this information and then re-close this one? Thanks.

Unfortunately I don't have an environment at the moment in which to reproduce these issues. @ride808 would you consider creating two new replacement issues and closing this one?

@ron1 I'll try to create those two issues against the project today and will close this one when I do.

fahedouch commented 1 year ago

@ride808

@fahedouch @ron1 the main image containing #177 did fix my issue. Is there a way to set that timeout? I didn't see any docs with the pull request on how to configure the provider and can see my playbooks taking longer than the default 20m.

To override the default timeout or other flags (e.g. poll, ansible-collections-path), you have to set up a ControllerConfig resource with the new timeout value (e.g. args: ["--timeout","50m"]). Then reference your ControllerConfig resource from your Provider resource using controllerConfigRef.

I'm not sure whether controllerConfigRef is picked up dynamically for an existing Provider resource. You may have to redeploy the provider for it to take effect.
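
The steps described above could look roughly like this. It is a sketch under assumptions: the ControllerConfig name provider-ansible-config is hypothetical, the package reference is a placeholder for whatever image you already install, and only the --timeout flag mentioned above is set.

```yaml
# Hypothetical ControllerConfig passing a longer timeout to the provider.
apiVersion: pkg.crossplane.io/v1alpha1
kind: ControllerConfig
metadata:
  name: provider-ansible-config   # assumed name
spec:
  args:
    - --timeout
    - 50m
---
# Provider referencing the ControllerConfig above.
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-ansible
spec:
  package: <provider-ansible package image>   # placeholder, keep your existing value
  controllerConfigRef:
    name: provider-ansible-config
```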

Maybe we should add an FAQ to address these kinds of questions!

@ron1 I'll try to create those two issues against the project today and will close this one when I do.

thanks

adamhouse commented 1 year ago

Any more info on the namespace requirement? Same as @ride808 and @AshleyDumaine, I'm unable to compose an AnsibleRun resource unless I patch in a namespace. Is this intentional? I don't recall having to explicitly set the namespace for other providers (like AWS and Terraform).

janwillies commented 9 months ago

I can confirm @AshleyDumaine's observations that an AnsibleRun in a composition only works when:

  1. the base name is not specified
  2. metadata.name is specified
  3. metadata.namespace is specified

I really wonder how you figured the first one out. This is such a weird issue.
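
The three conditions above can be sketched by rewriting the ansibleconfig entry from the original report. This is a minimal sketch of the workaround, not a confirmed fix: it assumes the XR is created through a Claim (so spec.claimRef is populated), and, per the earlier comment, the other entries in the Composition would also need their name removed.

```yaml
# Sketch: AnsibleRun entry in a patch-and-transform Composition.
resources:
  # 1. No `name:` on this entry.
  - base:
      apiVersion: ansible.crossplane.io/v1alpha1
      kind: AnsibleRun
      spec:
        forProvider:
          playbookInline: |
            ---
            - hosts: localhost
              tasks:
                - name: ansibleplaybook-example
                  debug:
                    msg: Hello world!
        providerConfigRef:
          name: provider-ansible
    patches:
      # 2. metadata.name is patched in from the Claim.
      - type: FromCompositeFieldPath
        fromFieldPath: spec.claimRef.name
        toFieldPath: metadata.name
      # 3. metadata.namespace is patched in from the Claim.
      - type: FromCompositeFieldPath
        fromFieldPath: spec.claimRef.namespace
        toFieldPath: metadata.namespace
```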

If you don't follow the workaround, the error message is:

cannot generate a name for composed resource "ansible-run": an empty namespace may not be set when a resource name is provided

These workarounds only work in legacy patch-and-transform (P&T) compositions. With composition functions, the result is:

cannot compose resources: cannot update composite resource spec.resourceRefs: failed to create typed patch object (/example-run-rxm4q; crossplane.accenture.com/v1alpha1, Kind=XAnsible): .spec.resourceRefs[0].namespace: field not declared in schema''

alejandro-ripoll commented 5 months ago

Still facing this issue when using a cluster-scoped AnsibleRun within a pipeline composition after upgrading to the newly released v0.6.0.

---
apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
  name: ansibletests.custom-api.example.org
spec:
  group: custom-api.example.org
  names:
    kind: AnsibleTest
    plural: ansibletests
  versions:
  - name: v1alpha1
    served: true
    referenceable: true
    schema:
      openAPIV3Schema: {}
---
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: ansibletest
spec:
  compositeTypeRef:
    apiVersion: custom-api.example.org/v1alpha1
    kind: AnsibleTest
  mode: Pipeline
  pipeline:
  - step: run-ansible
    functionRef:
      name: function-go-templating
    input:
      apiVersion: gotemplating.fn.crossplane.io/v1beta1
      kind: GoTemplate
      source: Inline
      inline:
        template: |
          apiVersion: ansible.crossplane.io/v1alpha1
          kind: AnsibleRun
          metadata:
            annotations:
              {{ setResourceNameAnnotation "run-ansible" }}
          spec:
            forProvider:
              playbookInline: |
                ---
                - hosts: localhost
                  tasks:
                  - name: ansibletest
                    debug:
                      msg: This Is A Test

  - step: automatically-detect-ready-composed-resources
    functionRef:
      name: function-auto-ready
---
apiVersion: custom-api.example.org/v1alpha1
kind: AnsibleTest
metadata:
  name: my-test
spec: {}

Errors:

Events:
  Type     Reason             Age                 From                                                             Message
  ----     ------             ----                ----                                                             -------
  Normal   SelectComposition  108s                defined/compositeresourcedefinition.apiextensions.crossplane.io  Successfully selected composition: ansibletest
  Normal   SelectComposition  108s                defined/compositeresourcedefinition.apiextensions.crossplane.io  Selected composition revision: ansibletest-9289b63
  Warning  ComposeResources   46s (x7 over 108s)  defined/compositeresourcedefinition.apiextensions.crossplane.io  cannot compose resources: cannot generate a name for composed resource "run-ansible": an empty namespace may not be set when a resource name is provided

Is this expected? Is there any known workaround for this?

Thanks

janwillies commented 5 months ago

Is the CRD cluster- or namespace-scoped in your Kubernetes cluster?

alejandro-ripoll commented 5 months ago

Hi @janwillies, I'm using the new cluster-scoped CRD that was introduced in v0.6.0.

NAME                                    SHORTNAMES          APIVERSION                                          NAMESPACED   KIND
ansibleruns                                                 ansible.crossplane.io/v1alpha1                      false        AnsibleRun