operator-framework / operator-sdk

SDK for building Kubernetes applications. Provides high level APIs, useful abstractions, and project scaffolding.
https://sdk.operatorframework.io
Apache License 2.0

Ansible variable evaluation type mismatch #2609

Closed: Nik-Novak closed this issue 4 years ago

Nik-Novak commented 4 years ago

Bug Report

What did you do?

CR spec:

    spec:
      workshopID: 30001

tasks/main.yaml:

    - debug:
        msg: "{{ workshop_id | type_debug }}"

These statements produced:

    TASK [cleanup : debug] *****
    task path: /opt/ansible/roles/cleanup/tasks/main.yml:17
    ok: [localhost] => {
        "msg": "int"
    }

    TASK [cleanup : debug] *****
    task path: /opt/ansible/roles/cleanup/tasks/main.yml:19
    ok: [localhost] => {
        "msg": "int"
    }

So it evaluates as int, but then causes an error when used in the port field: `ports:`

The error:

    TASK [resources : Create a nodeport service - Workshop's Resources] **********
    task path: /opt/ansible/roles/resources/tasks/main.yml:11
    fatal: [localhost]: FAILED! => {"changed": false, "error": 400, "msg": "Failed to create object: b'{\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metadata\":{},\"status\":\"Failure\",\"message\":\"Service in version \\\\\"v1\\\\\" cannot be handled as a Service: v1.Service.Spec: v1.ServiceSpec.Ports: []v1.ServicePort: v1.ServicePort.NodePort: readUint32: unexpected character: \\\\ufffd, error found in #10 byte of ...|odePort\\\\\":\\\\\"30001\\\\\",\\\\\"po|..., bigger context ...|3b-5bc6dba6cc92\\\\\"}]},\\\\\"spec\\\\\":{\\\\\"ports\\\\\":[{\\\\\"nodePort\\\\\":\\\\\"30001\\\\\",\\\\\"port\\\\\":80,\\\\\"protocol\\\\\":\\\\\"TCP\\\\\",\\\\\"targetPort\\\\\":80}|...\",\"reason\":\"BadRequest\",\"code\":400}\\n'", "reason": "Bad Request", "status": 400}

I also tried casting to int with "{{workshop_id | int}}", but received the same error:

    TASK [resources : Create a nodeport service - Workshop's Resources] **********
    task path: /opt/ansible/roles/resources/tasks/main.yml:11
    fatal: [localhost]: FAILED! => {"changed": false, "error": 400, "msg": "Failed to create object: b'{\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metadata\":{},\"status\":\"Failure\",\"message\":\"Service in version \\\\\"v1\\\\\" cannot be handled as a Service: v1.Service.Spec: v1.ServiceSpec.Ports: []v1.ServicePort: v1.ServicePort.NodePort: readUint32: unexpected character: \\\\ufffd, error found in #10 byte of ...|odePort\\\\\":\\\\\"30001\\\\\",\\\\\"po|..., bigger context ...|26-a1189a80de41\\\\\"}]},\\\\\"spec\\\\\":{\\\\\"ports\\\\\":[{\\\\\"nodePort\\\\\":\\\\\"30001\\\\\",\\\\\"port\\\\\":80,\\\\\"protocol\\\\\":\\\\\"TCP\\\\\",\\\\\"targetPort\\\\\":80}|...\",\"reason\":\"BadRequest\",\"code\":400}\\n'", "reason": "Bad Request", "status": 400}
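The "bigger context" in the error quotes the request body, and there nodePort appears as "30001" in quotes, while port and targetPort are bare numbers. That suggests the value reached the API server as a JSON string rather than a number. A minimal sketch of the difference, using only Python's standard library (the function name service_port_payload is made up for illustration, and this is not the code path the k8s module actually uses):

```python
import json


def service_port_payload(node_port):
    """Serialize one ports entry shaped like the one quoted in the error."""
    entry = {"nodePort": node_port, "port": 80, "protocol": "TCP", "targetPort": 80}
    return json.dumps(entry)


# A templated value that ended up as text serializes with quotes.
as_string = service_port_payload("30001")
# A real integer serializes as a bare JSON number.
as_int = service_port_payload(30001)

print(as_string)
print(as_int)
```

Only the second form matches the unquoted integer that the readUint32 parser in the error message appears to expect.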

What did you expect to see?

Expected to see a service created, with the specified integer value passing resource creation validation. Jinja and Ansible variables are supposed to preserve their native definition type now, as of this: https://github.com/pallets/jinja/pull/708

What did you see instead? Under which circumstances?

Failed builds due to the same 400 Bad Request error shown above, raised by the "Create a nodeport service - Workshop's Resources" task.

Environment

Possible Solution

Tried casting to int with "int" filter. Tried looking for configuration files to enable jinja NativeTypes. Hard-coding any value works fine.

Additional context

These sources show that significant effort was put into solving this exact issue, yet the solutions don't seem to work for me. https://github.com/pallets/jinja/pull/708 https://jinja.palletsprojects.com/en/2.10.x/nativetypes/

Also, here's my CRD validation spec for Workshop:

    validation:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            required:

Nik-Novak commented 4 years ago

Other people are also able to reproduce this

camilamacedo86 commented 4 years ago

I think it is related to how you are creating the resource. Here is an example that works in Go:

    Ports: []corev1.ServicePort{
        {
            Name: db.Name,
            TargetPort: intstr.IntOrString{
                Type:   intstr.Int,
                IntVal: db.Spec.DatabasePort,
            },
            Port:     db.Spec.DatabasePort,
            Protocol: "TCP",
        },
    },

You need to define the TargetPort.Type.

If that does not solve the issue, could you please share the code for your "Create a nodeport service" task?

Nik-Novak commented 4 years ago

I have a hunch the issue is that Jinja returns a Python-native integer type which somehow is not translating at the Kubernetes API level.

Regardless here's an operator with all of the relevant code: https://drive.google.com/file/d/1cYlu1ty-zn7SNv6-QOqTNiJN05W8l5Mn/view?usp=sharing

See roles/resources/templates/svc-gmeclient.yaml.j2 for the relevant problem.

I've made an example using another project, since I could not share the Workshop code. Confirmed that this also has the same issue.

Will update when I try to define port types

camilamacedo86 commented 4 years ago

Hi @Nik-Novak,

Thank you for the data provided, but it is a big project that we have no knowledge of. Note that the error shows that you are not creating the resource correctly. IMHO, it shows that you are not defining the TargetPort.Type.

So, could you please post the code snippet that you are using to create the resource, and/or share a POC based on a new project built from scratch, or the steps required for us to reproduce it?

Nik-Novak commented 4 years ago

I'm putting together a bare-bones operator with a script to recreate the issue. Will post in about 15 minutes.

Nik-Novak commented 4 years ago

We figured out the issue here: https://kubernetes.slack.com/archives/CAW0GV7A5/p1582893529107700

It has to do with Ansible vs. Jinja being used to source templated variables. Thanks to @fabianvf.

It's not necessarily a bug, but it is a very misleading implementation of Jinja and Ansible variable sourcing.
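The Slack discussion is not reproduced in this thread, so the following is only one reading that is consistent with the symptoms and with the jinja2_native fix reported further down. Inside the expression, type_debug sees an int, but classic rendering hands the finished result back as text, so the value that reaches the module can still be a string. Jinja's native mode instead tries to turn the rendered text back into a Python literal. A rough stand-in using only the standard library (render_default and render_native are made-up names, not Jinja or Ansible functions):

```python
import ast


def render_default(value):
    """Classic rendering: the output is always text."""
    return str(value)


def render_native(value):
    """Native-style rendering: render to text, then parse it back as a literal."""
    text = str(value)
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Not a Python literal (for example a plain word): keep the text.
        return text


workshop_id = 30001
print(type(render_default(workshop_id)).__name__)  # str
print(type(render_native(workshop_id)).__name__)   # int
```

Under this reading, adding the int filter changes nothing in classic mode, because the conversion back to text happens after the filter has run.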

Rylon commented 3 years ago

@Nik-Novak any chance you could share the discussion from that Slack link please? I don't have access to it, nor does it seem that I can sign up for access.

I've got the same problem, and I'm not sure whether it's possible to work around it.

jagibson commented 2 years ago

@Nik-Novak seconding @Rylon - can you share? Unable to view the kube slack archive.

jagibson commented 2 years ago

To save future readers time:

Setting jinja2_native = True in ansible.cfg, or running Ansible like ANSIBLE_JINJA2_NATIVE=True ansible-playbook ..., works.

Trying to set:

  - mytask:
    environment:
      ANSIBLE_JINJA2_NATIVE: True

did not.

Reference - https://github.com/operator-framework/operator-sdk/issues/1701
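A likely reason the task-level attempt fails: the environment keyword sets variables for the task as it runs on the target, whereas templating happens in the Ansible controller process, which reads ANSIBLE_JINJA2_NATIVE from its own environment at startup. So the variable has to be present before ansible-playbook is launched. A minimal sketch of preparing that environment from a wrapper script (controller_env is a hypothetical helper and playbook.yml is a placeholder name):

```python
import os


def controller_env(base=None):
    """Return a copy of the environment with native Jinja types switched on."""
    env = dict(os.environ if base is None else base)
    env["ANSIBLE_JINJA2_NATIVE"] = "True"
    return env


# Hypothetical launch, left commented out so the sketch runs without Ansible:
# import subprocess
# subprocess.run(["ansible-playbook", "playbook.yml"], env=controller_env(), check=True)

print(controller_env({"PATH": "/usr/bin"}))
```

Setting jinja2_native = True in ansible.cfg has the same effect without a wrapper, which is probably the simpler option when the configuration file is under your control.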

This issue is extremely opaque and I spent a lot of time finding the solution. Maybe the docs could be better?