Open ansvu opened 4 months ago
Hi @ansvu, what version of operator-sdk is being used? There were many issues in the 1.34
series of both operator-sdk and the ansible plugin. I'd consider anything other than the latest
of both to have potential issues. Can they test with 1.35.0 of operator-sdk? It contains 1.34.3 of the ansible plugin.
Thanks @acornett21 for your info. They used the ansible-operator version from the community. So you meant this version quay.io/operator-framework/ansible-operator:v1.34.3?
@ansvu Are they just updating the image and not updating the version of the binary needed to scaffold/build a project?
@acornett21 This CNF is a little bit special: they combined a Helm chart with ansible-operator, and they used the ansible-operator version straight from quay.io/operator-framework/ansible-operator. There is no OLM integration. It is designed and architected not only for OCP but for other Kubernetes clusters as well.
@ansvu I understand, but they must have some YAML manifests that go along with the ansible operator. So wouldn't they be using the operator-sdk to build/bundle/etc. those manifests? If so, those versions should be in sync.
Hi @acornett21, as far as I know they don't use operator-sdk to build the ansible-operator image (bundle/etc.); they use the ansible-operator image from quay.io/operator-framework/ansible-operator. We have asked them what they maintain/modify in the build/bundle/manifests and how. We just noticed that version 1.35.0 (quay.io/operator-framework/ansible-operator:v1.35.0) was built 2 hours ago. Can they try testing this version 1.35.0? Thanks.
Hi @acornett21, they tested version v1.35.0 under the following condition (an unavailable apiservice):
kubectl get apiservice | grep False
v1alpha1.example.com try/api False (ServiceNotFound) 27m
The result is the same error as in version v1.34.0:
2024-07-24 15:29:13,913 p=3470 u=ansible n=ansible | TASK [cnf_status : Store CNF status and data] **********************************
2024-07-24 15:29:13,913 p=3470 u=ansible n=ansible | fatal: [localhost]: FAILED! => {"changed": false, "msg": "Failed to create object: b'Unable to determine if virtual resource\\n'", "reason": "Internal Server Error"}
2024-07-24 15:29:13,914 p=3470 u=ansible n=ansible | PLAY RECAP
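The `kubectl get apiservice | grep False` check above can also be scripted against the JSON output. The following is a minimal sketch, assuming input shaped like `kubectl get apiservice -o json`; the function name `unavailable_apiservices` and the embedded sample data are hypothetical and only illustrate the idea.

```python
import json


def unavailable_apiservices(apiservice_list):
    """Return (name, reason) for each APIService whose Available condition is not True."""
    results = []
    for item in apiservice_list.get("items", []):
        name = item["metadata"]["name"]
        conditions = item.get("status", {}).get("conditions", [])
        # APIService status carries a condition of type "Available".
        available = next((c for c in conditions if c.get("type") == "Available"), None)
        if available is None:
            results.append((name, "Unknown"))
        elif available.get("status") != "True":
            results.append((name, available.get("reason", "Unknown")))
    return results


# Hypothetical sample, shaped like `kubectl get apiservice -o json` output.
SAMPLE = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "v1.apps"},
      "status": {"conditions": [
        {"type": "Available", "status": "True", "reason": "Local"}
      ]}
    },
    {
      "metadata": {"name": "v1alpha1.example.com"},
      "status": {"conditions": [
        {"type": "Available", "status": "False", "reason": "ServiceNotFound"}
      ]}
    }
  ]
}
""")

if __name__ == "__main__":
    for name, reason in unavailable_apiservices(SAMPLE):
        print(f"{name}\t{reason}")
```

In practice the sample would be replaced by the real cluster output, for example by reading `kubectl get apiservice -o json` from standard input with `json.load(sys.stdin)`.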
Hi @acornett21, do you have any suggestions on the above test result?
Hey folks, just wanted to add more information here. To me, it would seem like https://github.com/operator-framework/operator-sdk/pull/6222 is a potential fix to this problem, given the error comes from that proxy code.
Granted, this has since moved to this repo, so the equivalent would be here: https://github.com/operator-framework/ansible-operator-plugins/blob/main/internal/ansible/proxy/inject_owner.go#L86-L96
From what I can tell, #6222 stalled because a proper test case wasn't found. Based on my testing, you can just stand up an APIService with an invalid service reference and it should immediately trigger this issue.
E.g.
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1alpha1.example.com
spec:
  caBundle: 'Zm9vCg=='
  group: example.com
  groupPriorityMinimum: 1000
  service:
    name: example-api
    namespace: non-existent
    port: 443
  version: v1alpha1
  versionPriority: 15
The APIServer accepts this, but it immediately becomes unavailable because the underlying service is not found.
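For anyone who wants to turn this into a repeatable test case, here is a rough sketch that generates the same manifest as JSON, which `kubectl apply` also accepts. The helper name `broken_apiservice` is hypothetical; the field values are copied from the manifest above, and the namespace is assumed not to exist in the target cluster.

```python
import json


def broken_apiservice(group, version, service_name, namespace):
    """Build an APIService manifest that points at a service assumed not to exist."""
    return {
        "apiVersion": "apiregistration.k8s.io/v1",
        "kind": "APIService",
        # APIService names follow the "<version>.<group>" convention.
        "metadata": {"name": f"{version}.{group}"},
        "spec": {
            "caBundle": "Zm9vCg==",  # placeholder value taken from the manifest above
            "group": group,
            "groupPriorityMinimum": 1000,
            "service": {
                "name": service_name,
                "namespace": namespace,
                "port": 443,
            },
            "version": version,
            "versionPriority": 15,
        },
    }


MANIFEST = broken_apiservice("example.com", "v1alpha1", "example-api", "non-existent")

if __name__ == "__main__":
    # Print JSON so the output can be piped into `kubectl apply -f -`.
    print(json.dumps(MANIFEST, indent=2))
```

Saved under a name of your choosing (say `make_apiservice.py`), the output can be piped into `kubectl apply -f -` on a scratch cluster, after which the apiservice should show up as unavailable as described above.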
I'll leave it up to maintainers what they want to do with this information, or if they want to take #6222 and replicate it over in the ansible-operator-plugins repository.
Type of question
Question
What did you do?
A partner is using ansible operator 1.34-2. When they tried to deploy their CNF, the following error occurred.
This is the API being called from ansible code:
They noticed these two apiservices are in False or FailedDiscoveryCheck state:
If they removed these two apiservices then the CNF deployment worked fine.
They said that they did not observe any error in ansible-operator v1.31 when some apiservices were in False state. Are there any new changes in ansible-operator v1.34.2 that triggered this issue? Do all apiservices now need to be in True state?
What did you expect to see?
CNF to be deployed without this error
"Failed to create object: b'Unable to determine if virtual resource
What did you see instead? Under which circumstances?
Any Ansible task that the operator runs through the Ansible k8s module throws the error.
Environment
Operator type:
ansible-operator 1.34-2
Kubernetes cluster type:
Google GKE
$ operator-sdk version
ansible-operator 1.34-2
$ go version (if language is Go)
NA
$ kubectl version
v1.29.3
Additional context
Some existing issues have been reported, but there is no solution other than the advice to fix the cluster health or remove the apiservices.
https://access.redhat.com/solutions/6813781
https://bugzilla.redhat.com/show_bug.cgi?id=2063774
https://github.com/operator-framework/operator-sdk/issues/5596
https://github.com/operator-framework/operator-sdk/pull/6222