Open reddyblokesh opened 4 months ago
Hi @reddyblokesh thanks for reporting!
Be sure to check out the docs and the Contributing Guidelines while you wait for a human to take a look at this :slightly_smiling_face:
Cheers!
Additional information: we pre-built an image by editing the section "Base image for Alpine with NGINX Plus and FIPS" to add the lines below:
# forward request and error logs to docker log collector
&& ln -svf /dev/stdout /var/log/nginx/access.log \
&& ln -svf /dev/stderr /var/log/nginx/error.log
Verify.go:85] Unable to fetch version: error getting client: Get "http://config-version/configVersion": dial unix /var/lib/nginx/nginx-config-version.sock: connect: no such file or directory
Note: the 3.5.2 nginx-ingress image we currently use (pre-built using alpine-image-plus-fips) is stable and working fine. Please confirm which configuration changes we should be aware of when upgrading from 3.5.2 to 3.6.1.
Hi, I noticed that you were trying to use FIPS in 3.6.x. We are having some issues with FIPS image as stated in our release notes https://docs.nginx.com/nginx-ingress-controller/releases/#361 and release logs https://github.com/nginxinc/kubernetes-ingress/releases/tag/v3.6.1 Can you try to pull the 3.6.1 FIPS image from our registry directly and see if it works for you? If our image works for you, but you would like to use your customized version, maybe you could try building a new image with ours as base, like this:
FROM <the published 3.6.1 image>
USER root
RUN ln -svf /dev/stdout /var/log/nginx/access.log \
&& ln -svf /dev/stderr /var/log/nginx/error.log
USER 101
Hello @haywoodsh: yes, we are trying to use FIPS in 3.6.x, and also in 3.5.2. To clarify: after building an image with make alpine-image-plus-fips PREFIX=nginx-ingress TARGET=container TAG=3.6.1_${sha1}, do we just add the steps below?
FROM <the published 3.6.1 image>
USER root
RUN ln -svf /dev/stdout /var/log/nginx/access.log \
&& ln -svf /dev/stderr /var/log/nginx/error.log
USER 101
Never mind @haywoodsh: understood, and it is now working. The issue is with building an image directly from GitHub (https://github.com/nginxinc/kubernetes-ingress at tag 3.6.0 or 3.6.1); pulling the image from the private NGINX registry instead works. Please let us know when this will be fixed in the repository.
Update: we observe that the 3.6.0 image from the private NGINX registry works, but 3.6.1 does not.
Following the update in https://github.com/nginxinc/kubernetes-ingress/issues/5981, we were told that building a FIPS-enabled image directly from the repository (https://github.com/nginxinc/kubernetes-ingress) is broken, and that we should instead pull the FIPS image from the private NGINX registry (private-registry.nginx.com) in a Dockerfile: use FROM to pull the image from private-registry.nginx.com, then add the customization, in particular the stdout/stderr links for /var/log/nginx. In our verification, 3.6.0 works but 3.6.1 does not. We tried both building the image from the repository and pulling it from private-registry.nginx.com.
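The Dockerfile approach described above can be scripted so that the same customization is applied to more than one published tag, which makes a 3.6.0 versus 3.6.1 comparison easier to repeat. This is a minimal sketch, assuming a POSIX shell; the function name write_custom_dockerfile and the output file name are hypothetical, and the target tag in the usage comment is a placeholder.

```shell
# Write a Dockerfile that layers the log symlinks on top of a published image.
# $1: base image reference (required)
# $2: output path (optional, defaults to ./Dockerfile)
# Hypothetical helper, not part of the kubernetes-ingress repository.
write_custom_dockerfile() {
  base_image=$1
  out_file=${2:-Dockerfile}
  # Unquoted heredoc: ${base_image} is expanded, "\\" becomes a single "\".
  cat > "$out_file" <<EOF
FROM ${base_image}
USER root
# forward request and error logs to docker log collector
RUN ln -svf /dev/stdout /var/log/nginx/access.log \\
 && ln -svf /dev/stderr /var/log/nginx/error.log
USER 101
EOF
}

# Usage (the -t value is a placeholder):
#   write_custom_dockerfile private-registry.nginx.com/nginx-ic/nginx-plus-ingress:3.6.1-alpine-fips Dockerfile.361
#   docker build -f Dockerfile.361 -t <your-registry>/<your-tag> .
```

A later comment in this thread reports that dropping the final USER 101 line made the customized image work in that environment, so that line may need to be removed when reproducing their setup.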
I managed to deploy the 3.6.1 FIPS image private-registry.nginx.com/nginx-ic/nginx-plus-ingress:3.6.1-alpine-fips
on my local cluster. Below are my deployment specifications and logs. Could you share yours as well? Additionally, have you tried to deploy the published image directly, rather than as a base image for customization?
kubectl describe deployment -n nginx-ingress nginx-ingress
Name: nginx-ingress
Namespace: nginx-ingress
CreationTimestamp: Tue, 09 Jul 2024 11:51:08 +0100
Labels: <none>
Annotations: deployment.kubernetes.io/revision: 3
Selector: app=nginx-ingress
Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=nginx-ingress
app.kubernetes.io/name=nginx-ingress
Annotations: prometheus.io/port: 9113
prometheus.io/scheme: http
prometheus.io/scrape: true
Service Account: nginx-ingress
Containers:
nginx-plus-ingress:
Image: private-registry.nginx.com/nginx-ic/nginx-plus-ingress:3.6.1-alpine-fips
Ports: 80/TCP, 443/TCP, 8081/TCP, 9113/TCP, 9114/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
Args:
-nginx-plus
-nginx-configmaps=$(POD_NAMESPACE)/nginx-config
Requests:
cpu: 100m
memory: 128Mi
Readiness: http-get http://:readiness-port/nginx-ready delay=0s timeout=1s period=1s #success=1 #failure=3
Environment:
POD_NAMESPACE: (v1:metadata.namespace)
POD_NAME: (v1:metadata.name)
Mounts: <none>
Volumes: <none>
Node-Selectors: <none>
Tolerations: <none>
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets: nginx-ingress-7479879679 (0/0 replicas created), nginx-ingress-9fd5547c8 (0/0 replicas created)
NewReplicaSet: nginx-ingress-79c6b5f9b5 (1/1 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 7m40s deployment-controller Scaled up replica set nginx-ingress-7479879679 to 1
Normal ScalingReplicaSet 5m6s deployment-controller Scaled up replica set nginx-ingress-9fd5547c8 to 1
Normal ScalingReplicaSet 5m5s deployment-controller Scaled down replica set nginx-ingress-7479879679 to 0 from 1
Normal ScalingReplicaSet 3m40s deployment-controller Scaled up replica set nginx-ingress-79c6b5f9b5 to 1
Normal ScalingReplicaSet 3m39s deployment-controller Scaled down replica set nginx-ingress-9fd5547c8 to 0 from 1
kubectl logs -f -n nginx-ingress nginx-ingress-79c6b5f9b5-d4jvq
NGINX Ingress Controller Version=3.6.1 Commit=aec5debf08c140a8d5d97f3fc596061aa756e9b0 Date=2024-07-04T08:41:26Z DirtyState=false Arch=linux/arm64 Go=go1.22.5
I0709 10:55:08.538374 1 flags.go:321] Starting with flags: ["-nginx-plus" "-nginx-configmaps=nginx-ingress/nginx-config"]
I0709 10:55:08.542246 1 main.go:292] Kubernetes version: 1.28.8
I0709 10:55:08.546405 1 main.go:437] Using nginx version: nginx/1.25.5 (nginx-plus-r32)
I0709 10:55:08.556058 1 main.go:868] Pod label updated: nginx-ingress-79c6b5f9b5-d4jvq
2024/07/09 10:55:08 [notice] 18#18: using the "epoll" event method
2024/07/09 10:55:08 [notice] 18#18: OpenSSL FIPS Mode is enabled
2024/07/09 10:55:08 [notice] 18#18: nginx/1.25.5 (nginx-plus-r32)
2024/07/09 10:55:08 [notice] 18#18: built by gcc 13.2.1 20231014 (Alpine 13.2.1_git20231014)
2024/07/09 10:55:08 [notice] 18#18: OS: Linux 6.6.31-linuxkit
2024/07/09 10:55:08 [notice] 18#18: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024/07/09 10:55:08 [notice] 18#18: start worker processes
2024/07/09 10:55:08 [notice] 18#18: start worker process 19
2024/07/09 10:55:08 [notice] 18#18: start worker process 20
2024/07/09 10:55:08 [notice] 18#18: start worker process 21
2024/07/09 10:55:08 [notice] 18#18: start worker process 22
2024/07/09 10:55:08 [notice] 18#18: start worker process 23
Hello @haywoodsh: here is what we did:
FROM private-registry.nginx.com/nginx-ic/nginx-plus-ingress:3.6.1-alpine-fips
USER root
RUN ln -svf /dev/stdout /var/log/nginx/access.log \
&& ln -svf /dev/stderr /var/log/nginx/error.log
USER 101
And we encountered the error below.
Verify.go:85] Unable to fetch version: error getting client: Get "http://config-version/configVersion": dial unix /var/lib/nginx/nginx-config-version.sock: connect: no such file or directory
Regarding the Helm chart: we used the 3.6.1 chart (without CRDs) with the same overlay we use for 3.5.2, which is currently working. Today we tried 3.6.0 and it works well. The same configuration that works for 3.5.2 and 3.6.0 did not work for 3.6.1 (Helm chart 3.6.1 + published image). Below is an excerpt of the log.
NGINX Ingress Controller Version=3.6.1 Commit=67ef4d92fae250fc916f4de5bd667db76551958e Date=2024-06-26T08:09:33Z DirtyState=true Arch=linux/amd64 Go=go1.22.4
I0708 06:49:21.793291 1 flags.go:321] Starting with flags: ["-nginx-plus=true" "-nginx-reload-timeout=60000" "-enable-app-protect=false" "-enable-app-protect-dos=false" "-nginx-configmaps=nginx-dev/nginx-ingress-dev" "-default-server-tls-secret=nginx-dev/nginx-ingress-dev-default-server-tls" "-ingress-class=nginx-dev" "-watch-namespace=nginx-dev" "-health-status=true" "-health-status-uri=/_nginx-health" "-nginx-debug=false" "-v=1" "-nginx-status=true" "-nginx-status-port=8080" "-nginx-status-allow-cidrs=127.0.0.1" "-report-ingress-status" "-external-service=nginx-ingress-dev-controller" "-enable-leader-election=true" "-leader-election-lock-name=nginx-ingress-leader" "-enable-prometheus-metrics=true" "-prometheus-metrics-listen-port=9113" "-prometheus-tls-secret=" "-enable-service-insight=false" "-service-insight-listen-port=9114" "-service-insight-tls-secret=" "-enable-custom-resources=false" "-enable-snippets=true" "-include-year=false" "-disable-ipv6=false" "-ready-status=true" "-ready-status-port=8081" "-enable-latency-metrics=true" "-ssl-dynamic-reload=true" "-enable-telemetry-reporting=false" "-weight-changes-dynamic-reload=true"]
I0708 06:49:21.793362 1 flags.go:337] Namespaces watched: [nginx-dev]
I0708 06:49:21.804208 1 main.go:292] Kubernetes version: 1.26.13
I0708 06:49:21.817067 1 main.go:437] Using nginx version: nginx/1.25.5 (nginx-plus-r32)
I0708 06:49:22.032810 1 main.go:868] Pod label updated: nginx-ingress-dev-controller-f448847fd-cb999
I0708 06:49:22.032810 1 Verify.go:85] Unable to fetch version: error getting client: Get "http://config-version/configVersion": dial unix /var/lib/nginx/nginx-config-version.sock: connect: no such file or directory
The NGINX team has been taking a deeper look at this, and it is related to an OpenSSL update that was first picked up in Alpine.
As noted, paid customers have access to container images that are built by the NGINX Ingress Controller team.
These images have been specifically modified to remove the breaking OpenSSL change and are available now.
https://github.com/alpinelinux/docker-alpine/issues/406
According to https://github.com/openssl/openssl/issues/24826 it appears that a fix is coming from OpenSSL
I was not able to reproduce the error. I built nginx-plus-ingress:3.6.1-alpine-fips-custom-log
from our official image private-registry.nginx.com/nginx-ic/nginx-plus-ingress:3.6.1-alpine-fips
with the log changes, and it runs fine on my local cluster. @alnhk Did the unmodified official image work for you, or did it display the same error message as well?
Name: nginx-ingress-78c84496bb-n48bl
Namespace: nginx-ingress
Priority: 0
Service Account: nginx-ingress
Node: k3d-gitops-server-0/172.18.0.3
Start Time: Wed, 10 Jul 2024 12:13:20 +0100
Labels: app=nginx-ingress
app.kubernetes.io/name=nginx-ingress
app.kubernetes.io/version=3.6.1
app.nginx.org/version=1.25.5-nginx-plus-r32
pod-template-hash=78c84496bb
Annotations: prometheus.io/port: 9113
prometheus.io/scheme: http
prometheus.io/scrape: true
Status: Running
SeccompProfile: RuntimeDefault
IP: 10.42.0.182
IPs:
IP: 10.42.0.182
Controlled By: ReplicaSet/nginx-ingress-78c84496bb
Containers:
nginx-plus-ingress:
Container ID: containerd://97da4384bfcdac52af09de58188cd11432031244a8c5f8d0828dd94f7ac72552
Image: private-registry.nginx.com/nginx-ic/nginx-plus-ingress:3.6.1-alpine-fips-custom-log
Image ID: sha256:668e91fce4ab70509ba1f20e4aeedd5d5c39c3679333df49782bc158b8edf945
Ports: 80/TCP, 443/TCP, 8081/TCP, 9113/TCP, 9114/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
Args:
-nginx-plus
-nginx-configmaps=$(POD_NAMESPACE)/nginx-config
-nginx-reload-timeout=60000
-weight-changes-dynamic-reload=true
-enable-oidc
State: Running
Started: Wed, 10 Jul 2024 12:13:21 +0100
Ready: True
Restart Count: 0
Requests:
cpu: 100m
memory: 128Mi
Readiness: http-get http://:readiness-port/nginx-ready delay=0s timeout=1s period=1s #success=1 #failure=3
Environment:
POD_NAMESPACE: nginx-ingress (v1:metadata.namespace)
POD_NAME: nginx-ingress-78c84496bb-n48bl (v1:metadata.name)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-t65c5 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-t65c5:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 9m28s default-scheduler Successfully assigned nginx-ingress/nginx-ingress-78c84496bb-n48bl to k3d-gitops-server-0
Normal Pulled 9m28s kubelet Container image "private-registry.nginx.com/nginx-ic/nginx-plus-ingress:3.6.1-alpine-fips-custom-log" already present on machine
Normal Created 9m28s kubelet Created container nginx-plus-ingress
Normal Started 9m28s kubelet Started container nginx-plus-ingress
NGINX Ingress Controller Version=3.6.1 Commit=aec5debf08c140a8d5d97f3fc596061aa756e9b0 Date=2024-07-04T08:41:26Z DirtyState=false Arch=linux/arm64 Go=go1.22.5
I0710 13:02:24.193447 1 flags.go:321] Starting with flags: ["-nginx-plus" "-nginx-configmaps=nginx-ingress/nginx-config" "-nginx-reload-timeout=60000" "-weight-changes-dynamic-reload=true"]
I0710 13:02:24.200757 1 main.go:292] Kubernetes version: 1.28.8
I0710 13:02:24.231366 1 main.go:437] Using nginx version: nginx/1.25.5 (nginx-plus-r32)
I0710 13:02:24.242747 1 main.go:868] Pod label updated: nginx-ingress-5c45cd989b-svlvp
2024/07/10 13:02:24 [notice] 19#19: using the "epoll" event method
2024/07/10 13:02:24 [notice] 19#19: OpenSSL FIPS Mode is enabled
2024/07/10 13:02:24 [notice] 19#19: nginx/1.25.5 (nginx-plus-r32)
2024/07/10 13:02:24 [notice] 19#19: built by gcc 13.2.1 20231014 (Alpine 13.2.1_git20231014)
2024/07/10 13:02:24 [notice] 19#19: OS: Linux 6.6.31-linuxkit
2024/07/10 13:02:24 [notice] 19#19: getrlimit(RLIMIT_NOFILE): 1048576:1048576
Hello @haywoodsh: regarding "Did the unmodified official image work for you": I tried this and got the result below. In any case, we will wait for confirmation as per @brianehlert.
https://github.com/nginxinc/kubernetes-ingress/issues/5981#issuecomment-2217743753
@alnhk yes this means the published official one is working as expected, which you can use if that works for you 👍🏼 @haywoodsh also had the modified one working as you see in the log above. https://github.com/openssl/openssl/issues/24826#issuecomment-2220416732
@vepatel @haywoodsh: after spending some time researching why it works for you and not for us, we observed that after removing the last line, USER 101, it works perfectly. After it is deployed, we can see that user 101 is still being used in the Kubernetes pod.
⎈|example.com:acme-example-com)]# k exec -it nginx-ingress-dev-controller-64cdd8d858-nhmkp -- whoami
101
Interesting, @reddyblokesh let us know if above works for you, we'll keep this open for visibility until openssl issue is resolved @shaun-nx
k exec -it nginx-ingress-test-controller-6888b57b67-h45xc -- whoami 101
whoami: extra operand ‘101’
Try 'whoami --help' for more information.
command terminated with exit code 1
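The "extra operand" message above is what whoami prints when it is given an argument. The 101 in the earlier comment appears to be the command's output rather than part of the command, although that is an inference from the pasted text. A small sketch of an equivalent check using id -u, which prints the numeric UID; the function name runs_as_uid is hypothetical:

```shell
# Succeeds when the current process runs as the expected numeric UID.
# $1: expected UID, for example 101
# Hypothetical helper for illustration only.
runs_as_uid() {
  [ "$(id -u)" = "$1" ]
}

# Inside a pod, the same check can be made without any operand to whoami:
#   kubectl exec -it <pod-name> -- whoami
#   kubectl exec -it <pod-name> -- id -u
```

Using id -u avoids depending on whether UID 101 has a user name entry in the image.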
I just tried the same 3.6.1 charts with the same image we use in prod. Same charts and same image, but it fails again, so the issue is intermittent. Why? What is causing it? We need to understand the root cause.
I0918 05:44:42.226666 1 verify.go:85] Unable to fetch version: error getting client: Get "http://config-version/configVersion": dial unix /var/lib/nginx/nginx-config-version.sock: connect: no such file or directory
I0918 05:44:42.226717 1 verify.go:85] Unable to fetch version: error getting client: Get "http://config-version/configVersion": dial unix /var/lib/nginx/nginx-config-version.sock: connect: no such file or directory
I0918 05:44:42.226763 1 verify.go:85] Unable to fetch version: error getting client: Get "http://config-version/configVersion": dial unix /var/lib/nginx/nginx-config-version.sock: connect: no such file or directory
I0918 05:44:42.226807 1 verify.go:85] Unable to fetch version: error getting client: Get "http://config-version/configVersion": dial unix /var/lib/nginx/nginx-config-version.sock: connect: no such file or directory
I0918 05:44:42.226852 1 verify.go:85] Unable to fetch version: error getting client: Get "http://config-version/configVersion": dial unix /var/lib/nginx/nginx-config-version.sock: connect: no such file or directory
I0918 05:44:42.226899 1 verify.go:85] Unable to fetch version: error getting client: Get "http://config-version/configVersion": dial unix /var/lib/nginx/nginx-config-version.sock: connect: no such file or directory
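When comparing a failing pod against a working one, it can help to count how often this message appears and to check whether the socket file exists at all. The following is a minimal sketch, assuming a POSIX shell with grep; the function names count_socket_errors and is_unix_socket are hypothetical, and the socket path is the one quoted in the log above.

```shell
# Count log lines that report the missing config-version socket.
# Reads log text on stdin and prints the number of matching lines.
count_socket_errors() {
  # grep -c exits non-zero when the count is 0, so mask the status.
  grep -cF 'nginx-config-version.sock: connect: no such file or directory' || true
}

# Succeeds when the given path exists and is a unix domain socket.
is_unix_socket() {
  [ -S "$1" ]
}

# Usage:
#   kubectl logs -n <namespace> <pod-name> | count_socket_errors
#   kubectl exec -n <namespace> <pod-name> -- ls -l /var/lib/nginx/
```

A non-zero count together with a missing socket would be consistent with NGINX itself not having started, but this sketch only reports symptoms and does not identify the root cause.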
Expected behavior: no errors should be logged; instead, NGINX should be reloaded.