Closed suuugeee closed 2 years ago
@caigy The official operator image may need to be updated.
@z2289181978 As a temporary solution, you can execute make docker-build
to generate a local image, instead of using the official image
Thanks bro, I'll try what you said.
@gobbq bro, after I ran "make docker-build", it still reports "Error: failed to start container "manager": Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "/manager": stat /manager: no such file or directory: unknown"
@gobbq
"apacherocketmq/rocketmq-operator:0.3.0-snapshot" is the image I generated by "make docker-build", but it doesn't work as expected.
@caigy
@suuugeee I've tried make docker-build IMG=apacherocketmq/rocketmq-operator:0.3.0-snapshot and then deployed it, but couldn't reproduce the issue. Could you provide more detailed information about your operation?
/manager was not found in your container; please enter your container to check whether it exists. Also, confirm that the following messages were printed when the operator image was built:
Step 19/21 : COPY --from=builder /workspace/manager .
---> 854f7b6b7af5
Step 20/21 : USER 65532:65532
---> Running in eddefca4e329
Removing intermediate container eddefca4e329
---> 57456f64ede1
Step 21/21 : ENTRYPOINT ["/manager"]
---> Running in f75b5644e0a8
Removing intermediate container f75b5644e0a8
---> 1e53679155f9
Successfully built 1e53679155f9
Successfully tagged apacherocketmq/rocketmq-operator:0.3.0-snapshot
Please also post the output of docker info.
The environment I deployed:
=============================================
I ran the following commands and got a successful result:
make docker-build IMG=apacherocketmq/rocketmq-operator:0.3.0-snapshot
kubectl create -f deploy/service_account.yaml
kubectl create -f deploy/role.yaml
kubectl create -f deploy/role_binding.yaml
kubectl create -f deploy/operator.yaml
@caigy
Docker: 20.10.17, OS: Alibaba Cloud Linux 3, kernel version: 5.10.84-10.4.al8.x86_64, CPU architecture: x86_64
@suuugeee Could you enter the operator container to check whether /manager exists?
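A pod that fails with RunContainerError cannot be entered with kubectl exec, so one way to do this check is to start the image with its entrypoint overridden. The following is only a sketch: check_manager is a hypothetical helper name, and it assumes the image contains /bin/sh (the image here is built on openjdk:8-alpine).

```shell
#!/bin/sh
# Sketch only: check_manager is a hypothetical helper, not part of rocketmq-operator.
check_manager() {
  img="$1"
  # --entrypoint replaces ENTRYPOINT ["/manager"], so the container can start
  # even when /manager is missing. ls reports whether the file is present and
  # id shows which user the container runs as.
  docker run --rm --entrypoint /bin/sh "$img" -c 'ls -l /manager; id'
}
```

Usage would look like check_manager apacherocketmq/rocketmq-operator:0.3.0-snapshot, run on the node where the pod is scheduled so that the same local image is inspected.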
Same issue here.
Normal Scheduled 14s default-scheduler Successfully assigned default/rocketmq-operator-6f65c77c49-d488s to k8s-ycloud-worker192.168.101.182-dev
Normal AllocIPSucceed 14s terway-daemon Alloc IP 192.168.30.170/24
Normal Pulling 14s kubelet Pulling image "apacherocketmq/rocketmq-operator:0.3.0-snapshot"
Normal Pulled 1s kubelet Successfully pulled image "apacherocketmq/rocketmq-operator:0.3.0-snapshot" in 12.816837469s
Normal Created 1s kubelet Created container manager
Warning Failed 1s kubelet Error: failed to create containerd task: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "/manager": stat /manager: no such file or directory: unknown
bash-4.4$ ls
bin dev etc home lib media mnt opt proc root run sbin srv sys tmp usr var
============== in Alibaba Cloud Linux
@ccctask Thanks for your report. Please also post your docker info output. It seems that the issue has something to do with the OS or docker version.
BTW, was your operator image built in the same environment, or built on a machine with a different OS or docker version and then transferred to the environment where it is deployed?
host:
arch: amd64
buildahVersion: 1.23.1
cgroupControllers:
- cpuset
- cpu
- cpuacct
- blkio
- memory
- devices
- freezer
- net_cls
- perf_event
- net_prio
- hugetlb
- pids
- rdma
cgroupManager: systemd
cgroupVersion: v1
conmon:
package: conmon-2.0.32-1.al8.x86_64
path: /usr/bin/conmon
version: 'conmon version 2.0.32, commit: facef751f675b2441a0cf72606fe08a9110f8838'
cpus: 2
distribution:
distribution: '"alinux"'
version: "3"
eventLogger: file
hostname: iZt4ndjnm00f7wphn4wd6gZ
idMappings:
gidmap: null
uidmap: null
kernel: 5.10.112-11.al8.x86_64
linkmode: dynamic
logDriver: k8s-file
memFree: 1680097280
memTotal: 3906551808
ociRuntime:
name: runc
package: runc-1.0.3-1.al8.x86_64
path: /usr/bin/runc
version: |-
runc version 1.0.3
spec: 1.0.2-dev
go: go1.16.12
libseccomp: 2.5.1
os: linux
remoteSocket:
path: /run/podman/podman.sock
security:
apparmorEnabled: false
capabilities: CAP_NET_RAW,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
rootless: false
seccompEnabled: true
seccompProfilePath: /usr/share/containers/seccomp.json
selinuxEnabled: false
serviceIsRemote: false
slirp4netns:
executable: /usr/bin/slirp4netns
package: slirp4netns-1.1.8-1.1.al8.x86_64
version: |-
slirp4netns version 1.1.8
commit: d361001f495417b880f20329121e3aa431a8f90f
libslirp: 4.4.0
SLIRP_CONFIG_VERSION_MAX: 3
libseccomp: 2.5.1
swapFree: 0
swapTotal: 0
uptime: 1h 2m 46.96s (Approximately 0.04 days)
plugins:
log:
- k8s-file
- none
- journald
network:
- bridge
- macvlan
volume:
- local
registries:
search:
- registry.fedoraproject.org
- registry.access.redhat.com
- registry.centos.org
- docker.io
store:
configFile: /etc/containers/storage.conf
containerStore:
number: 2
paused: 0
running: 0
stopped: 2
graphDriverName: overlay
graphOptions:
overlay.mountopt: nodev,metacopy=on
graphRoot: /var/lib/containers/storage
graphStatus:
Backing Filesystem: extfs
Native Overlay Diff: "false"
Supports d_type: "true"
Using metacopy: "true"
imageStore:
number: 21
runRoot: /run/containers/storage
volumePath: /var/lib/containers/storage/volumes
version:
APIVersion: 3.4.2
Built: 1647932805
BuiltTime: Tue Mar 22 15:06:45 2022
GitCommit: ""
GoVersion: go1.16.12
OsArch: linux/amd64
Version: 3.4.2
@caigy It was built on the same OS with the docker runtime, and the cluster is set up with the containerd runtime. I have tried using docker as the runtime, but it didn't solve this problem.
@ccctask Thanks for your reply. I'll try to find an environment with Alibaba Cloud Linux and reproduce it. In the meantime, you can try removing L44 and L49 in the Dockerfile and building it again. I guess the problem may be caused by that user.
Deleting it doesn't seem to work either.
@ccctask @suuugeee Please try replacing USER 65532:65532 with USER root:root, rebuild the image and then check whether /manager can be found. IMO /manager should exist in the operator container (otherwise the docker build would fail); probably the file just can't be shown because of a privilege problem.
In my environment, /manager belongs to user root:
# docker run -it --entrypoint /bin/sh apacherocketmq/rocketmq-operator:0.3.0-snapshot
/ $ ls -al
total 49456
drwxr-xr-x 1 root root 4096 Jul 14 01:44 .
drwxr-xr-x 1 root root 4096 Jul 14 01:44 ..
-rwxr-xr-x 1 root root 0 Jul 14 01:44 .dockerenv
drwxr-xr-x 1 root root 4096 Jul 12 05:01 bin
drwxr-xr-x 5 root root 360 Jul 14 01:44 dev
drwxr-xr-x 1 root root 4096 Jul 14 01:44 etc
drwxr-xr-x 1 root root 4096 Jul 12 05:02 home
drwxr-xr-x 1 root root 4096 May 11 2019 lib
-rwxr-xr-x 1 root root 50576568 Jul 12 05:00 manager
drwxr-xr-x 5 root root 4096 May 9 2019 media
drwxr-xr-x 2 root root 4096 May 9 2019 mnt
drwxr-xr-x 2 root root 4096 May 9 2019 opt
dr-xr-xr-x 193 root root 0 Jul 14 01:44 proc
drwx------ 1 root root 4096 Jul 12 05:40 root
drwxr-xr-x 2 root root 4096 May 9 2019 run
drwxr-xr-x 2 root root 4096 May 9 2019 sbin
drwxr-xr-x 2 root root 4096 May 9 2019 srv
dr-xr-xr-x 13 root root 0 Jul 14 01:44 sys
drwxrwxrwt 2 root root 4096 May 9 2019 tmp
drwxr-xr-x 1 root root 4096 May 11 2019 usr
drwxr-xr-x 1 root root 4096 May 9 2019 var
There is no user with id 65532 in the operator container; this may be the cause. @gobbq
/ $ cat /etc/passwd
root:x:0:0:root:/root:/bin/ash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
news:x:9:13:news:/usr/lib/news:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucppublic:/sbin/nologin
operator:x:11:0:operator:/root:/bin/sh
man:x:13:15:man:/usr/man:/sbin/nologin
postmaster:x:14:12:postmaster:/var/spool/mail:/sbin/nologin
cron:x:16:16:cron:/var/spool/cron:/sbin/nologin
ftp:x:21:21::/var/lib/ftp:/sbin/nologin
sshd:x:22:22:sshd:/dev/null:/sbin/nologin
at:x:25:25:at:/var/spool/cron/atjobs:/sbin/nologin
squid:x:31:31:Squid:/var/cache/squid:/sbin/nologin
xfs:x:33:33:X Font Server:/etc/X11/fs:/sbin/nologin
games:x:35:35:games:/usr/games:/sbin/nologin
postgres:x:70:70::/var/lib/postgresql:/bin/sh
cyrus:x:85:12::/usr/cyrus:/sbin/nologin
vpopmail:x:89:89::/var/vpopmail:/sbin/nologin
ntp:x:123:123:NTP:/var/empty:/sbin/nologin
smmsp:x:209:209:smmsp:/var/spool/mqueue:/sbin/nologin
guest:x:405:100:guest:/dev/null:/sbin/nologin
nobody:x:65534:65534:nobody:/:/sbin/nologin
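The passwd listing above can also be checked mechanically. This is a minimal sketch with a hypothetical helper named uid_in_passwd; it only tests whether a numeric UID has an entry. A numeric USER in a Dockerfile does not generally need a passwd entry, so a missing entry supports the hypothesis above without proving it.

```shell
#!/bin/sh
# uid_in_passwd is a hypothetical helper: it reads passwd-format text on stdin
# and exits 0 only if some entry has the given numeric UID in field 3.
uid_in_passwd() {
  awk -F: -v uid="$1" '$3 == uid { found = 1 } END { exit !found }'
}
```

For example, the output of cat /etc/passwd from the container shown above can be piped into uid_in_passwd 65532; with that listing the helper exits non-zero, because the highest UID present is 65534 (nobody).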
- kubectl create -f deploy/operator.yaml
I tested your suggestion and it had no other effect; "manager" still disappears.
CASE 1: Docker: 19.03.15, OS: Alibaba Cloud Linux 3 (Soaring Falcon), Kernel: 5.10.84-10.2.al8.x86_64
On Alibaba Cloud ACK, the operator image built on the same environment is correct:
# docker run -it --entrypoint /bin/sh apacherocketmq/rocketmq-operator:0.3.0-snapshot
/ $ ls -al
total 49456
drwxr-xr-x 1 root root 4096 Jul 15 09:56 .
drwxr-xr-x 1 root root 4096 Jul 15 09:56 ..
-rwxr-xr-x 1 root root 0 Jul 15 09:56 .dockerenv
drwxr-xr-x 1 root root 4096 Jul 15 09:29 bin
drwxr-xr-x 5 root root 360 Jul 15 09:56 dev
drwxr-xr-x 1 root root 4096 Jul 15 09:56 etc
drwxr-xr-x 1 root root 4096 Jul 15 09:31 home
drwxr-xr-x 1 root root 4096 May 11 2019 lib
-rwxr-xr-x 1 root root 50576560 Jul 15 09:26 manager
drwxr-xr-x 5 root root 4096 May 9 2019 media
drwxr-xr-x 2 root root 4096 May 9 2019 mnt
drwxr-xr-x 2 root root 4096 May 9 2019 opt
dr-xr-xr-x 341 root root 0 Jul 15 09:56 proc
drwx------ 1 root root 4096 Jul 15 09:53 root
drwxr-xr-x 2 root root 4096 May 9 2019 run
drwxr-xr-x 2 root root 4096 May 9 2019 sbin
drwxr-xr-x 2 root root 4096 May 9 2019 srv
dr-xr-xr-x 13 root root 0 Jul 15 09:56 sys
drwxrwxrwt 2 root root 4096 May 9 2019 tmp
drwxr-xr-x 1 root root 4096 May 11 2019 usr
drwxr-xr-x 1 root root 4096 May 9 2019 var
Output of docker info:
# docker info
Client:
Debug Mode: false
Server:
Containers: 44
Running: 41
Paused: 0
Stopped: 3
Images: 42
Server Version: 19.03.15
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: false
Logging Driver: json-file
Cgroup Driver: systemd
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: d71fcd7d8303cbf684402823e425e9dd2e99285d
runc version: b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
init version: fec3683
Security Options:
seccomp
Profile: default
Kernel Version: 5.10.84-10.2.al8.x86_64
Operating System: Alibaba Cloud Linux 3 (Soaring Falcon)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 14.86GiB
Name: iZwz91yvcvq6jqu4j4qq3lZ
ID: KMVK:J4QP:YEFV:ZXHD:PYAJ:DZX5:T2YE:BWKX:TLBY:4YPN:A442:OWNX
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Registry Mirrors:
https://pqbap4ya.mirror.aliyuncs.com/
Live Restore Enabled: true
CASE 2: Docker: 20.10.17, OS: Alibaba Cloud Linux 3 (Soaring Falcon), Kernel: 5.10.112-11.al8.x86_64
[root@iZwz9c0gh4hpp8cwwuovsvZ rocketmq-operator-master]# docker run -it --entrypoint /bin/sh apacherocketmq/rocketmq-operator:0.3.0-snapshot
/ $ ls -al
total 49472
drwxr-xr-x 1 root root 4096 Jul 16 09:29 .
drwxr-xr-x 1 root root 4096 Jul 16 09:29 ..
-rwxr-xr-x 1 root root 0 Jul 16 09:29 .dockerenv
drwxr-xr-x 1 root root 4096 Jul 16 09:20 bin
drwxr-xr-x 5 root root 360 Jul 16 09:29 dev
drwxr-xr-x 1 root root 4096 Jul 16 09:29 etc
drwxr-xr-x 1 root root 4096 Jul 16 09:21 home
drwxr-xr-x 1 root root 4096 May 11 2019 lib
-rwxr-xr-x 1 root root 50573577 Jul 16 09:18 manager
drwxr-xr-x 5 root root 4096 May 9 2019 media
drwxr-xr-x 2 root root 4096 May 9 2019 mnt
drwxr-xr-x 2 root root 4096 May 9 2019 opt
dr-xr-xr-x 217 root root 0 Jul 16 09:29 proc
drwx------ 1 root root 4096 Jul 16 09:25 root
drwxr-xr-x 2 root root 4096 May 9 2019 run
drwxr-xr-x 2 root root 4096 May 9 2019 sbin
drwxr-xr-x 2 root root 4096 May 9 2019 srv
dr-xr-xr-x 13 root root 0 Jul 16 09:29 sys
drwxrwxrwt 2 root root 4096 May 9 2019 tmp
drwxr-xr-x 1 root root 4096 May 11 2019 usr
drwxr-xr-x 1 root root 4096 May 9 2019 var
Output of docker info:
[root@iZwz9c0gh4hpp8cwwuovsvZ rocketmq-operator-master]# docker info
Client:
Context: default
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Docker Buildx (Docker Inc., v0.8.2-docker)
scan: Docker Scan (Docker Inc., v0.17.0)
Server:
Containers: 1
Running: 0
Paused: 0
Stopped: 1
Images: 21
Server Version: 20.10.17
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: false
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
runc version: v1.1.2-0-ga916309
init version: de40ad0
Security Options:
seccomp
Profile: default
Kernel Version: 5.10.112-11.al8.x86_64
Operating System: Alibaba Cloud Linux 3 (Soaring Falcon)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.307GiB
Name: iZwz9c0gh4hpp8cwwuovsvZ
ID: 4WQY:WHTM:7DUQ:TAUH:64JT:UNTM:C62U:MJZB:QLZO:7LWL:NJLR:Z73O
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
CASE 3: Emulate Docker CLI using podman 3.4.2 (runc version 1.0.3), OS: Alibaba Cloud Linux 3 (Soaring Falcon), Kernel: 5.10.112-11.al8.x86_64
# docker run -it --entrypoint /bin/sh apacherocketmq/rocketmq-operator:0.3.0-snapshot
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
~ $ ls -al
total 49468
dr-xr-xr-x 1 root root 4096 Jul 16 10:38 .
dr-xr-xr-x 1 root root 4096 Jul 16 10:38 ..
drwxr-xr-x 1 root root 4096 Jul 16 10:28 bin
drwxr-xr-x 5 root root 360 Jul 16 10:38 dev
drwxr-xr-x 1 root root 4096 Jul 16 10:37 etc
drwxr-xr-x 1 root root 4096 Jul 16 10:31 home
drwxr-xr-x 1 root root 4096 May 11 2019 lib
-rwxr-xr-x 1 root root 50573577 Jul 16 10:24 manager
drwxr-xr-x 5 root root 4096 May 9 2019 media
drwxr-xr-x 2 root root 4096 May 9 2019 mnt
drwxr-xr-x 2 root root 4096 May 9 2019 opt
dr-xr-xr-x 201 root root 0 Jul 16 10:38 proc
drwx------ 1 root root 4096 Jul 16 10:37 root
drwxr-xr-x 1 root root 4096 Jul 16 10:25 run
drwxr-xr-x 2 root root 4096 May 9 2019 sbin
drwxr-xr-x 2 root root 4096 May 9 2019 srv
dr-xr-xr-x 13 root root 0 Jul 16 10:38 sys
drwxrwxrwt 2 root root 4096 May 9 2019 tmp
drwxr-xr-x 1 root root 4096 May 11 2019 usr
drwxr-xr-x 1 root root 4096 May 9 2019 var
Output of docker info:
# docker info
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
host:
arch: amd64
buildahVersion: 1.23.1
cgroupControllers:
- cpuset
- cpu
- cpuacct
- blkio
- memory
- devices
- freezer
- net_cls
- perf_event
- net_prio
- hugetlb
- pids
- rdma
cgroupManager: systemd
cgroupVersion: v1
conmon:
package: conmon-2.0.32-1.al8.x86_64
path: /usr/bin/conmon
version: 'conmon version 2.0.32, commit: facef751f675b2441a0cf72606fe08a9110f8838'
cpus: 4
distribution:
distribution: '"alinux"'
version: "3"
eventLogger: file
hostname: iZwz9c0gh4hpp8cwwuovsvZ
idMappings:
gidmap: null
uidmap: null
kernel: 5.10.112-11.al8.x86_64
linkmode: dynamic
logDriver: k8s-file
memFree: 2189160448
memTotal: 7845326848
ociRuntime:
name: runc
package: runc-1.0.3-1.al8.x86_64
path: /usr/bin/runc
version: |-
runc version 1.0.3
spec: 1.0.2-dev
go: go1.16.12
libseccomp: 2.5.1
os: linux
remoteSocket:
path: /run/podman/podman.sock
security:
apparmorEnabled: false
capabilities: CAP_NET_RAW,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
rootless: false
seccompEnabled: true
seccompProfilePath: /usr/share/containers/seccomp.json
selinuxEnabled: false
serviceIsRemote: false
slirp4netns:
executable: /usr/bin/slirp4netns
package: slirp4netns-1.1.8-1.1.al8.x86_64
version: |-
slirp4netns version 1.1.8
commit: d361001f495417b880f20329121e3aa431a8f90f
libslirp: 4.4.0
SLIRP_CONFIG_VERSION_MAX: 3
libseccomp: 2.5.1
swapFree: 0
swapTotal: 0
uptime: 23h 27m 47.25s (Approximately 0.96 days)
plugins:
log:
- k8s-file
- none
- journald
network:
- bridge
- macvlan
volume:
- local
registries:
search:
- registry.fedoraproject.org
- registry.access.redhat.com
- registry.centos.org
- docker.io
store:
configFile: /etc/containers/storage.conf
containerStore:
number: 1
paused: 0
running: 0
stopped: 1
graphDriverName: overlay
graphOptions:
overlay.mountopt: nodev,metacopy=on
graphRoot: /var/lib/containers/storage
graphStatus:
Backing Filesystem: extfs
Native Overlay Diff: "false"
Supports d_type: "true"
Using metacopy: "true"
imageStore:
number: 21
runRoot: /run/containers/storage
volumePath: /var/lib/containers/storage/volumes
version:
APIVersion: 3.4.2
Built: 1647932805
BuiltTime: Tue Mar 22 15:06:45 2022
GitCommit: ""
GoVersion: go1.16.12
OsArch: linux/amd64
Version: 3.4.2
@caigy After I ran the container, I found "manager", but kubectl get pod still reports an error.
Client:
Context: default
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Docker Buildx (Docker Inc., v0.8.2-docker)
scan: Docker Scan (Docker Inc., v0.17.0)
Server:
Containers: 37
Running: 32
Paused: 0
Stopped: 5
Images: 30
Server Version: 20.10.17
Storage Driver: overlay2
Backing Filesystem: xfs
Supports d_type: true
Native Overlay Diff: false
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
runc version: v1.1.2-0-ga916309
init version: de40ad0
Security Options:
seccomp
Profile: default
Kernel Version: 5.10.112-11.al8.x86_64
Operating System: Alibaba Cloud Linux 3 (Soaring Falcon)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.494GiB
Name: dddmaster
ID: LQHY:R4AW:LE6F:YETD:VIM6:PWYC:GFBU:MFBR:3DX4:E4T4:5YNY:6K7V
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Registry Mirrors:
https://kn0t2bca.mirror.aliyuncs.com/
Live Restore Enabled: false
Is there a problem with Kubernetes? My Kubernetes version is v1.23.8.
@suuugeee Did you build the rocketmq operator image with docker build and run it on the containerd runtime?
@caigy I built the image according to "README.md". Are these the same steps you used?
/www/software2/k8s/rocketmq-operator-master/bin/controller-gen rbac:roleName=manager-role crd webhook paths="./..." output:dir=deploy output:crd:artifacts:config=deploy/crds
/www/software2/k8s/rocketmq-operator-master/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
pkg/apis/rocketmq/v1alpha1/zz_generated.deepcopy.go
go vet ./...
KUBEBUILDER_ASSETS="/root/.local/share/kubebuilder-envtest/k8s/1.22.1-linux-amd64" go test ./... -coverprofile cover.out
? github.com/apache/rocketmq-operator [no test files]
? github.com/apache/rocketmq-operator/pkg/apis/rocketmq [no test files]
? github.com/apache/rocketmq-operator/pkg/apis/rocketmq/v1alpha1 [no test files]
? github.com/apache/rocketmq-operator/pkg/constants [no test files]
? github.com/apache/rocketmq-operator/pkg/controller/broker [no test files]
? github.com/apache/rocketmq-operator/pkg/controller/console [no test files]
? github.com/apache/rocketmq-operator/pkg/controller/nameservice [no test files]
? github.com/apache/rocketmq-operator/pkg/controller/topictransfer [no test files]
? github.com/apache/rocketmq-operator/pkg/share [no test files]
? github.com/apache/rocketmq-operator/pkg/tool [no test files]
? github.com/apache/rocketmq-operator/version [no test files]
docker build -t apacherocketmq/rocketmq-operator:0.3.0-snapshot .
Sending build context to Docker daemon 83.9MB
Step 1/21 : FROM golang:1.16 as builder
---> 8ffb179c0658
Step 2/21 : WORKDIR /workspace
---> Using cache
---> 8fb722f32cff
Step 3/21 : COPY go.mod go.mod
---> Using cache
---> 1b0e9038dcb4
Step 4/21 : COPY go.sum go.sum
---> Using cache
---> ad1e3e42fe7a
Step 5/21 : RUN go env -w GO111MODULE=on
---> Using cache
---> f76376549aed
Step 6/21 : RUN go env -w GOPROXY=https://mirrors.aliyun.com/goproxy,direct
---> Using cache
---> 65c94bbc8d77
Step 7/21 : RUN go mod download
---> Using cache
---> e8c2c5f2efc9
Step 8/21 : COPY main.go main.go
---> Using cache
---> 415521600224
Step 9/21 : COPY pkg/ pkg/
---> 36d6cef217bf
Step 10/21 : RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -a -o manager main.go
---> Running in 40a7912846cd
Removing intermediate container 40a7912846cd
---> e5006d63a6e4
Step 11/21 : FROM openjdk:8-alpine
---> a3562aa0b991
Step 12/21 : RUN apk add --no-cache bash gettext nmap-ncat openssl busybox-extras
---> Using cache
---> e91627134db4
Step 13/21 : ENV ROCKETMQ_HOME /home/rocketmq
---> Using cache
---> 7888599ef1b7
Step 14/21 : ENV ROCKETMQ_VERSION 4.5.0
---> Using cache
---> b6e8b072f9f2
Step 15/21 : WORKDIR ${ROCKETMQ_HOME}
---> Using cache
---> 6ad867a456ac
Step 16/21 : RUN set -eux; apk add --virtual .build-deps curl gnupg unzip; curl https://archive.apache.org/dist/rocketmq/${ROCKETMQ_VERSION}/rocketmq-all-${ROCKETMQ_VERSION}-bin-release.zip -o rocketmq.zip; curl https://archive.apache.org/dist/rocketmq/${ROCKETMQ_VERSION}/rocketmq-all-${ROCKETMQ_VERSION}-bin-release.zip.asc -o rocketmq.zip.asc; curl -L https://www.apache.org/dist/rocketmq/KEYS -o KEYS; gpg --import KEYS; gpg --batch --verify rocketmq.zip.asc rocketmq.zip; unzip rocketmq.zip; mv rocketmq-/ . ; chmod a+x ; rmdir rocketmq- ; rm rocketmq.zip; apk del .build-deps ; rm -rf /var/cache/apk/ ; rm -rf /tmp/
---> Using cache
---> a92bfa1e428a
Step 17/21 : RUN chown -R root:0 ${ROCKETMQ_HOME}
---> Using cache
---> cc72a450765d
Step 18/21 : WORKDIR /
---> Using cache
---> bfd901eec7a2
Step 19/21 : COPY --from=builder /workspace/manager .
---> 8a53ac46d82c
Step 20/21 : USER root:root
---> Running in cc4dee7a8c26
Removing intermediate container cc4dee7a8c26
---> 7f82babfcc08
Step 21/21 : ENTRYPOINT ["/manager"]
---> Running in 53c0ece4f48f
Removing intermediate container 53c0ece4f48f
---> ff0b61f2d71b
Successfully built ff0b61f2d71b
Successfully tagged apacherocketmq/rocketmq-operator:0.3.0-snapshot
@suuugeee Please check the sha of the operator image you are using. There is an image on Docker Hub that was built 2 years ago:
So you can give your own image a different tag, and make sure the image you've built is present on the node where the rocketmq operator is running.
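One rough way to tell whether the kubelet pulled an image from a registry or used one already on the node is to look at the pod events, like the "Successfully pulled image" event quoted earlier in this thread. The sketch below assumes those event messages keep that wording; image_source is a hypothetical helper name.

```shell
#!/bin/sh
# image_source is a hypothetical helper: it reads `kubectl describe pod` output
# on stdin and prints how the kubelet obtained the container image.
image_source() {
  events=$(cat)
  if printf '%s\n' "$events" | grep -q 'Successfully pulled image'; then
    echo pulled    # image came from a registry, not necessarily your own build
  elif printf '%s\n' "$events" | grep -q 'already present on machine'; then
    echo local     # kubelet used an image that was already on the node
  else
    echo unknown   # no matching event found (events may have expired)
  fi
}
```

Piping kubectl describe pod for the operator pod into image_source and getting "pulled" would suggest that the node ran the image from Docker Hub rather than the locally built one with the same tag.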
CASE 4: Containerd 1.5.10, OS: Alibaba Cloud Linux 3 (Soaring Falcon), Kernel: 5.10.84-10.2.al8.x86_64
Build the operator image with docker build first, use docker save to export the image file, then import this file with the ctr image import command.
[root@iZwz91yvcvq6jsszq3ech9Z deploy]# kubectl describe po rocketmq-operator-645796d4bc-2pn85
Name: rocketmq-operator-645796d4bc-2pn85
Namespace: default
Priority: 0
Node: cn-shenzhen.172.16.0.57/172.16.0.57
Start Time: Tue, 19 Jul 2022 20:46:12 +0800
Labels: name=rocketmq-operator
pod-template-hash=645796d4bc
Annotations: kubernetes.io/psp: ack.privileged
Status: Running
IP: 172.16.0.84
IPs:
IP: 172.16.0.84
Controlled By: ReplicaSet/rocketmq-operator-645796d4bc
Containers:
manager:
Container ID: containerd://37894a4572e365b4698167f2a13741f2057361a5209e8506885bc879f935e826
Image: localhost/apacherocketmq/rocketmq-operator:0.3.0-snapshot
Image ID: sha256:24c933e6d0d914c9cfe128e5afdf0a42bbae8f0201e10c6afc515909bccfa491
Port: <none>
Host Port: <none>
Command:
/manager
Args:
--leader-elect
State: Running
Started: Tue, 19 Jul 2022 20:46:15 +0800
Ready: True
Restart Count: 0
Liveness: http-get http://:8081/healthz delay=15s timeout=1s period=20s #success=1 #failure=3
Readiness: http-get http://:8081/readyz delay=5s timeout=1s period=10s #success=1 #failure=3
Environment:
WATCH_NAMESPACE: default (v1:metadata.namespace)
POD_NAME: rocketmq-operator-645796d4bc-2pn85 (v1:metadata.name)
OPERATOR_NAME: rocketmq-operator
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-csfpm (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-csfpm:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 10m default-scheduler Successfully assigned default/rocketmq-operator-645796d4bc-2pn85 to cn-shenzhen.172.16.0.57
Normal AllocIPSucceed 10m terway-daemon Alloc IP 172.16.0.84/24
Normal Pulled 10m kubelet Container image "localhost/apacherocketmq/rocketmq-operator:0.3.0-snapshot" already present on machine
Normal Created 10m kubelet Created container manager
Normal Started 10m kubelet Started container manager
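The export and import steps described for CASE 4 could be scripted roughly as follows. This is a sketch, not the exact commands used above: transfer_image and the default tarball name are hypothetical, and the k8s.io containerd namespace is an assumption (it is the namespace the kubelet's CRI plugin normally uses).

```shell
#!/bin/sh
# Sketch of the CASE 4 steps; transfer_image is a hypothetical helper name.
transfer_image() {
  img="$1"
  tarball="${2:-operator-image.tar}"
  # Export the image produced by docker build to a tar archive.
  docker save -o "$tarball" "$img" || return 1
  # Import the archive into containerd. The k8s.io namespace is an assumption:
  # images imported into another namespace may not be visible to the kubelet.
  ctr -n k8s.io images import "$tarball"
}
```

If the build host and the Kubernetes node are different machines, the tarball has to be copied to the node between the two steps.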
I don't know why, "ctr image import" cannot import images. This is a very troublesome problem, but I am busy with other things recently, so this problem can only be put on hold for the time being.
Hi bro.
After I copied the master code to build and install, the status of the "rocketmq-operator" pod is "RunContainerError", and the error reads "exec: "/manager": stat /manager: no such file or directory: unknown".
I don't know what is causing this, but this is running on CentOS 8 Stream.