Does it happen every single time? I just tried it on F-38 and was not able to reproduce this.
Every single time with this image... deleting and starting again results in the same.
Also, is this F-38 a VM or bare metal? I want to try the same setup to identify the issue.
Any idea what `Error in reading or validating configuration: subjectAltNames must not contain localhost, 127.0.0.1` could refer to?
```
[core@api ~]$ cat /etc/microshift/config.yaml
dns:
  # Base domain of the cluster. All managed DNS records will be sub-domains of this base.
  baseDomain: crc.testing
network:
  clusterNetwork:
  # IP range for use by the cluster
  #- cidr: 10.42.0.0/16
  serviceNetwork:
  # IP range for services in the cluster
  #- 10.43.0.0/16
  # Node ports allowed for services
  #serviceNodePortRange: 30000-32767
node:
  # If non-empty, use this string to identify the node instead of the hostname
  #hostnameOverride: ''
  # IP address of the node, passed to the kubelet.
  # If not specified, kubelet will use the node's default IP address.
  #nodeIP: ''
apiServer:
  # The Subject Alternative Names for the external certificates in API server (defaults to hostname -A)
  #subjectAltNames: []
debugging:
  # Log verbosity ('Normal', 'Debug', 'Trace', 'TraceAll'):
  #logLevel: 'Normal'
etcd:
  # Memory limit for etcd, in Megabytes: 0 is no limit.
  #memoryLimitMB: 0
```
So that means it uses the default:

```
# The Subject Alternative Names for the external certificates in API server (defaults to hostname -A)
#subjectAltNames: []
```
```
[core@api ~]$ hostname -A
bogon api
```
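A possible workaround would be to pin the SANs explicitly so the `hostname -A` fallback, which picks up `bogon`, is not used. A sketch, assuming the commented-out field above takes a plain list of names:

```yaml
apiServer:
  # explicitly list the SANs instead of relying on the hostname -A default;
  # api.crc.testing is the hostname from this report (untested assumption)
  subjectAltNames:
    - api.crc.testing
```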
I run this on my ThinkCentre Tiny. I reinstalled this machine yesterday. Other VMs on this machine are operational... so I just tried to install crc on it, and this happened :-s.
> Any idea what `Error in reading or validating configuration: subjectAltNames must not contain localhost, 127.0.0.1` could refer to?
As the error suggests, the AltNames shouldn't contain localhost/127.0.0.1, which we don't have as part of the microshift configuration. Have you tried rerunning the service after ssh'ing into the VM?
Restarting from the VM results in the same issue... it can't resolve the hostname to a valid value. After adding this to /etc/hosts:

```
192.168.130.11 bogon api
```

it starts.
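As a sketch, the full workaround from inside the VM (assuming `microshift` is the service that needs restarting, as named later in this thread):

```
# map the unresolvable names to the VM address, then restart the service
echo '192.168.130.11 bogon api' | sudo tee -a /etc/hosts
sudo systemctl restart microshift
```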
Can you check if the `crc-dnsmasq` service is running in the VM?
Yes, as I also posted earlier, `qemu-guest-agent` and `microshift` failed:
```
[root@api ~]# systemctl status crc-dnsmasq
● crc-dnsmasq.service - Podman container-59a21a926ac5a3dc9a4ce468813397f136e630ee790c9fb665f02871f1cd48bf.service
     Loaded: loaded (/etc/systemd/system/crc-dnsmasq.service; enabled; preset: disabled)
     Active: active (running) since Tue 2023-07-18 03:45:22 EDT; 1h 17min ago
```
```
[root@api ~]# podman exec -it crc-dnsmasq bash
[root@59a21a926ac5 /]# cat /etc/dnsmasq.conf
user=root
port=53
bind-interfaces
expand-hosts
log-queries
local=/crc.testing/
domain=crc.testing
address=/apps-crc.testing/192.168.130.11
address=/api.crc.testing/192.168.130.11
address=/api-int.crc.testing/192.168.130.11
address=/crc.crc.testing/192.168.122.147
[root@59a21a926ac5 /]# exit
exit
[root@api ~]# cat /etc/resolv.conf
# Generated by NetworkManager
search crc.testing
nameserver 192.168.130.1
[root@api ~]#
```
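A quick way to confirm that this dnsmasq instance actually serves those records (an assumption: with `bind-interfaces` set it should answer on the VM address):

```
# query the in-VM dnsmasq directly; expect 192.168.130.11 back
dig api.crc.testing @192.168.130.11 +short
```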
I am not sure then why you need to add `api` to /etc/hosts, because the following works for me:
```
$ hostname
api.crc.testing
[core@api ~]$ ping $(hostname)
PING api.crc.testing (192.168.130.11) 56(84) bytes of data.
64 bytes from api (192.168.130.11): icmp_seq=1 ttl=64 time=0.086 ms
64 bytes from api (192.168.130.11): icmp_seq=2 ttl=64 time=0.123 ms
64 bytes from api (192.168.130.11): icmp_seq=3 ttl=64 time=0.133 ms
^C
```
```
[root@api ~]# dig api @192.168.130.1

; <<>> DiG 9.16.23-RH <<>> api @192.168.130.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39304
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232

;; QUESTION SECTION:
;api.                           IN      A

;; ANSWER SECTION:
api.                    0       IN      A       192.168.130.11

;; Query time: 2 msec
;; SERVER: 192.168.130.1#53(192.168.130.1)
;; WHEN: Tue Jul 18 05:06:54 EDT 2023
;; MSG SIZE  rcvd: 48

[root@api ~]#
```
Maybe it is not `api`, but `bogon`:
```
[root@api ~]# dig bogon @192.168.130.1

; <<>> DiG 9.16.23-RH <<>> bogon @192.168.130.1
;; global options: +cmd
;; connection timed out; no servers could be reached

[root@api ~]# hostname
api.crc.testing
[root@api ~]#
```
because `hostname -A` returns:

```
[root@api ~]# hostname -A
bogon bogon api.crc.testing api.crc.testing
```
For you?

```
[core@api ~]$ hostname -A
api.crc.testing api api.crc.testing api.crc.testing
```
Where does this `bogon` come from?
So it looks like `bogon` stands for a bogus IP address (https://apple.stackexchange.com/a/394640), but I am not sure why it is showing up for you.
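Since `hostname -A` works by doing a reverse DNS lookup on every configured address, one way to see where `bogon` originates (using the resolver from /etc/resolv.conf above):

```
# reverse-resolve the node address; a PTR record of 'bogon' would
# explain the hostname -A output
dig -x 192.168.130.11 @192.168.130.1 +short
```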
Yep... so something in the bring-up fails to work properly. I did this again just now, and it started:
```
INFO Creating CRC VM for MicroShift 4.13.3...
INFO Generating new SSH key pair...
INFO Starting CRC VM for microshift 4.13.3...
INFO CRC instance is running with IP 192.168.130.11
INFO CRC VM is running
INFO Updating authorized keys...
INFO Configuring shared directories
INFO Check internal and public DNS query...
INFO Check DNS query from host...
INFO Starting Microshift service... [takes around 1min]
INFO Waiting for kube-apiserver availability... [takes around 2min]
INFO Adding microshift context to kubeconfig...
Started the MicroShift cluster.

Use the 'oc' command line interface:
  $ eval $(crc oc-env)
  $ oc COMMAND
```
Note: this went wrong 5 times in a row...
This might be an issue with your network-side configuration; this is the first time I am seeing the `bogon` issue and I am not sure how to fix it on our end (crc/snc side).
It also seems the VM loses connectivity with podman over time. It was working with `crc podman-env`, but after a while it just stops responding.
I think it is related to something with Podman inside the VM:

```
[core@api ~]$ podman ps
ERRO[0000] invalid internal status, try resetting the pause process with "podman system migrate": could not find any running process: no such process
```
which might explain why the name does not resolve.
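The error message itself names the recovery step; a minimal sketch, run as the same user inside the VM:

```
# reset the rootless pause process as the error suggests, then retry
podman system migrate
podman ps
```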
It shows really weird behaviour at times. Redoing `crc podman-env` makes it work again, but after a while it stops again.
It seems the containers I start get stopped:

```
$ podman start tailscale
tailscale
$ podman ps -a
CONTAINER ID  IMAGE                                             COMMAND  CREATED         STATUS        PORTS  NAMES
fb0bb4b683d7  ghcr.io/spotsnel/tailscale-systemd/fedora:latest           37 minutes ago  Up 3 seconds         tailscale
$ podman ps -a
CONTAINER ID  IMAGE                                             COMMAND  CREATED         STATUS   PORTS  NAMES
fb0bb4b683d7  ghcr.io/spotsnel/tailscale-systemd/fedora:latest           50 minutes ago  Created         tailscale
$ podman start tailscale
tailscale
$ podman ps
```

... and now it just hangs again.
These containers run in other environments for days/months without an issue, but inside the VM they get stopped. Podman becomes unresponsive. After a while it 'works' again, but the containers have all stopped.
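One hedged way to keep such a container alive regardless of the client session (a sketch, not verified inside the CRC VM; `tailscale` is the container name from the transcript above):

```
# generate a user systemd unit for the existing container and let
# systemd manage its lifecycle instead of the remote podman client
mkdir -p ~/.config/systemd/user
podman generate systemd --new --name tailscale > ~/.config/systemd/user/tailscale.service
systemctl --user daemon-reload
systemctl --user enable --now tailscale.service
```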
```
$ eval $(./crc podman-env)
$ podman ps
Cannot connect to Podman. Please verify your connection to the Linux system using `podman system connection list`, or try `podman machine init` and `podman machine start` to manage a new Linux VM
Error: unable to connect to Podman socket: Get "http://d/v4.5.1/libpod/_ping": ssh: rejected: connect failed (open failed)
$ podman ps
CONTAINER ID  IMAGE  COMMAND  CREATED  STATUS  PORTS  NAMES
```
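As the error text itself suggests, the remote connection that `podman-env` configured can be inspected from the host:

```
# list the configured remote connections and which one is the default
podman system connection list
```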
> This might be an issue with your network-side configuration,
The VM does not lose connectivity at all. I did a ping (and flood) test and there is no loss.
It must have been a glitch... Startups work now. It is Podman that is failing badly now. Containers that use `systemd` do not remain active.
```
$ podman ps -a
CONTAINER ID  IMAGE                                             COMMAND  CREATED      STATUS        PORTS  NAMES
fb0bb4b683d7  ghcr.io/spotsnel/tailscale-systemd/fedora:latest           2 hours ago  Up 5 minutes         tailscale
$ podman ps -a
Cannot connect to Podman. Please verify your connection to the Linux system using `podman system connection list`, or try `podman machine init` and `podman machine start` to manage a new Linux VM
Error: unable to connect to Podman socket: Get "http://d/v4.5.1/libpod/_ping": ssh: rejected: connect failed (open failed)
$ podman ps -a
Cannot connect to Podman. Please verify your connection to the Linux system using `podman system connection list`, or try `podman machine init` and `podman machine start` to manage a new Linux VM
Error: unable to connect to Podman socket: Get "http://d/v4.5.1/libpod/_ping": ssh: rejected: connect failed (open failed)
$ ssh -i ~/.crc/machines/crc/id_ecdsa core@192.168.130.11
Script '01_update_platforms_check.sh' FAILURE (exit code '1'). Continuing...
Boot Status is GREEN - Health Check SUCCESS
Last login: Tue Jul 18 07:40:25 2023 from 192.168.130.1
[core@api ~]$ podman ps
CONTAINER ID  IMAGE  COMMAND  CREATED  STATUS  PORTS  NAMES
[core@api ~]$ podman ps -a
CONTAINER ID  IMAGE                                             COMMAND  CREATED      STATUS    PORTS  NAMES
fb0bb4b683d7  ghcr.io/spotsnel/tailscale-systemd/fedora:latest           2 hours ago  Stopping         tailscale
[core@api ~]$
```
Are the containers killed because of `linger`? `podman-env` uses an SSH connection to start/stop the containers, which most likely behaves differently when a process/container is started over the socket. For as long as the container is `Stopping`, it is not possible to do anything externally.
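If lingering is the cause, systemd-logind would stop the user's rootless containers when the last session closes. A sketch to check and enable it, assuming the rootless user is `core`:

```
# show whether processes of 'core' survive after logout
loginctl show-user core --property=Linger
# enable lingering so rootless containers keep running without a session
sudo loginctl enable-linger core
```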
The network must have been some weird glitch. Filing a new issue for the lingering, as it seems this is the case.
This is the current latest version of CRC installed on a clean install of Fedora 38. When starting CRC / MicroShift:
... this continues until the restart counter reaches 5.
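If that counter of 5 comes from systemd's start rate limiting (`StartLimitBurst` defaults to 5), which is an assumption here, it can be inspected inside the VM:

```
# show how often systemd restarted the unit and the last result
systemctl show microshift --property=NRestarts,Result
# and the log lines around the failed starts
journalctl -u microshift --no-pager -n 50
```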