COPRS / reference-system-software

This repository contains information on the Reference System Software and how to contribute to the project.
Apache License 2.0

[BUG] PODs in "ImagePullBackOff" state if scheduled on ingester node #33

Open svergnaud-csgroup opened 2 years ago

svergnaud-csgroup commented 2 years ago

Describe the bug

All pods scheduled on the ingester node end up in the "ImagePullBackOff" state.

To Reproduce

Steps to reproduce the behavior:

  1. Create a cluster by following the docs

Expected behavior

Pods should be in the "Running" state.

Additional context

When I SSH into the ingester node, DNS resolution fails (nslookup google.fr returns an error).

My ingester node has the IP address 172.16.3.74, but the route command shows 172.16.0.1 as the gateway.

If I edit the netplan file to look like this:

network:
  version: 2
  renderer: networkd
  ethernets:
    ens3:
      addresses: [172.16.3.74/17]
      gateway4: 172.16.3.1  # <== changed

instead of:

network:
  version: 2
  renderer: networkd
  ethernets:
    ens3:
      addresses: [172.16.3.74/17]
      gateway4: 172.16.0.1

and apply it, DNS resolution works again and the pods reach the "Running" state.

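It seems that the regex used to generate the netplan file does not return the correct gateway value. It keeps the first three octets of the network address and appends 1; with a /17 prefix, the network address for 172.16.3.74 is 172.16.0.0, so the result is 172.16.0.1 rather than 172.16.3.1: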

- name: Set netplan config
  copy:
    dest: /etc/netplan/11-ens3-private.yaml
    content: |
      network:
        version: 2
        renderer: networkd
        ethernets:
          ens3:
            addresses: [{{ ansible_facts.ens3.ipv4.address }}/17]
            gateway4: {{ ansible_facts.ens3.ipv4.network | regex_search('^([0-9]{1,3}.){3}') }}1  # <==
  become: true
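
To illustrate why the task above ends up with 172.16.0.1, here is a minimal Python sketch that applies the same pattern outside Ansible. It assumes that `ansible_facts.ens3.ipv4.network` holds the network address of the interface (172.16.0.0 for 172.16.3.74/17) and that `regex_search` without extra arguments returns the whole match. The variable names are hypothetical, and the last line only shows what the same pattern gives when applied to the host address; it is not a confirmed fix, since the real gateway depends on the environment.

```python
import ipaddress
import re

# Values taken from the report: the node address and the /17 prefix.
address = "172.16.3.74"
prefix_len = 17

# Assumed equivalent of ansible_facts.ens3.ipv4.network for this interface.
interface = ipaddress.ip_interface(f"{address}/{prefix_len}")
network = str(interface.network.network_address)

# Same pattern as in the task (note the unescaped "." matches any character).
pattern = r"^([0-9]{1,3}.){3}"

# What the task computes: first three octets of the network address, then "1".
gateway_from_network = re.search(pattern, network).group(0) + "1"

# For comparison only: the same pattern applied to the host address.
gateway_from_address = re.search(pattern, address).group(0) + "1"

print(network)               # 172.16.0.0
print(gateway_from_network)  # 172.16.0.1
print(gateway_from_address)  # 172.16.3.1
```

The first result matches the gateway shown by the route command, and the second matches the value that restored DNS resolution after the manual netplan edit.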