neoave / mrack

Multicloud use-case based multihost async provisioner for CIs and testing during development
Apache License 2.0

Issue: Hosts with the same name under different domains are handled incorrectly #235

Open jakub-vavra-cz opened 1 year ago

jakub-vavra-cz commented 1 year ago

When hosts are named by shortname and two machines in different domains share the same shortname, mrack mixes their information together.

Reproducer:

domains:
  - name: samba.test
    type: samba
    hosts:
      - name: dc
        group: medium
        role: samba
        os: fedora-latest
  - name: ad.test
    type: ad
    hosts:
      - name: dc
        role: ad
        group: ad_root
        netbios: DC
        host_type: 'windows'
        os: win-2022
phases:
  - name: init
    steps:
      - playbook: init/testrunner-dir.yaml
  - name: provision
    steps:
      - playbook: provision/mrack-up.yaml
      - playbook: provision/wait.yaml
  - name: prep
    steps:
      - playbook: prep/redhat-base.yaml
      - playbook: prep/repos.yaml
      - playbook: prep/enable-passwd-ssh.yaml
      - playbook: prep/root-ssh.yaml
  - name: teardown
    steps:
      - playbook: teardown/mrack-destroy.yaml

Log where both hosts are accessed with user fedora even though one of them is a Windows machine:

2023-02-13 06:21:32,429 mrack.providers.openstack INFO OpenStack Validating host: {
    "config_drive": true,
    "flavor": "ci.standard.medium",
    "group": "medium",
    "image": "idm-Fedora-Cloud-Base-37-latest",
    "key_name": "idm-jenkins",
    "name": "dc",
    "network": "shared_net_8",
    "os": "fedora-latest"
}
2023-02-13 06:21:32,429 mrack.providers.openstack INFO OpenStack [dc] OK
2023-02-13 06:21:32,429 mrack.providers.openstack INFO OpenStack Validating host: {
    "config_drive": true,
    "flavor": "ci.disk.large",
    "group": "ad_root",
    "image": "idm-win-2022-2022-10-06-test",
    "key_name": "idm-jenkins",
    "name": "dc",
    "network": "shared_net_8",
    "os": "win-2022"
}
2023-02-13 06:21:32,429 mrack.providers.openstack INFO OpenStack [dc] OK
2023-02-13 06:21:32,429 mrack.providers.provider INFO OpenStack Host(s) definitions valid
2023-02-13 06:21:32,429 mrack.providers.provider INFO OpenStack Checking available resources
2023-02-13 06:21:32,429 mrack.providers.openstack DEBUG OpenStack Loading nova limits
2023-02-13 06:21:32,512 mrack.providers.openstack INFO OpenStack Required vcpus: 6, used: 569, max: 800
2023-02-13 06:21:32,512 mrack.providers.openstack INFO OpenStack Required ram: 8192, used: 1042432, max: 1638400
2023-02-13 06:21:32,512 mrack.providers.provider INFO OpenStack Resource availability: OK
2023-02-13 06:21:32,512 mrack.providers.provider INFO OpenStack Issuing provisioning of 2 host(s)
2023-02-13 06:21:32,513 mrack.providers.openstack INFO OpenStack [dc] Creating server
2023-02-13 06:21:32,513 mrack.providers.openstack INFO OpenStack [dc] Image meta_compose_id: Fedora-37-20221105.0
OpenStack [dc] Image meta_compose_url: https://kojipkgs.fedoraproject.org/compose/37/latest-Fedora-37/compose/
2023-02-13 06:21:32,513 mrack.providers.openstack INFO OpenStack [dc] Creating server
2023-02-13 06:21:33,299 mrack.providers.provider INFO OpenStack Provisioning issued
2023-02-13 06:21:33,299 mrack.providers.provider INFO OpenStack Waiting for all hosts to be active
2023-02-13 06:21:33,299 mrack.providers.openstack DEBUG OpenStack [dc] ID cb3c63e3-b9c9-4a96-b9d7-01db6962271b: sleeping for 11.6 seconds
2023-02-13 06:21:33,299 mrack.providers.openstack DEBUG OpenStack [dc] ID 25fc8aa5-3241-41a1-86f0-e3295f8981d2: sleeping for 31.0 seconds
2023-02-13 06:21:44,914 mrack.providers.openstack DEBUG OpenStack [dc] ID cb3c63e3-b9c9-4a96-b9d7-01db6962271b: Waiting for host creation
2023-02-13 06:21:45,518 mrack.providers.openstack DEBUG OpenStack [dc] ID cb3c63e3-b9c9-4a96-b9d7-01db6962271b: sleeping for 7.9 seconds
2023-02-13 06:21:53,946 mrack.providers.openstack DEBUG OpenStack [dc] ID cb3c63e3-b9c9-4a96-b9d7-01db6962271b: sleeping for 8.4 seconds
2023-02-13 06:22:02,753 mrack.providers.openstack INFO OpenStack [dc] ID cb3c63e3-b9c9-4a96-b9d7-01db6962271b: host was provisioned in 29.5s
2023-02-13 06:22:02,753 mrack.providers.openstack INFO OpenStack [dc] ID cb3c63e3-b9c9-4a96-b9d7-01db6962271b: host was provisioned in 29.5s
...
2023-02-13 06:25:32,147 mrack.providers.openstack INFO OpenStack [dc] ID 25fc8aa5-3241-41a1-86f0-e3295f8981d2: host was provisioned in 238.8s
2023-02-13 06:25:32,147 mrack.providers.provider INFO OpenStack All hosts reached provisioning final state (ACTIVE or ERROR)
2023-02-13 06:25:32,147 mrack.providers.provider INFO OpenStack Provisioning duration: 0:03:59.634818
2023-02-13 06:25:32,147 mrack.providers.provider DEBUG OpenStack Checking provisioned hosts for errors
2023-02-13 06:25:32,147 mrack.providers.provider DEBUG OpenStack [dc] ID cb3c63e3-b9c9-4a96-b9d7-01db6962271b   STATUS - active
2023-02-13 06:25:32,147 mrack.providers.provider DEBUG OpenStack [dc] ID 25fc8aa5-3241-41a1-86f0-e3295f8981d2   STATUS - active
2023-02-13 06:25:32,148 mrack.providers.provider DEBUG OpenStack [dc] ssh check config: {
    "disabled_providers": [
        "podman"
    ],
    "enabled": true,
    "enabled_providers": [],
    "port": 22,
    "timeout": 10
}
2023-02-13 06:25:32,148 mrack.providers.provider DEBUG OpenStack [dc] ssh check config: {
    "disabled_providers": [
        "podman"
    ],
    "enabled": true,
    "enabled_providers": [],
    "port": 22,
    "timeout": 10
}
2023-02-13 06:25:32,148 mrack.providers.provider INFO OpenStack [dc] Waiting for the port 22 on host 10.0.191.130 to start accepting connections (up to 10 minutes)
2023-02-13 06:25:32,152 mrack.providers.provider INFO OpenStack [dc] Port 22 on host  10.0.191.130 is now open
2023-02-13 06:25:32,152 mrack.utils DEBUG Running: ssh -o 'StrictHostKeyChecking=no' -o 'UserKnownHostsFile=/dev/null' -o 'PasswordAuthentication=no' -i config/id_rsa -l fedora 10.0.191.130 echo mrack
2023-02-13 06:25:32,560 mrack.utils DEBUG stdout: mrack
2023-02-13 06:25:32,560 mrack.utils DEBUG stdout: mrack
2023-02-13 06:25:32,561 mrack.utils DEBUG stderr: Warning: Permanently added '10.0.191.130' (ED25519) to the list of known hosts.
2023-02-13 06:25:32,561 mrack.providers.provider INFO OpenStack [dc] SSH to host '10.0.191.130' successful after 0.4s
2023-02-13 06:25:32,561 mrack.providers.provider INFO OpenStack [dc] Waiting for the port 22 on host 10.0.191.199 to start accepting connections (up to 10 minutes)
2023-02-13 06:25:47,826 mrack.providers.provider INFO OpenStack [dc] Port 22 on host  10.0.191.199 is now open
2023-02-13 06:25:47,826 mrack.utils DEBUG Running: ssh -o 'StrictHostKeyChecking=no' -o 'UserKnownHostsFile=/dev/null' -o 'PasswordAuthentication=no' -i config/id_rsa -l fedora 10.0.191.199 echo mrack
2023-02-13 06:25:47,971 mrack.utils DEBUG stderr: Warning: Permanently added '10.0.191.199' (ED25519) to the list of known hosts.
2023-02-13 06:25:47,971 mrack.utils DEBUG stderr: fedora@10.0.191.199: Permission denied (publickey,password,keyboard-interactive).
2023-02-13 06:25:57,982 mrack.utils DEBUG Running: ssh -o 'StrictHostKeyChecking=no' -o 'UserKnownHostsFile=/dev/null' -o 'PasswordAuthentication=no' -i config/id_rsa -l fedora 10.0.191.199 echo mrack
2023-02-13 06:25:58,126 mrack.utils DEBUG stderr: Warning: Permanently added '10.0.191.199' (ED25519) to the list of known hosts.
2023-02-13 06:25:58,126 mrack.utils DEBUG stderr: fedora@10.0.191.199: Permission denied (publickey,password,keyboard-interactive).
pvoborni commented 1 year ago

Could you highlight the mixing together? I don't see any unexpected behavior in the logs. Logging is using the provided hostname.

For cases where shortnames are the same it is better to use full domain names in domain and FQDN in host's name.

pvoborni commented 1 year ago

Ah, I see. By mixing together you mean that it probably tries the Fedora login credentials for the Windows host, and that this might be caused by both hosts having the same ID.

pvoborni commented 1 year ago

I almost think that the fix for mrack would be to fail if two hosts have the same name, and to recommend using FQDNs in such situations.

Originally it was all designed for FQDNs and full domain names. Thus these cases are not well tested.
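
A minimal sketch of the "fail on duplicate names" option, assuming the job metadata has already been loaded into a plain dict shaped like the reproducer above. The function names `find_duplicate_host_names` and `validate_unique_host_names` are hypothetical and are not part of mrack's API.

```python
from collections import Counter


def find_duplicate_host_names(metadata):
    """Return host names that occur more than once across all domains."""
    names = [
        host["name"]
        for domain in metadata.get("domains", [])
        for host in domain.get("hosts", [])
    ]
    return sorted(name for name, count in Counter(names).items() if count > 1)


def validate_unique_host_names(metadata):
    """Raise ValueError if any host name is shared by two or more hosts."""
    duplicates = find_duplicate_host_names(metadata)
    if duplicates:
        raise ValueError(
            "Host names must be unique across domains; use FQDNs for: "
            + ", ".join(duplicates)
        )


# Trimmed version of the reproducer from this issue (hypothetical in-memory form).
metadata = {
    "domains": [
        {"name": "samba.test", "hosts": [{"name": "dc", "os": "fedora-latest"}]},
        {"name": "ad.test", "hosts": [{"name": "dc", "os": "win-2022"}]},
    ]
}

try:
    validate_unique_host_names(metadata)
except ValueError as err:
    print(err)
```

Failing early like this would surface the clash before any host is provisioned, at the cost of rejecting metadata that relies on shared shortnames.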

pvoborni commented 1 year ago

@Tiboris WDYT? Should mrack allow non-unique hostnames and normalize them into FQDNs for further handling, or should it rather fail if hostnames are not unique?

I see some pros and cons with both behaviors.

From the user's perspective, the normalization makes sense, but I'm not sure how risky it is in terms of the size of the change and the potential for introducing new bugs.

jakub-vavra-cz commented 1 year ago

I think there was something about FQDNs being dropped/not recommended due to some openstack/windows issue.

Having two hosts with the same shortname within two domains is a valid use-case. Mrack should probably build and use an FQDN internally if the name is not one already.
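
A rough sketch of that normalization, assuming metadata loaded into a dict shaped like the reproducer. The helpers `to_fqdn` and `normalize_host_names` are hypothetical names, not existing mrack functions, and the rule "a name that already ends with the domain is treated as an FQDN" is an assumption made for illustration.

```python
def to_fqdn(name, domain):
    """Append the domain unless the name already ends with it."""
    if name == domain or name.endswith("." + domain):
        return name
    return f"{name}.{domain}"


def normalize_host_names(metadata):
    """Return a flat list of hosts whose names are unique FQDNs."""
    normalized = []
    for domain in metadata.get("domains", []):
        for host in domain.get("hosts", []):
            # Copy the host so the original metadata is left untouched.
            normalized.append(
                {
                    **host,
                    "name": to_fqdn(host["name"], domain["name"]),
                    "domain": domain["name"],
                }
            )
    return normalized


# Trimmed version of the reproducer from this issue (hypothetical in-memory form).
metadata = {
    "domains": [
        {"name": "samba.test", "hosts": [{"name": "dc", "os": "fedora-latest"}]},
        {"name": "ad.test", "hosts": [{"name": "dc", "os": "win-2022"}]},
    ]
}

for host in normalize_host_names(metadata):
    print(host["name"], host["os"])
```

With names normalized this way, the two `dc` hosts no longer collide when used as lookup keys.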

The wider issue is the inconsistency between metadata, inventory and multihost config.

Metadata:
  name: Used as a unique identifier for the machine, with domain used to set the hostname of the machine.
  domain:
  hostname: Whatever I put there is passed to multihost but not applied on the machine in openstack as a hostname.

Inventory:
  name: name from metadata
  meta_fqdn: built from domain and first part of name
  meta_hostname (== shortname): contains first part of name
  ansible_hostname: gained from reverse dns search

MHC:
  "name": name from metadata
  "hostname": Should contain FQDN according to pytest-multihost doc, not populated when not present in metadata.
  "shortname": Should contain first part of name and is missing when not added to metadata.
  "external_hostname": gathered from dns reverse search

When I do not put hostname in the metadata, it is missing from the pytest-multihost configuration; the same applies to shortname.
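
One way the missing fields could be filled in is sketched below. It assumes the rule described above (shortname is the first part of name, hostname is the shortname plus the domain) and uses a hypothetical helper, `fill_multihost_names`; it does not reflect how mrack currently builds the multihost config.

```python
def fill_multihost_names(host, domain):
    """Return a copy of host with shortname and hostname defaulted.

    Values already present in the metadata are kept as they are.
    """
    short = host["name"].split(".", 1)[0]
    entry = dict(host)
    entry.setdefault("shortname", short)
    entry.setdefault("hostname", f"{short}.{domain}")
    return entry


# Host defined only by name, as in the reproducer.
print(fill_multihost_names({"name": "dc"}, "ad.test"))

# An explicit hostname from the metadata is preserved.
print(fill_multihost_names({"name": "dc", "hostname": "custom.ad.test"}, "ad.test"))
```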

pvoborni commented 1 year ago

I agree that having two hosts from different domains with the same short name is a valid use case.

The question here is whether to implement it now or postpone it, as it is not a cheap change and there is a workaround: defining FQDNs.

Wrt inconsistencies: that's a different issue than this ticket, so I don't want to dive deep there. But to answer in part: in general it is recommended to use only name in the host sections of the job metadata file, ideally an FQDN. Don't use shortname, hostname, or external_hostname. The name is used consistently everywhere; the various outputs add additional data.

Inventory:

pytest-multihost: