meffie / molecule-proxmox

Molecule driver for Proxmox VE
MIT License

Help needed on configuration #17

Open bitchecker opened 1 month ago

bitchecker commented 1 month ago

Hi, is there some way to debug what happens when the molecule test command is executed with this driver?

I'm using a simple configuration:

driver:
  name: molecule-proxmox
  options:
     api_host: xxxxxxx
     api_user: root@pam
     api_password: "********"
     node: pve
     ssh_user: root
     ssh_identity_file: /path/to/id_rsa
platforms:
  - name: test01
    template_name: template01
    ciuser: root
    cipassword: root
    ipconfig:
      ipconfig0: 'ip=x.y.z.k/24,gw=x.y.z.1'
    nameservers:
      - 1.1.1.1

but when I try to run a simple test (using the auto-generated default scenario files), nothing happens:

WARNING  Driver molecule-proxmox does not provide a schema.
INFO     default scenario test matrix: dependency, cleanup, destroy, syntax, create, prepare, converge, idempotence, side_effect, verify, cleanup, destroy
INFO     Performing prerun with role_name_check=0...
INFO     Running default > dependency
WARNING  Skipping, missing the requirements file.
WARNING  Skipping, missing the requirements file.
INFO     Running default > cleanup
WARNING  Skipping, cleanup playbook not configured.
INFO     Running default > destroy

PLAY [Destroy] *****************************************************************

TASK [Populate instance config] ************************************************
ok: [localhost]

TASK [Dump instance config] ****************************************************
skipping: [localhost]

PLAY RECAP *********************************************************************
localhost                  : ok=1    changed=0    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0

INFO     Running default > syntax

playbook: molecule/default/converge.yml
INFO     Running default > create

PLAY [Create] ******************************************************************

TASK [Populate instance config dict] *******************************************
skipping: [localhost]

TASK [Convert instance config dict to a list] **********************************
skipping: [localhost]

TASK [Dump instance config] ****************************************************
skipping: [localhost]

PLAY RECAP *********************************************************************
localhost                  : ok=0    changed=0    unreachable=0    failed=0    skipped=3    rescued=0    ignored=0

INFO     Running default > prepare

PLAY [Prepare] *****************************************************************

TASK [Waiting for instance ssh connection.] ************************************

There are no connections to the Proxmox server, so no machines are created...

IamLunchbox commented 1 month ago

Are you sure that you are able to reach the spun-up machine on the default SSH port? It seems like Ansible is not able to reach the machine via SSH on port 22. In your other issue you mentioned using an alternative port. Could this be the issue here?

I would also suggest running Molecule with more verbosity (-v) to find out what it is trying to do. Molecule may be picking up an ansible.cfg file you did not expect and connecting on the SSH port set there.

You can actually override the connection port (and other parameters) in your molecule config:

driver:
  name: molecule-proxmox
  options:
  [...]
platforms:
  [...]
provisioner:
  name: ansible
  config_options:
    ssh-connection:
      host_key_checking: false
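
A hedged sketch of the same kind of override. Molecule writes each key under config_options as a section of the ansible.cfg it generates, and Ansible names its own sections with underscores (defaults, ssh_connection). Placing host_key_checking under defaults is an assumption of this sketch, based on Ansible's standard configuration layout, and is not something confirmed in this thread.

```yaml
provisioner:
  name: ansible
  config_options:
    # Assumption: this key becomes the [defaults] section of the
    # ansible.cfg that Molecule generates for the scenario.
    defaults:
      host_key_checking: false
```

Running molecule with -v prints which ansible.cfg is in use, so the generated file can be inspected to confirm the setting landed where expected.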

bitchecker commented 1 month ago

Hi, thanks for the reply!

No, this is not related to the other issue I raised. After raising it, I updated the connection settings, and I now reach the PVE server directly on the standard ports 22 and 8006.

I'll try the -v option as soon as possible.

bitchecker commented 1 month ago

Hi @IamLunchbox, I tried with more verbosity and I can see that Molecule is waiting to connect to the new VM, but no VM is ever started. So I'm pretty sure that something needs to be configured in the create.yml and destroy.yml files, but I can't find any documentation on how to write them for this provider.

IamLunchbox commented 1 month ago

I fear I can't help you more specifically. There could be loads of reasons why it doesn't work, and IMHO GitHub issues are not the place to work through them. You should probably check with tcpdump, and on the VM itself, whether the network works at all.

I personally skip the prepare step altogether (once the connection works). If you check prepare.yml, you'll see that you can set sethostname to no to do that.

bitchecker commented 1 month ago

The big problem is that pve is not starting a new vm from the template...so it seems that nothing happens. I'm asking for prepare.yml and destroy.yml because of that.

IamLunchbox commented 1 month ago

In that case you could manually edit the create.yml that ships with the plugin and see if it works then. Maybe the variable interpolation is off somehow, but I don't know why the create step would then report a successful creation.

Can you post your redacted molecule.yml and the execution log of molecule test -v here?
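
One way to check the variable interpolation mentioned above is a temporary debug play. This is a minimal sketch, assuming it is run through Molecule (for example, added temporarily to the scenario's create.yml) so that the molecule_yml variable from Molecule's inventory is available; the play and task names are made up for illustration.

```yaml
---
# Hypothetical debugging aid: prints the platforms block exactly as
# Molecule parsed it from molecule.yml.
- name: Inspect Molecule variables
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Show the platforms parsed from molecule.yml
      ansible.builtin.debug:
        var: molecule_yml.platforms
```

If the printed structure does not match what was written in molecule.yml, the problem is in the config rather than in the create playbook.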

bitchecker commented 1 month ago

molecule.yml file:

---
driver:
  name: molecule-proxmox
  options:
    api_host: xxxxxxxxx
    api_user: root@pam
    api_password: xxxxxxx
    node: proxmox
    ssh_user: root
    ssh_identity_file: /path/to/ssh/private/key
platforms:
  - name: molecule
    template_name: Alma9-template
    ciuser: root
    cipassword: root
    ipconfig:
      ipconfig0: 'ip=192.168.1.244/24,gw=192.168.1.1'
    nameservers:
      - 1.1.1.1
provisioner:
  name: ansible
  config_options:
    ssh-connection:
      host_key_checking: false
      ansible_ssh_common_args: '-p ssh-port'
  lint:
    name: ansible-lint

molecule -v test output:

PLAY [Create] ******************************************************************

TASK [Populate instance config dict] *******************************************
skipping: [localhost] => {"changed": false, "false_condition": "server.changed | default(false) | bool", "skip_reason": "Conditional result was False"}

TASK [Convert instance config dict to a list] **********************************
skipping: [localhost] => {"changed": false, "false_condition": "server.changed | default(false) | bool", "skip_reason": "Conditional result was False"}

TASK [Dump instance config] ****************************************************
skipping: [localhost] => {"changed": false, "false_condition": "server.changed | default(false) | bool", "skip_reason": "Conditional result was False"}

PLAY RECAP *********************************************************************
localhost                  : ok=0    changed=0    unreachable=0    failed=0    skipped=3    rescued=0    ignored=0

INFO     Running default > prepare
Using /molecule/default/ansible.cfg as config file

PLAY [Prepare] *****************************************************************

TASK [Waiting for instance ssh connection.] ************************************

As suggested, all files are at default status, only molecule.yml is updated with connection parameters.

bitchecker commented 1 month ago

By editing create.yml and using the proxmox_kvm module, it is possible to manage VM creation, although I still need to understand how to add the VMs to the Molecule inventory afterwards.

meffie commented 1 month ago

Thanks bitchecker. What did you need to change in create.yml to fix your issue?

bitchecker commented 1 month ago

Hi @meffie, in create.yml I'm adding something like this to manage VM creation:

- name: Clone cloud-init template
  community.general.proxmox_kvm:
    api_user: root@pam
    api_password: xxxxxxxxx
    api_host: xxxxxxxxxx
    node: proxmox
    vmid: <template-id>
    newid: <new-id> # without other steps fails, some proxmox_kvm module bug?
    clone: Alma9-template
    name: molecule
    storage: local-zfs
    timeout: 90
    net:
      net0: 'virtio,bridge=vmbr0'

- name: Update VM configuration
  community.general.proxmox_kvm:
    api_user: root@pam
    api_password: xxxxxxxxx
    api_host: xxxxxxxxxx
    node: proxmox
    vmid: <new-id> # without other steps fails, some proxmox_kvm module bug?
    cores: 2
    memory: 4096
    ide:
      ide2: 'local-zfs:cloudinit,media=cdrom' # this is not working!
    ciuser: root
    cipassword: root
    nameservers: 1.1.1.1
    ipconfig:
      ipconfig0: 'ip=<ip-address>/24,gw=<gateway-address>'
    tags: molecule
    update: true

- name: Start VM
  community.general.proxmox_kvm:
    api_user: root@pam
    api_password: xxxxxxxxx
    api_host: xxxxxxxxxx
    node: proxmox
    vmid: <new-id> # without other steps fails, some proxmox_kvm module bug?
    state: started

What is missing is that, at the end of these steps, the VM should be added to the "molecule" inventory, which is done by this (default code):

  - name: Create instance config
    when: server.changed | default(false) | bool  # noqa no-handler
    block:
      - name: Populate instance config dict  # noqa jinja
        ansible.builtin.set_fact:
          instance_conf_dict: {}
          # instance': "{{ }}",
          # address': "{{ }}",
          # user': "{{ }}",
          # port': "{{ }}",
          # 'identity_file': "{{ }}", }
        with_items: "{{ server.results }}"
        register: instance_config_dict

      - name: Convert instance config dict to a list
        ansible.builtin.set_fact:
          instance_conf: "{{ instance_config_dict.results | map(attribute='ansible_facts.instance_conf_dict') | list }}"

      - name: Dump instance config
        ansible.builtin.copy:
          content: |
            # Molecule managed

            {{ instance_conf | to_json | from_json | to_yaml }}
          dest: "{{ molecule_instance_config }}"
          mode: "0600"

And I'm still trying to understand how to connect the new code to the default one.
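
A minimal sketch of how the two parts could be joined, assuming the VM was created by the custom tasks above and its address is known in advance. The keys (instance, address, user, port, identity_file) are the ones listed in the commented-out part of the default template. The values are placeholders taken from the molecule.yml posted earlier, not verified settings.

```yaml
# Assumption: runs in create.yml after the VM has been started.
- name: Populate instance config with fixed values
  ansible.builtin.set_fact:
    instance_conf:
      - instance: molecule              # must match the platform name
        address: 192.168.1.244          # placeholder from ipconfig0
        user: root
        port: 22
        identity_file: /path/to/ssh/private/key

- name: Dump instance config
  ansible.builtin.copy:
    content: |
      # Molecule managed

      {{ instance_conf | to_json | from_json | to_yaml }}
    dest: "{{ molecule_instance_config }}"
    mode: "0600"
```

This drops the when: server.changed condition on purpose, because nothing in the custom tasks registers a variable called server.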

In destroy.yml instead, I added this:

- name: Stop VM
  community.general.proxmox_kvm:
    api_user: root@pam
    api_password: xxxxxxxx
    api_host: xxxxxxx
    node: proxmox
    vmid: 9000
    state: absent
    force: true

As I said in the "code comments" the Cloud-Init directives are failing and no logs are provided.

bitchecker commented 2 weeks ago

@meffie , @IamLunchbox any news on this?

IamLunchbox commented 2 weeks ago

Hi, I am sorry I was on vacation for the past weeks and had no time to check back into this issue.

If there is a problem with either Molecule or the proxmox_kvm module, we'd need to know their respective versions. Could you provide the output of molecule --version, the molecule-proxmox version (ideally from pip freeze), and the output of ansible-galaxy info community.general? And what is your Proxmox version?

I have not been able to reproduce your problem yet, and I don't have an immediate idea why your state check fails.

In addition, I am suspicious as to why you HAVE to assign a vmid for the new VM. This seems unusual.

bitchecker commented 2 weeks ago

for molecule:

molecule==24.8.0
molecule-proxmox==1.0.0

For Galaxy, community.general should not be necessary because everything is already available, but:

community.general:9.3.0

proxmox: 8.2.4

IamLunchbox commented 2 weeks ago

I tried to reproduce your problem with a clean virtualenv, your specific versions and only the packages molecule-proxmox, molecule, ansible, requests. I was not able to reproduce your issue.

I suspect the main issue is that Molecule does not pick up the molecule-proxmox create.yml playbook. The conditional check when: server.changed is never reached in my execution.

But there was a similar issue in my test run, which could be related. In one execution, Molecule somehow did not pick up the correct path for the molecule-proxmox library and tried to use an old, locally installed version instead. This was only fixed after closing the shell and reactivating the virtualenv. After that it worked reliably.

My successful execution logged: 1 plays in /home/ubuntu/Repos/testrepo/venv/lib/python3.10/site-packages/molecule_proxmox/playbooks/create.yml.

My failed execution logged: 1 plays in /home/ubuntu/.local/lib/python3.10/site-packages/molecule_proxmox/playbooks/create.yml - and it failed because that version did not have the pool-parameter yet (molecule-proxmox 0.9).

If you have not done so already, check with more verbosity (-vv) which paths Molecule identified for the create and destroy playbooks, and use a clean virtualenv for your tests.

Lastly, maybe try removing all extra config items from your Molecule config, just to make sure a misplaced dict item is not throwing something off, even though I don't expect this to be the case. With maximum verbosity (-vvvv) you'll see the config items Molecule created and used, e.g. to pass them on to proxmox_kvm. Maybe this will give you a hint about what's going on.

bitchecker commented 2 weeks ago

Hi, from what I can see, create.yml is completely skipped (with the default file):

PLAYBOOK: create.yml ***********************************************************
Positional arguments: /home/bitchecker/molecule/molecule/default/create.yml
verbosity: 4
connection: ssh
become_method: sudo
tags: ('all',)
skip_tags: ('notest', 'molecule-notest')
inventory: ('/dev/shm/bitchecker/molecule/molecule/default/inventory',)
forks: 50

PLAY [Create] ******************************************************************

TASK [Populate instance config dict] *******************************************
task path: /home/bitchecker/molecule/molecule/default/create.yml:13
skipping: [localhost] => {
    "changed": false,
    "false_condition": "server.changed | default(false) | bool",
    "skip_reason": "Conditional result was False"
}

TASK [Convert instance config dict to a list] **********************************
task path: /home/bitchecker/molecule/molecule/default/create.yml:24
skipping: [localhost] => {
    "changed": false,
    "false_condition": "server.changed | default(false) | bool",
    "skip_reason": "Conditional result was False"
}

TASK [Dump instance config] ****************************************************
task path: /home/bitchecker/molecule/molecule/default/create.yml:28
skipping: [localhost] => {
    "changed": false,
    "false_condition": "server.changed | default(false) | bool",
    "skip_reason": "Conditional result was False"
}

PLAY RECAP *********************************************************************
localhost                  : ok=0    changed=0    unreachable=0    failed=0    skipped=3    rescued=0    ignored=0

So, the main issue starts from here.

For converge.yml, it tries to connect to the name that you gave the platform in molecule.yml, so I can see this in my logs:

sending connection check: [b'ssh', b'-vvvv', b'-C', b'-o', b'ControlMaster=auto', b'-o', b'ControlPersist=60s', b'-o', b'StrictHostKeyChecking=no', b'-o', b'KbdInteractiveAuthentication=no', b'-o', b'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', b'-o', b'PasswordAuthentication=no', b'-o', b'ConnectTimeout=10', b'-o', b'ControlPath="/home/bitchecker/.ansible/cp/%h-%p-%r"', b'-O', b'check', b'molecule']

So if you're not on the same network as the Proxmox server, and you don't have a DNS that can resolve that name, you'll never reach that guest (which can't be reached anyway, because unless you update the create.yml file, it is never created).

IamLunchbox commented 2 weeks ago

Yes it is skipped - because server.changed is undefined. This variable is neither used nor set in molecule-proxmox.

Did you by chance copy the default create.yml from the Molecule project into the Molecule default scenario? The path you pasted implies that create.yml resides in that scenario.

You don't need to define create, prepare and destroy yourself. They are pulled from this project; you only need to provide a converge.yml and a molecule.yml in the molecule/default directory.
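
For illustration, a minimal converge.yml of the kind described above might look like the sketch below. The single ping task is a placeholder chosen for this example; a real scenario would normally apply the role under test instead.

```yaml
---
# molecule/default/converge.yml (minimal placeholder)
- name: Converge
  hosts: all
  tasks:
    - name: Confirm the instance is reachable
      ansible.builtin.ping:
```

With only this file and molecule.yml in molecule/default, the create, prepare and destroy playbooks should come from the driver rather than from the scenario directory.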

bitchecker commented 2 weeks ago

When I created the env, I created a new scenario, and I'm using the default create.yml and others. The only files that I changed were converge.yml and molecule.yml.

In one previous test, I also completely deleted all files and kept only those two.

I still think that if you're not on the same PVE network, using default ports and with guests on a bridge in the same network, none of this will ever work...so if your PVE is hosted (with NATed guests), you'll never be able to use this.

IamLunchbox commented 2 weeks ago

> In one previous test, I also completely deleted all files and kept only those two.

Good to know! I will test whether molecule-docker works with the default files generated by Molecule. If it does, this will probably stop this project from ever being merged into the molecule-plugins repo, and we should check what keeps the default config from running with molecule-proxmox.

> I still think that if you're not on the same PVE network, using default ports and with guests on a bridge in the same network, none of this will ever work...

This is incorrect. You don't need to be on the same network if you use this plugin. I always work through a VPN without hostname resolution. Molecule saves the associated IP addresses, not only the hostnames, into its cache file:

ubuntu@adm: cat /home/ubuntu/.cache/molecule/test/default/inventory/ansible_inventory.yml         
# Molecule managed

---
all:
  hosts:
    test: &id001
      ansible_host: 10.0.0.233
      ansible_port: 22
      ansible_private_key_file: redacted
      ansible_ssh_common_args: -o UserKnownHostsFile=/dev/null -o ControlMaster=auto
        -o ControlPersist=60s -o ForwardX11=no -o LogLevel=ERROR -o IdentitiesOnly=yes
        -o StrictHostKeyChecking=no
      ansible_user: test
      connection: ssh
  vars:
    molecule_ephemeral_directory: '{{ lookup(''env'', ''MOLECULE_EPHEMERAL_DIRECTORY'')
      }}'
    molecule_file: '{{ lookup(''env'', ''MOLECULE_FILE'') }}'
    molecule_instance_config: '{{ lookup(''env'', ''MOLECULE_INSTANCE_CONFIG'') }}'
    molecule_no_log: '{{ lookup(''env'', ''MOLECULE_NO_LOG'') or not molecule_yml.provisioner.log|default(False)
      | bool }}'
    molecule_scenario_directory: '{{ lookup(''env'', ''MOLECULE_SCENARIO_DIRECTORY'')
      }}'
    molecule_yml: '{{ lookup(''file'', molecule_file) | from_yaml }}'
ungrouped:
  hosts:
    test: *id001
  vars: {}

> so if your PVE is hosted (with NATed guests), you'll never be able to use this.

I think you can safely assume that nobody has tested the case you outlined yet. But if you think a different port keeps you from doing so with molecule-proxmox 1.0, that is not the case. Currently, you can set ansible_port, which applies the given port to ALL platforms of that scenario.

provisioner:
  name: ansible
  connection_options:
    ansible_port: 23

And because you said never: if you want an alternative port per platform, please create a PR. I am sure Michael is open to improvements, especially when the change is within the spec of the proxmox_kvm module.

IamLunchbox commented 2 weeks ago

I rechecked the behaviour with molecule init scenario test and the podman driver. It works if I just provide a molecule.yml and a converge.yml.

After initializing the scenario, which creates three default files (create, prepare, destroy), the same behaviour occurs as you described.

Therefore, I assume that skipping instance creation when using the default create.yml is not an error specific to molecule-proxmox.