openSUSE / orthos2

Orthos is a machine administration tool.
GNU General Public License v2.0

Do machine checks/tasks execute using Ansible or the SSH executor? #271

Open venera70 opened 2 weeks ago

venera70 commented 2 weeks ago

We are trying to get the Rescan All action working in Orthos2 and have already set up key-based SSH authentication to the target host.

However, looking at the log /var/log/orthos2/default.log, we don't see any messages about remote execution taking place, even with the logging level set to debug.

It is also a little unclear whether the SSH executor or the Ansible executor is being used.

We see that there are some shell scripts in the scripts dir that are called by machinecheck.py.

In the docs, there doesn't seem to be any mention of Ansible: https://orthos2.readthedocs.io/en/latest/adminguide/server_configuration.html#ssh-keys-paths

Kindly advise, thanks.

SchoolGuy commented 1 week ago

Tracing back through the code, we have the following call chain:

From there on, the asynchronous task queue (which is broken) picks up both tasks and executes them.

venera70 commented 4 days ago

Hi @SchoolGuy,

Thanks so much for the explanation.

Actually, running Rescan All does trigger the Ansible playbook; it just uses 'root' to log in to our Ubuntu bastion machine, which gets blocked.

[2024-10-25 18:58:07,445][models][WW]: Couldn't fetch location information for enclosure 'testbastion': <urlopen error [Errno -2] Name or service not known>
[2024-10-25 18:58:08,613][tasks][DD]: Thread [40e31583] MachineCheck:[["testbastion.example.com", 99], {}] started...
[2024-10-25 18:58:09,615][tasks][DD]: Thread [88a161d1] Ansible:[[["testbastion.example.com"]], {}] started...
[2024-10-25 18:58:09,619][tasks][DD]: Calling: /usr/bin/ansible-playbook -i /usr/lib/orthos2/ansible/inventory.yml /usr/lib/orthos2/ansible/site.yml - 127
[2024-10-25 18:58:09,619][tasks][DD]: ansible:  - /bin/sh: /usr/bin/ansible-playbook: No such file or directory
 - 127
[2024-10-25 18:58:09,619][tasks][WW]: Cannot scan machines ['testbastion.example.com'] via ansible, missing json file in /run/orthos2/ansible
[2024-10-25 18:58:11,622][tasks][DD]: Thread [88a161d1] Ansible exited
[2024-10-25 18:58:15,042][utils][WW]: SSH login failed for testbastion.example.com
[2024-10-25 18:58:15,629][tasks][DD]: Thread [40e31583] MachineCheck exited
[2024-10-25 19:09:49,077][models][WW]: Couldn't fetch location information for enclosure 'testbastion': <urlopen error [Errno -2] Name or service not known>
[2024-10-25 19:09:51,107][tasks][DD]: Thread [40e31583] MachineCheck:[["testbastion.example.com", 99], {}] started...
[2024-10-25 19:09:52,109][tasks][DD]: Thread [88a161d1] Ansible:[[["testbastion.example.com"]], {}] started...
[2024-10-25 19:09:52,868][tasks][DD]: Calling: /usr/bin/ansible-playbook -i /usr/lib/orthos2/ansible/inventory.yml /usr/lib/orthos2/ansible/site.yml - 4
[2024-10-25 19:09:52,868][tasks][DD]: ansible: 
PLAY [all] *********************************************************************

TASK [Gathering Facts] *********************************************************
fatal: [testbastion.example.com]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: Warning: Permanently added 'testbastion.example.com' (ED25519) to the list of known hosts.\r\nroot@testbastion.example.com: Permission denied (publickey,keyboard-interactive).", "unreachable": true}

PLAY RECAP *********************************************************************
testbastion.example.com  : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0   

 -  - 4
[2024-10-25 19:09:52,868][tasks][WW]: Cannot scan machines ['testbastion.example.com'] via ansible, missing json file in /run/orthos2/ansible
[2024-10-25 19:09:54,117][tasks][DD]: Thread [88a161d1] Ansible exited
[2024-10-25 19:09:57,142][utils][WW]: SSH login failed for testbastion.example.com
[2024-10-25 19:09:58,124][tasks][DD]: Thread [40e31583] MachineCheck exited
[2024-10-27 11:20:36,756][models][WW]: Couldn't fetch location information for enclosure 'devbox3': <urlopen error [Errno -2] Name or service not known>
[2024-10-27 11:20:48,239][models][WW]: Couldn't fetch location information for enclosure 'devbox3': <urlopen error [Errno -2] Name or service not known>
[2024-10-28 09:01:56,053][models][WW]: Couldn't fetch location information for enclosure 'testbastion': <urlopen error [Errno -2] Name or service not known>
[2024-10-28 09:02:07,727][models][WW]: Couldn't fetch location information for enclosure 'devbox3': <urlopen error [Errno -2] Name or service not known>
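For anyone sifting through a similar default.log, here is a minimal Python sketch that filters out just the warnings from the task and SSH code paths. It assumes the `[timestamp][module][level]: message` layout seen in the excerpt above and that `WW` marks a warning; the names `LOG_LINE`, `LogEntry`, `parse_line` and `warnings_from` are made up for illustration and are not part of Orthos2.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Matches lines such as:
# [2024-10-25 18:58:15,042][utils][WW]: SSH login failed for host
# The layout is inferred from the log excerpt, not from the Orthos2 source.
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]"
    r"\[(?P<module>[^\]]+)\]"
    r"\[(?P<level>[A-Z]{2})\]: (?P<message>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    module: str
    level: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Return a LogEntry, or None for continuation lines (e.g. Ansible output)."""
    match = LOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    return LogEntry(**match.groupdict())


def warnings_from(
    lines: Iterable[str], modules: tuple = ("tasks", "utils")
) -> Iterator[LogEntry]:
    """Yield entries with level 'WW' (assumed to mean warning) from the given modules."""
    for line in lines:
        entry = parse_line(line)
        if entry is not None and entry.level == "WW" and entry.module in modules:
            yield entry
```

Feeding it the lines of the log file (for example `warnings_from(open("/var/log/orthos2/default.log"))`) skips the repeated enclosure-location warnings from `models` and leaves only the scan-related ones.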

However, regarding machine reservation by deadline, we noticed that Orthos does not honour the expiry time: the machine is still listed as reserved by a user even after the deadline has elapsed. Is there any way to fix this?

SchoolGuy commented 2 days ago

@venera70 Allowing root login is a configuration issue on the target box. Without it, things will get tricky in other places, because tools such as lspci will not work without root privileges.

Orthos not respecting the end of the reservation is a bug that I see internally at SUSE as well, and I plan to work on it in the new year when I am back at work.
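To confirm that key-based root login works before triggering another rescan, a small non-interactive check along these lines may help. This is only a sketch: it shells out to the standard OpenSSH client with `BatchMode=yes` (so it fails instead of prompting for a password), the helper names are hypothetical, and it assumes you run it as the same system user and with the same SSH key that the Orthos server uses.

```python
import subprocess
from typing import List


def build_ssh_check_command(host: str, user: str = "root", timeout: int = 5) -> List[str]:
    """Build an ssh command that runs `true` remotely without any prompts."""
    return [
        "ssh",
        "-o", "BatchMode=yes",               # never ask for a password or passphrase
        "-o", f"ConnectTimeout={timeout}",   # give up quickly on unreachable hosts
        f"{user}@{host}",
        "true",
    ]


def can_login(host: str, user: str = "root") -> bool:
    """Return True if the non-interactive SSH login succeeds (exit status 0)."""
    result = subprocess.run(
        build_ssh_check_command(host, user),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

Calling `can_login("testbastion.example.com")` should return `False` for as long as the target rejects root with "Permission denied", and `True` once root login with the key is permitted.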

venera70 commented 1 day ago

Hi @SchoolGuy,

That would be really great, thanks!

Additionally, could we request that the reservation system support hourly granularity rather than only whole days?

We were also testing the orthos-cli and noticed that, when reserving a machine via the CLI, it does not accept a duration of fewer than 7 days:

./orthos2 
Welcome to Orthos2 Hardware Reservation System
(orthos 2.0.0:Me.Myself@example.com) query fqdn, architecture, reserved_until, reserved_by
reserved_by        reserved_by_email  
(orthos 2.0.0:Me.Myself@example.com) query fqdn, architecture, reserved_until, reserved_by
---------------------------------------------------------------------------------------------------
 FQDN                             Architecture  Reserved until    Reserved by
---------------------------------------------------------------------------------------------------
 testbox-1.machines.example.com   i386          2024-10-26 23:59  Me.Myself@example.com
 testbox-15.machines.example.com  i386          -                 -
 testbox-2.machines.example.com   i386          2024-10-31 23:59  Me.Myself@example.com
 testbox-3.machines.example.com   i386          -                 -
 testbox-4.machines.example.com   i386          -                 -
 testbox-6.machines.example.com   x86_64        -                 -
 testbox-9.machines.example.com   x86_64        -                 -
---------------------------------------------------------------------------------------------------
(orthos 2.0.0:Me.Myself@example.com) reserve testbox-4.machines.example.com
User for which you want to reserve> Me.Myself@example.com
Reason> testing
Duration (days)> 6
ERROR:        
* Reservation date must be in the future (min. 1 day). [Until]
(orthos 2.0.0:Me.Myself@example.com) reserve testbox-4.machines.example.com
User for which you want to reserve> Me.Myself@example.com
Reason> testing
Duration (days)> 7
OK.           
(orthos 2.0.0:Me.Myself@example.com) query fqdn, architecture, reserved_until, reserved_by where fqdn =~ testbox-4
--------------------------------------------------------------------------------------------------
 FQDN                            Architecture  Reserved until            Reserved by              
--------------------------------------------------------------------------------------------------
 testbox-4.machines.example.com  i386           2024-10-31 23:59          Me.Myself@example.com 
--------------------------------------------------------------------------------------------------
(orthos 2.0.0:Me.Myself@example.com)

It seems that 6 is being subtracted from the supplied value: we repeated this with Duration = 20 days, and the Reserved until date was 14 days in the future.
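The arithmetic we are seeing can be written down as a short Python sketch. It models only the observed behaviour, not the actual Orthos code: the constant `OBSERVED_OFFSET_DAYS` and the function `observed_reserved_until` are names invented here, and the date used for "today" in the usage note is an assumption, since the transcript does not show it.

```python
from datetime import date, timedelta

# Inferred from the two observations above (7 -> 1 day ahead, 20 -> 14 days ahead).
# This is a hypothesis about the symptom, not a value taken from the Orthos source.
OBSERVED_OFFSET_DAYS = 6


def observed_reserved_until(today: date, requested_days: int) -> date:
    """Return the 'Reserved until' date the CLI appears to produce."""
    return today + timedelta(days=requested_days - OBSERVED_OFFSET_DAYS)
```

Assuming the commands were run on 2024-10-30, `observed_reserved_until(date(2024, 10, 30), 7)` gives 2024-10-31, which matches the table. Under the same model, a duration of 6 would land on the current day, which might be why it is rejected with "Reservation date must be in the future (min. 1 day)".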