Open venera70 opened 2 weeks ago
Tracking back the code, we have the following call chain:
- sidebar.html: We call the url macro with the arg frontend:rescan.
- urls.py (frontend): We land in the machine/rescan path, which is handled by views.rescan.
- views.py (frontend): We land in the method rescan, which calls Machine.scan().
- machine.py (data): This adds a MachineCheck task to the taskmanager queue and additionally (in case the arg is all) the Ansible task to the queue.

From there on, the asynchronous task queue (which is broken) picks up both of the tasks and executes them.
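To make the chain above easier to follow, here is a minimal, self-contained sketch of the flow from the view down to the queue. It is not the Orthos2 code: TaskManager, Task, and the function signatures are hypothetical stand-ins, and only the names rescan, Machine.scan, MachineCheck, Ansible, and the "all" argument come from the description above.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical stand-in for a queued task (name plus positional args)."""
    name: str
    args: tuple


class TaskManager:
    """Hypothetical stand-in for the taskmanager queue."""

    def __init__(self):
        self.queue = deque()

    def add(self, task):
        self.queue.append(task)


@dataclass
class Machine:
    fqdn: str

    def scan(self, action, taskmanager):
        # A MachineCheck task is always queued.
        taskmanager.add(Task("MachineCheck", (self.fqdn, action)))
        # The Ansible task is queued only when the arg is "all".
        if action == "all":
            taskmanager.add(Task("Ansible", ([self.fqdn],)))


def rescan(fqdn, action, taskmanager):
    """Stand-in for views.rescan: look up the machine and call scan()."""
    Machine(fqdn).scan(action, taskmanager)


if __name__ == "__main__":
    tm = TaskManager()
    rescan("testbastion.example.com", "all", tm)
    for task in tm.queue:
        print(task.name, task.args)
```

In this sketch nothing is executed; a separate worker would have to pick the tasks off the queue, which is the part reported as broken above.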
Hi @SchoolGuy,
Thanks so much for the explanation.
Actually, running RescanAll does trigger the Ansible playbook; it just uses 'root' to log in to our Ubuntu bastion machine, which gets blocked.
[2024-10-25 18:58:07,445][models][WW]: Couldn't fetch location information for enclosure 'testbastion': <urlopen error [Errno -2] Name or service not known>
[2024-10-25 18:58:08,613][tasks][DD]: Thread [40e31583] MachineCheck:[["testbastion.example.com", 99], {}] started...
[2024-10-25 18:58:09,615][tasks][DD]: Thread [88a161d1] Ansible:[[["testbastion.example.com"]], {}] started...
[2024-10-25 18:58:09,619][tasks][DD]: Calling: /usr/bin/ansible-playbook -i /usr/lib/orthos2/ansible/inventory.yml /usr/lib/orthos2/ansible/site.yml - 127
[2024-10-25 18:58:09,619][tasks][DD]: ansible: - /bin/sh: /usr/bin/ansible-playbook: No such file or directory
- 127
[2024-10-25 18:58:09,619][tasks][WW]: Cannot scan machines ['testbastion.example.com'] via ansible, missing json file in /run/orthos2/ansible
[2024-10-25 18:58:11,622][tasks][DD]: Thread [88a161d1] Ansible exited
[2024-10-25 18:58:15,042][utils][WW]: SSH login failed for testbastion.example.com
[2024-10-25 18:58:15,629][tasks][DD]: Thread [40e31583] MachineCheck exited
[2024-10-25 19:09:49,077][models][WW]: Couldn't fetch location information for enclosure 'testbastion': <urlopen error [Errno -2] Name or service not known>
[2024-10-25 19:09:51,107][tasks][DD]: Thread [40e31583] MachineCheck:[["testbastion.example.com", 99], {}] started...
[2024-10-25 19:09:52,109][tasks][DD]: Thread [88a161d1] Ansible:[[["testbastion.example.com"]], {}] started...
[2024-10-25 19:09:52,868][tasks][DD]: Calling: /usr/bin/ansible-playbook -i /usr/lib/orthos2/ansible/inventory.yml /usr/lib/orthos2/ansible/site.yml - 4
[2024-10-25 19:09:52,868][tasks][DD]: ansible:
PLAY [all] *********************************************************************
TASK [Gathering Facts] *********************************************************
fatal: [testbastion.example.com]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: Warning: Permanently added 'testbastion.example.com' (ED25519) to the list of known hosts.\r\nroot@testbastion.example.com: Permission denied (publickey,keyboard-interactive).", "unreachable": true}
PLAY RECAP *********************************************************************
testbastion.example.com : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
- - 4
[2024-10-25 19:09:52,868][tasks][WW]: Cannot scan machines ['testbastion.example.com'] via ansible, missing json file in /run/orthos2/ansible
[2024-10-25 19:09:54,117][tasks][DD]: Thread [88a161d1] Ansible exited
[2024-10-25 19:09:57,142][utils][WW]: SSH login failed for testbastion.example.com
[2024-10-25 19:09:58,124][tasks][DD]: Thread [40e31583] MachineCheck exited
[2024-10-27 11:20:36,756][models][WW]: Couldn't fetch location information for enclosure 'devbox3': <urlopen error [Errno -2] Name or service not known>
[2024-10-27 11:20:48,239][models][WW]: Couldn't fetch location information for enclosure 'devbox3': <urlopen error [Errno -2] Name or service not known>
[2024-10-28 09:01:56,053][models][WW]: Couldn't fetch location information for enclosure 'testbastion': <urlopen error [Errno -2] Name or service not known>
[2024-10-28 09:02:07,727][models][WW]: Couldn't fetch location information for enclosure 'devbox3': <urlopen error [Errno -2] Name or service not known>
However, as for the machine reservation by deadline, we noticed that Orthos does not honour the expiry time; the machine is still listed as reserved by a user even after the deadline has elapsed. Is there any way to fix this?
@venera70 Allowing root login is a configuration issue on the target box. Without it, things will get tricky in other places, because tools like lspci will not work without root privileges.
Orthos not respecting the end of the reservation is a bug that I see internally at SUSE as well and I am planning on working on it in the new year when I am back at work.
Hi @SchoolGuy,
That would be really great, thanks!
Additionally, could we request that the reservation system allow reservations at hourly granularity, rather than daily?
We were also testing out the orthos-cli, and we noticed that when reserving a machine via the CLI, it does not accept a duration of less than 7 days:
./orthos2
Welcome to Orthos2 Hardware Reservation System
(orthos 2.0.0:Me.Myself@example.com) query fqdn, architecture, reserved_until, reserved_by
reserved_by reserved_by_email
(orthos 2.0.0:Me.Myself@example.com) query fqdn, architecture, reserved_until, reserved_by
---------------------------------------------------------------------------------------------------
FQDN                             Architecture  Reserved until    Reserved by
---------------------------------------------------------------------------------------------------
testbox-1.machines.example.com   i386          2024-10-26 23:59  Me.Myself@example.com
testbox-15.machines.example.com  i386          -                 -
testbox-2.machines.example.com   i386          2024-10-31 23:59  Me.Myself@example.com
testbox-3.machines.example.com   i386          -                 -
testbox-4.machines.example.com   i386          -                 -
testbox-6.machines.example.com   x86_64        -                 -
testbox-9.machines.example.com   x86_64        -                 -
---------------------------------------------------------------------------------------------------
(orthos 2.0.0:Me.Myself@example.com) reserve testbox-4.machines.example.com
User for which you want to reserve> Me.Myself@example.com
Reason> testing
Duration (days)> 6
ERROR:
* Reservation date must be in the future (min. 1 day). [Until]
(orthos 2.0.0:Me.Myself@example.com) reserve testbox-4.machines.example.com
User for which you want to reserve> Me.Myself@example.com
Reason> testing
Duration (days)> 7
OK.
(orthos 2.0.0:Me.Myself@example.com) query fqdn, architecture, reserved_until, reserved_by where fqdn =~ testbox-4
--------------------------------------------------------------------------------------------------
FQDN                             Architecture  Reserved until    Reserved by
--------------------------------------------------------------------------------------------------
testbox-4.machines.example.com   i386          2024-10-31 23:59  Me.Myself@example.com
--------------------------------------------------------------------------------------------------
(orthos 2.0.0:Me.Myself@example.com)
It seems that 6 is being subtracted from the supplied value: we repeated this with Duration = 20 days, and the Reserved until date was 14 days in the future.
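The arithmetic behind that guess can be written down as a small model. This is only a sketch of the behaviour we observed from the outside, not of the Orthos2 implementation: the constant offset of 6 and the helper names are our assumptions, and the 1-day minimum is taken from the error message above.

```python
# Hypothetical model of the observed CLI behaviour, NOT the Orthos2 code.
OBSERVED_OFFSET_DAYS = 6   # assumption: derived from "20 days requested, 14 days reserved"
MIN_DAYS_AHEAD = 1         # from the error text: "must be in the future (min. 1 day)"


def effective_days(requested_days):
    """Days into the future the reservation appears to end up with."""
    return requested_days - OBSERVED_OFFSET_DAYS


def would_be_accepted(requested_days):
    """True if the effective duration still satisfies the 1-day minimum."""
    return effective_days(requested_days) >= MIN_DAYS_AHEAD


if __name__ == "__main__":
    for days in (6, 7, 20):
        print(days, effective_days(days), would_be_accepted(days))
```

Under this model a request for 6 days is rejected and 7 days is the smallest accepted value, which matches the session above.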
We are trying to get the Rescan All action going in Orthos2, having already set up key-based SSH auth to the target host.
However, looking at the logs in /var/log/orthos2/default.log, we don't see any messages about the remote execution taking place, even with the logging level at debug. Also, it is a little confusing whether the SSH executor or the Ansible executor is being used.
We see that there are some shell scripts in the scripts dir that are called by machinecheck.py.
In the docs, there doesn't seem to be any mention of Ansible: https://orthos2.readthedocs.io/en/latest/adminguide/server_configuration.html#ssh-keys-paths
Kindly advise, thanks.