fabriziosalmi / proxmox-lxc-autoscale

Automatically scale LXC containers resources on Proxmox hosts
https://fabriziosalmi.github.io/proxmox-lxc-autoscale/
MIT License

Installation fails, status check fails #5

Closed profucius closed 1 month ago

profucius commented 2 months ago

Describe the bug: Something seems to be wrong when I try to install the script. I run:

curl -sSL https://raw.githubusercontent.com/fabriziosalmi/proxmox-lxc-autoscale/main/install.sh | bash

Which results in:

🎨 LXC AutoScale Installer
=============================
Welcome to the LXC AutoScale cleanup and installation script!
=============================

2024-09-26 15:54:17 [INFO] Creating backups...
2024-09-26 15:54:17 [INFO] Backed up /etc/lxc_autoscale/lxc_autoscale.yaml to /etc/lxc_autoscale/lxc_autoscale.yaml_backup_20240926155417
2024-09-26 15:54:17 [INFO] Deleting specified files and folders...
2024-09-26 15:54:17 [INFO] Deleted /etc/lxc_autoscale/lxc_autoscale.yaml
2024-09-26 15:54:17 [INFO] Prompting user for installation choice with a 5-second timeout...
Please choose an installation option:
1) ⚙️ LXC AutoScale (default)
2) ✨ LXC AutoScale ML (experimental)
You have 5 seconds to choose. If no choice is made, option 1 will be selected automatically.
2024-09-26 15:54:19 [INFO] User selected option 1.
You chose option 1.
2024-09-26 15:54:19 [INFO] Installing LXC AutoScale...
Failed to disable unit: Unit file lxc_autoscale_ml.service does not exist.
Failed to stop lxc_autoscale_ml.service: Unit lxc_autoscale_ml.service not loaded.
2024-09-26 15:54:21 [INFO] ✅ Service LXC AutoScale started successfully!
2024-09-26 15:54:21 [INFO] ✅ Installation process complete!

Then I run:

systemctl status lxc_autoscale.service

I get this response:

× lxc_autoscale.service - LXC AutoScale Daemon
     Loaded: loaded (/etc/systemd/system/lxc_autoscale.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Thu 2024-09-26 15:49:58 EDT; 3min 6s ago
   Duration: 68ms
       Docs: https://github.com/fabriziosalmi/proxmox-lxc-autoscale
    Process: 699 ExecStart=/usr/bin/python3 /usr/local/bin/lxc_autoscale/lxc_autoscale.py (code=exited, status=1/FA>
   Main PID: 699 (code=exited, status=1/FAILURE)
        CPU: 41ms

Sep 26 15:49:58 pve systemd[1]: Started lxc_autoscale.service - LXC AutoScale Daemon.
Sep 26 15:49:58 pve python3[699]: Traceback (most recent call last):
Sep 26 15:49:58 pve python3[699]:   File "/usr/local/bin/lxc_autoscale/lxc_autoscale.py", line 2, in <module>
Sep 26 15:49:58 pve python3[699]:     import paramiko
Sep 26 15:49:58 pve python3[699]: ModuleNotFoundError: No module named 'paramiko'
Sep 26 15:49:58 pve systemd[1]: lxc_autoscale.service: Main process exited, code=exited, status=1/FAILURE
Sep 26 15:49:58 pve systemd[1]: lxc_autoscale.service: Failed with result 'exit-code'.

I'm assuming that the script handles the creation and installation of all files, so I haven't done any manual file manipulation in an attempt to resolve this. What should I do next?

fabriziosalmi commented 2 months ago

Thank you for pointing that out. It seems the issue is caused by:

ModuleNotFoundError: No module named 'paramiko'

Can you try running pip install paramiko and then run the installer again? It's already included in requirements.txt, so it should work.

In any case, I will double-check this evening, and a fix will be released before the end of the weekend :)
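
To confirm whether the interpreter named in the unit's ExecStart line (/usr/bin/python3) can actually find paramiko, a quick check along these lines may help. This is only a sketch: the helper name missing_modules is hypothetical, and it only handles top-level module names.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names the current interpreter cannot locate."""
    # find_spec returns None when a top-level module is not importable.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example (run with the same interpreter as the service, i.e. /usr/bin/python3):
#   print(missing_modules(["paramiko"]))
# An empty list means every listed module was found.
```

Running it with /usr/bin/python3 rather than whatever python3 is first on PATH matters, because the service uses that exact interpreter.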

profucius commented 2 months ago

Hi, thanks for the reply. I have done as you instructed, and this is the log output. Note: I ran this from the Proxmox web UI, under Datacenter>PVE>Shell:

root@pve:~# pip install paramiko
Collecting paramiko
  Downloading paramiko-3.5.0-py3-none-any.whl (227 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 227.1/227.1 kB 3.6 MB/s eta 0:00:00
Collecting bcrypt>=3.2
  Downloading bcrypt-4.2.0-cp39-abi3-manylinux_2_28_x86_64.whl (273 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 273.8/273.8 kB 7.6 MB/s eta 0:00:00
Requirement already satisfied: cryptography>=3.3 in /usr/local/lib/python3.11/dist-packages (from paramiko) (43.0.0)
Collecting pynacl>=1.5
  Downloading PyNaCl-1.5.0-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (856 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 856.7/856.7 kB 11.9 MB/s eta 0:00:00
Requirement already satisfied: cffi>=1.12 in /usr/local/lib/python3.11/dist-packages (from cryptography>=3.3->paramiko) (1.17.0)
Requirement already satisfied: pycparser in /usr/local/lib/python3.11/dist-packages (from cffi>=1.12->cryptography>=3.3->paramiko) (2.22)
Installing collected packages: bcrypt, pynacl, paramiko
Successfully installed bcrypt-4.2.0 paramiko-3.5.0 pynacl-1.5.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

After this, I run the installer again, and I have the same issue as before.

fabriziosalmi commented 2 months ago

apt install python3-paramiko will fix the issue. The install.sh script has been updated🍻

Traxmaxx commented 1 month ago

@fabriziosalmi the splitting of the ML autoscaler reintroduced this issue again : / Shall I add a PR?

profucius commented 1 month ago

apt install python3-paramiko will fix the issue. The install.sh script has been updated🍻

I ran this and it installed the packages. I ran the updated installer script, and I am still having the same issue:

=============================
Welcome to the LXC AutoScale cleanup and installation script!
=============================

2024-09-29 11:55:04 [INFO] Creating backups...
2024-09-29 11:55:04 [INFO] Backed up /etc/lxc_autoscale/lxc_autoscale.yaml to /etc/lxc_autoscale/lxc_autoscale.yaml_backup_20240929115504
2024-09-29 11:55:04 [INFO] Deleting specified files and folders...
2024-09-29 11:55:04 [INFO] Deleted /etc/lxc_autoscale/lxc_autoscale.yaml
2024-09-29 11:55:04 [INFO] Deleted /var/log/lxc_autoscale.log
2024-09-29 11:55:04 [INFO] Deleted /var/lib/lxc_autoscale/backups
2024-09-29 11:55:04 [INFO] Installing LXC AutoScale...
Failed to disable unit: Unit file lxc_autoscale_ml.service does not exist.
Failed to stop lxc_autoscale_ml.service: Unit lxc_autoscale_ml.service not loaded.
2024-09-29 11:55:06 [INFO] ✅ Service LXC AutoScale started successfully!
2024-09-29 11:55:06 [INFO] ✅ Installation process complete!

fabriziosalmi commented 1 month ago

@fabriziosalmi the splitting of the ML autoscaler reintroduced this issue again : / Shall I add a PR?

As you can understand, I need some contributions 👯

fabriziosalmi commented 1 month ago

apt install python3-paramiko will fix the issue. The install.sh script has been updated🍻

I ran this and it installed the packages. I ran the updated installer script, and I am still having the same issue:

=============================
Welcome to the LXC AutoScale cleanup and installation script!
=============================
2024-09-29 11:55:06 [INFO] ✅ Service LXC AutoScale started successfully!
2024-09-29 11:55:06 [INFO] ✅ Installation process complete!

Can you check whether it is actually running? systemctl status lxc_autoscale
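
The same check can be scripted, for example at the end of an installer. The sketch below relies on systemctl is-active, which exits with status 0 only when the unit is active. The function name unit_is_active and its run parameter are hypothetical and exist only to keep the example testable; this is not how install.sh is known to work.

```python
import subprocess


def unit_is_active(unit, run=subprocess.run):
    """Return True if `systemctl is-active` reports the unit as active."""
    # --quiet suppresses the state text; only the exit status is used.
    result = run(["systemctl", "is-active", "--quiet", unit], check=False)
    return result.returncode == 0


# Example on the Proxmox host:
#   if not unit_is_active("lxc_autoscale.service"):
#       print("Service is not running; see: systemctl status lxc_autoscale")
```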

profucius commented 1 month ago

This is my log output from that command. Perhaps it is working after all? If so, perhaps the script (or the README) should tell the user to run that command to confirm the service is working when an error message appears?

root@pve:~# systemctl status lxc_autoscale
● lxc_autoscale.service - LXC AutoScale Daemon
     Loaded: loaded (/etc/systemd/system/lxc_autoscale.service; enabled; preset: enabled)
     Active: active (running) since Sun 2024-09-29 11:55:06 EDT; 2 days ago
       Docs: https://github.com/fabriziosalmi/proxmox-lxc-autoscale
   Main PID: 292529 (python3)
      Tasks: 1 (limit: 47871)
     Memory: 31.8M
        CPU: 1h 33min 1.155s
     CGroup: /system.slice/lxc_autoscale.service
             └─292529 /usr/bin/python3 /usr/local/bin/lxc_autoscale/lxc_autoscale.py

Oct 01 15:36:54 pve python3[292529]: 2024-10-01 15:36:54 - Starting resource allocation process...
Oct 01 15:36:54 pve python3[292529]: 2024-10-01 15:36:54 - Ignoring LXC Containers: set()
Oct 01 15:36:54 pve python3[292529]: 2024-10-01 15:36:54 - Initial resources before adjustments: 4 cores, 35862>
Oct 01 15:36:54 pve python3[292529]: 2024-10-01 15:36:54 - Current resource usage for all containers:
Oct 01 15:36:54 pve python3[292529]: 2024-10-01 15:36:54 - Container 107: CPU usage: 37.36%, Memory usage: 11.1>
Oct 01 15:36:54 pve python3[292529]: 2024-10-01 15:36:54 - Container 103: CPU usage: 37.23%, Memory usage: 32.9>
Oct 01 15:36:54 pve python3[292529]: 2024-10-01 15:36:54 - Container 100: CPU usage: 40.69%, Memory usage: 55.2>
Oct 01 15:36:54 pve python3[292529]: 2024-10-01 15:36:54 - Container 112: CPU usage: 52.66%, Memory usage: 16.9>
Oct 01 15:36:54 pve python3[292529]: 2024-10-01 15:36:54 - Final resources after adjustments: 4 cores, 35862 MB>
Oct 01 15:36:54 pve python3[292529]: 2024-10-01 15:36:54 - Resource allocation process completed. Next run in 6>

fabriziosalmi commented 1 month ago

On error, the service unit must fail :) You can check that in /var/log/lxc_autoscale.log, but it seems everything works as expected there :)
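
One way to scan that log for problems is sketched below. The marker strings are an assumption: this thread does not show how lxc_autoscale formats errors in its log, so the list would need adjusting. The function name find_error_lines is hypothetical.

```python
# Assumed markers; the real log may label errors differently.
ERROR_MARKERS = ("Traceback", "Error", "ERROR")


def find_error_lines(lines, markers=ERROR_MARKERS):
    """Return the lines (without trailing newlines) that contain any marker."""
    return [line.rstrip("\n") for line in lines if any(m in line for m in markers)]


# Example on the Proxmox host:
#   with open("/var/log/lxc_autoscale.log") as log:
#       for line in find_error_lines(log):
#           print(line)
```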

profucius commented 1 month ago

Thanks for your help. I recommend adding a note to README.md to bring attention to what we've discovered in this ticket: the error messages are a red herring, and the script may be working fine, which the user can confirm by running the status command.
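
The "Failed to disable unit" and "Failed to stop" lines in the installer output appear to come from an attempt to stop lxc_autoscale_ml.service when that unit was never installed. One way an installer could avoid printing them is to act only when the unit file exists. This is a sketch, not the project's actual logic: the helper names are hypothetical, and the directory is an assumption based on where the status output shows lxc_autoscale.service loaded from.

```python
from pathlib import Path
import subprocess

# Assumed location, taken from the "Loaded:" line for lxc_autoscale.service.
SYSTEMD_DIR = Path("/etc/systemd/system")


def unit_file_exists(unit, systemd_dir=SYSTEMD_DIR):
    """Return True if a unit file with this name exists in systemd_dir."""
    return (Path(systemd_dir) / unit).is_file()


def disable_if_present(unit, systemd_dir=SYSTEMD_DIR, run=subprocess.run):
    """Stop and disable the unit only if its file exists; return True if it acted."""
    if not unit_file_exists(unit, systemd_dir):
        return False  # nothing to do, so no error message is produced
    run(["systemctl", "disable", "--now", unit], check=False)
    return True


# Example: disable_if_present("lxc_autoscale_ml.service")
```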