spheenik / vfio-isolate

CPU and memory isolation for VFIO
MIT License

FileNotFoundError: [Errno 2] No such file or directory #8

Open ggeorgo opened 2 years ago

ggeorgo commented 2 years ago

I tried several different configurations that did not work, so I went back to the documentation to test the examples. Whatever I do, I get the error below.


FileNotFoundError: [Errno 2] No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/vfio-isolate", line 33, in <module>
    sys.exit(load_entry_point('vfio-isolate==0.4.0', 'console_scripts', 'vfio-isolate')())
  File "/usr/lib/python3.9/site-packages/vfio_isolate/cli.py", line 200, in run_cli
    executor.run()
  File "/usr/lib/python3.9/site-packages/vfio_isolate/cli.py", line 191, in run
    for undo in e.action.record_undo(e.params):
  File "/usr/lib/python3.9/site-packages/vfio_isolate/action/cpuset_delete.py", line 30, in record_undo
    cpus=cpu_set.get_cpus(),
  File "/usr/lib/python3.9/site-packages/vfio_isolate/cpuset.py", line 69, in get_cpus
    return self.impl.get_cpus(self)
  File "/usr/lib/python3.9/site-packages/vfio_isolate/cpuset.py", line 232, in get_cpus
    CGroupV2.ensure_cpuset_controller_enabled(cpuset)
  File "/usr/lib/python3.9/site-packages/vfio_isolate/cpuset.py", line 228, in ensure_cpuset_controller_enabled
    CGroupV2.enable_controller(cpuset, "cpuset")
  File "/usr/lib/python3.9/site-packages/vfio_isolate/cpuset.py", line 223, in enable_controller
    f.write(f"{prefix}{controller}")
FileNotFoundError: [Errno 2] No such file or directory

This happens when I execute (as root) either of these:

vfio-isolate -u /tmp/undo_description cpuset-create --cpus C1-4 /test.slice
vfio-isolate cpuset-delete /test.slice

vfio-isolate version: 0.4.0-1

inxi:
CPU: 12-Core AMD Ryzen 9 5900X (-MT MCP-) speed/min/max: 3863/2200/3700 MHz
Kernel: 5.15.2-2-MANJARO x86_64 Up: 4h 56m Mem: 40630.4/64439.8 MiB (63.1%)
Storage: 5.93 TiB (15.9% used) Procs: 514 Shell: Bash inxi: 3.3.09

spheenik commented 2 years ago

Hmm, my code wants to enable the cpuset controller for the root cgroup. Don't know why it fails. As root: can you check whether the file /sys/fs/cgroup/cgroup.subtree_control exists, and whether you can enable the controller by doing echo +cpuset > /sys/fs/cgroup/cgroup.subtree_control?

ggeorgo commented 2 years ago

Hi Martin,

Thank you for your prompt reply. The file is there, but I cannot enable it. See attached image.

Kind regards, George

spheenik commented 2 years ago

Can't see an image. But you say the file is there but you cannot enable it (the second command fails)? Can you check if the file is writeable?

ggeorgo commented 2 years ago

ls -l
-rw-r--r-- 1 root root 0 Nov 29 09:31 /sys/fs/cgroup/cgroup.subtree_control

spheenik commented 2 years ago

Can you post the output of

cat /sys/fs/cgroup/cgroup.controllers

Does it have cpuset in it?

ggeorgo commented 2 years ago

cat /sys/fs/cgroup/cgroup.controllers
cpuset cpu io memory hugetlb pids rdma

spheenik commented 2 years ago

Do a

cat /sys/fs/cgroup/cgroup.subtree_control

maybe it's already enabled?

ggeorgo commented 2 years ago

cat /sys/fs/cgroup/cgroup.subtree_control
cpuset cpu io memory pids

However, by this point I have already executed the script, since I now have a VM running, whereas before it was shut down.

spheenik commented 2 years ago

Yep. So it is already enabled. It might be a bug in my code: it tries to enable the controller when it is already enabled, which then causes this error. Will investigate.
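The manual checks requested in the exchange above (does the file exist, is it writable, is cpuset listed in cgroup.controllers, is it already in cgroup.subtree_control) could be collected into one small diagnostic. The following is a minimal sketch, not code from vfio-isolate; the function name diagnose_controller and the keys of the returned dict are hypothetical.

```python
import os
from pathlib import Path


def diagnose_controller(cgroup_dir, controller="cpuset"):
    """Report the state of one controller in a cgroup v2 directory.

    Hypothetical helper: it only reads files, it never changes anything.
    """
    cgroup = Path(cgroup_dir)
    control_file = cgroup / "cgroup.subtree_control"
    controllers_file = cgroup / "cgroup.controllers"

    def listed(path):
        # Both files contain a space-separated list of controller names.
        return path.exists() and controller in path.read_text().split()

    return {
        "control_file_exists": control_file.exists(),
        # os.access checks the permissions of the calling process.
        "control_file_writable": os.access(control_file, os.W_OK),
        "available": listed(controllers_file),
        "already_enabled": listed(control_file),
    }
```

Calling diagnose_controller("/sys/fs/cgroup") as root would answer all four questions at once. If "already_enabled" is true, no write to cgroup.subtree_control should be needed at all.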

drujd commented 2 years ago

I am having a similar, but not exactly the same issue:

Error starting domain: Hook script execution failed: internal error: Child process (LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin /etc/libvirt/hooks/qemu win11 prepare begin -) unexpected exit status 1: FileNotFoundError: [Errno 2] No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/vfio-isolate", line 33, in <module>
    sys.exit(load_entry_point('vfio-isolate==0.4.0', 'console_scripts', 'vfio-isolate')())
  File "/usr/lib/python3.10/site-packages/vfio_isolate/cli.py", line 200, in run_cli
    executor.run()
  File "/usr/lib/python3.10/site-packages/vfio_isolate/cli.py", line 191, in run
    for undo in e.action.record_undo(e.params):
  File "/usr/lib/python3.10/site-packages/vfio_isolate/action/cpuset_modify.py", line 39, in record_undo
    cpus=cpu_set.get_cpus(),
  File "/usr/lib/python3.10/site-packages/vfio_isolate/cpuset.py", line 69, in get_cpus
    return self.impl.get_cpus(self)
  File "/usr/lib/python3.10/site-packages/vfio_isolate/cpuset.py", line 232, in get_cpus
    CGroupV2.ensure_cpuset_controller_enabled(cpuset)
  File "/usr/lib/python3.10/site-packages/vfio_isolate/cpuset.py", line 228, in ensure_cpuset_controller_enabled
    CGroupV2.enable_controller(cpuset, "cpuset")
  File "/usr/lib/python3.10/site-packages/vfio_isolate/cpuset.py", line 222, in enable_controller
    with cpuset.open("cgroup.subtree_control", "w") as f:
FileNotFoundError: [Errno 2] No such file or directory

There is an easy workaround - just echo +cpuset > /sys/fs/cgroup/cgroup.subtree_control (or add it to tmpfiles.d). For some reason, the script is unable to do it itself. Then everything works.

ggeorgo commented 2 years ago

Hi Jan,

Thank you for your email. However, I tried this workaround earlier and it failed with "permission denied", both as the current user and as admin (sudo).

Kind regards, George

drujd commented 2 years ago

That's why I said similar, not the same. If you are unable to set +cpuset on the root cgroup manually as root, the problem is somewhere in your system (configuration).

My problem is caused by the fact that before +cpuset can be added to any modified cgroup.subtree_control, all parents need to have it set as well. The script does not ensure this: it would have to traverse the cgroup hierarchy up to the root cgroup and add +cpuset for every cgroup in the hierarchy, in top-down order.

Reverting +cpuset should probably also be part of the undo operation, but that's a minor issue.
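The top-down traversal described here could look roughly like the sketch below. It is not the vfio-isolate implementation; the function names hierarchy_top_down and enable_controller_top_down are hypothetical, and the cgroup v2 mount point is passed in explicitly rather than assumed.

```python
from pathlib import Path, PurePosixPath


def hierarchy_top_down(cgroup_root, cgroup_path):
    """Return the cgroup directories from the root down to cgroup_path."""
    dirs = [Path(cgroup_root)]
    for part in PurePosixPath(cgroup_path).parts:
        if part == "/":
            continue  # a leading slash just means "relative to the root cgroup"
        dirs.append(dirs[-1] / part)
    return dirs


def enable_controller_top_down(cgroup_root, cgroup_path, controller="cpuset"):
    """Enable a controller on every level, parents first.

    Returns the control files that were written, in order, so a caller
    could later revert them (for example as part of an undo step).
    Raises FileNotFoundError if a level has no cgroup.subtree_control.
    """
    written = []
    for directory in hierarchy_top_down(cgroup_root, cgroup_path):
        control = directory / "cgroup.subtree_control"
        if controller in control.read_text().split():
            continue  # already enabled at this level, nothing to do
        # On a real cgroup v2 mount the kernel interprets "+name" as a
        # request to enable the controller for the children of this cgroup.
        control.write_text(f"+{controller}")
        written.append(control)
    return written
```

For example, enable_controller_top_down("/sys/fs/cgroup", "/machine.slice") would handle the root cgroup before machine.slice. The path /machine.slice is only an illustration and does not come from this thread.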

ggeorgo commented 2 years ago

Hi Jan,

It seems that I was running the command as a regular user, not as root. I had to run "su" first and then the command, rather than "sudo ...". Now it worked. Do you need any more information?
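One plausible explanation, not confirmed in this thread: with "sudo echo +cpuset > file", the redirection is performed by the calling shell before sudo runs, so the file is opened with the unprivileged user's permissions. Doing the write inside a process that is itself root avoids that. A minimal sketch, with a hypothetical function name and an euid parameter added only so the check can be exercised without root:

```python
import os


def enable_controller_as_root(control_file, controller="cpuset", euid=None):
    """Write "+controller" to control_file, refusing to run unless root.

    euid defaults to the effective UID of the current process.
    """
    if euid is None:
        euid = os.geteuid()
    if euid != 0:
        raise PermissionError(
            f"effective UID is {euid}; writing {control_file} requires root"
        )
    # The file is opened by this process, so its own privileges apply.
    with open(control_file, "w") as f:
        f.write(f"+{controller}")
```

Run under sudo or from a root shell, the open call happens with root privileges, which is the difference between the "su" and "sudo ..." attempts described above, assuming that explanation holds.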

drujd commented 2 years ago

Assuming you are using cgroups v2 and a recent version of systemd: create the file /etc/systemd/system/dummy.slice (you'll need root privileges) with the following contents:

[Install]
WantedBy=libvirtd.service

[Slice]
AllowedCPUs=0

Then run (everything as root, yet again) systemctl daemon-reload. To enable it on reboots: systemctl enable dummy.slice. To apply it right now without restarting: systemctl start dummy.slice.

Adding this dummy slice (and ensuring it runs when libvirtd is started) should ensure that the root cgroup will always have cpuset in cgroup.subtree_control. As long as you are only modifying slices directly under this root cgroup (user.slice & system.slice), the script should work now.

(My previous simple solution with tmpfiles.d failed after VM shutdown because, it seems, cgroups are reset to their boot/systemd defaults when the last VM shuts down.)

VGrol commented 2 years ago

Jan, thanks for the fix, it solved things on my end as well. Though I'd like to know: does this also result in the following log entries in journalctl for you?

systemd[1]: user.slice: Failed to set 'cpuset.cpus' attribute on '/user.slice' to '': No space left on device
systemd[1]: user.slice: Failed to set 'cpuset.mems' attribute on '/user.slice' to '': No space left on device
systemd[1]: system.slice: Failed to set 'cpuset.cpus' attribute on '/system.slice' to '': No space left on device
systemd[1]: system.slice: Failed to set 'cpuset.mems' attribute on '/system.slice' to '': No space left on device

Kerobyte commented 2 years ago

I'm not sure if I should post here or open a new issue, but I also can't get these commands to run. /sys/fs/cgroup/cgroup.subtree_control, as mentioned in https://github.com/spheenik/vfio-isolate/issues/8#issuecomment-980540979, does not exist. I'm not using systemd.

$ sudo vfio-isolate -u /tmp/undo_description cpuset-create --cpus C1-4 /test.slice
Traceback (most recent call last):
  File "/usr/bin/vfio-isolate", line 33, in <module>
    sys.exit(load_entry_point('vfio-isolate==0.4.0', 'console_scripts', 'vfio-isolate')())
  File "/usr/lib/python3.10/site-packages/vfio_isolate/cli.py", line 200, in run_cli
    executor.run()
  File "/usr/lib/python3.10/site-packages/vfio_isolate/cli.py", line 193, in run
    e.action.execute(e.params)
  File "/usr/lib/python3.10/site-packages/vfio_isolate/action/cpuset_create.py", line 15, in execute
    cpu_set.create()
  File "/usr/lib/python3.10/site-packages/vfio_isolate/cpuset.py", line 54, in create
    self.set_cpus(self.parent().get_cpus())
  File "/usr/lib/python3.10/site-packages/vfio_isolate/cpuset.py", line 69, in get_cpus
    return self.impl.get_cpus(self)
  File "/usr/lib/python3.10/site-packages/vfio_isolate/cpuset.py", line 135, in get_cpus
    with cpuset.open("cpuset.cpus", "r") as f:
  File "/usr/lib/python3.10/site-packages/vfio_isolate/cpuset.py", line 37, in open
    return open(self.__path(file), mode)
FileNotFoundError: [Errno 2] No such file or directory: '/sys/fs/cgroup/openrc/cpuset.cpus'

Edit: I found it here: /sys/fs/cgroup/unified/cgroup.subtree_control
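The path in this edit suggests a layout in which the cgroup v2 hierarchy is mounted at /sys/fs/cgroup/unified rather than directly at /sys/fs/cgroup. Instead of hard-coding either location, the mount point could be looked up in the mount table. A minimal sketch, assuming the usual /proc/mounts format (device, mount point, filesystem type, options, ...); the function name find_cgroup2_mounts is hypothetical and this is not how vfio-isolate currently locates the hierarchy:

```python
def find_cgroup2_mounts(mounts_text):
    """Return the mount points of all cgroup2 filesystems.

    mounts_text is the content of a mount table such as /proc/mounts.
    """
    mount_points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 2 (zero-based) is the filesystem type, field 1 the mount point.
        if len(fields) >= 3 and fields[2] == "cgroup2":
            mount_points.append(fields[1])
    return mount_points
```

It could be used as find_cgroup2_mounts(open("/proc/mounts").read()); an empty result would mean no cgroup v2 hierarchy is mounted at all.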