QuarkContainer / Quark

A secure container runtime with CRI/OCI interface
Apache License 2.0

implement /sys/devices/system/cpu #1298

Closed QuarkContainer closed 5 months ago

QuarkContainer commented 5 months ago

After implementing /sys/devices/system/cpu, mpirun still doesn't work.

shrik3 commented 5 months ago

mpirun detects only 1 CPU slot, so it can't run tasks with n>1 (without oversubscribing).

Here is a comparison of the native CPU topology and the one reported inside quark:

native

Machine#0
  Package#0
    L3#0(6144KB)
      L2#0(256KB)
        L1d#0(32KB)
          Core#0
            PU#0
      L2#1(256KB)
        L1d#1(32KB)
          Core#1
            PU#1
      L2#2(256KB)
        L1d#2(32KB)
          Core#2
            PU#2
      L2#3(256KB)
        L1d#3(32KB)
          Core#3
            PU#3
*** 1 package(s)
*** Logical processor 0 has 3 caches totaling 6432KB

quark (cpus=4)

*** Objects at level 0
Index 0: Machine
*** Objects at level 1
Index 0: Package
*** Objects at level 2
Index 0: Core
*** Objects at level 3
Index 0: PU
Index 1: PU
*** Printing overall tree
Machine#0
  Package#0
    Core#0
      PU#0
      PU#1
*** 1 package(s)
*** Logical processor 0 has 0 caches totaling 0KB

  1. permission issues (with quark):

    /project $ ls -lh /sys/devices/system/cpu/
    total 0      
    dr-xr-xr-x    1 root     root           0 May 28 19:10 cpu0
    dr-xr-xr-x    1 root     root           0 May 28 19:10 cpu1
    -r--------    0 root     root           0 May 28 19:10 online
    -r--------    0 root     root           0 May 28 19:10 possible
    -r--------    0 root     root           0 May 28 19:10 present

    hwloc reads /sys/devices/system/cpu/online but has no permission: the file is owner-read-only (-r--------, owned by root)

  2. core_cpus:

    /sys/devices/system/cpu/cpu1/topology $ cat core_cpus
    000003

    (it should be 1, also the format is wrong)

  3. topology: quark reports 2 cpus, with 4 cores per cpu. The native system reports 4 cpus, with 1 core per cpu.
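
For reference when reading items 1 and 2: the top-level files (`online`, `possible`, `present`) use the Linux "cpulist" text format, while the per-CPU topology files such as `core_cpus` use a hexadecimal bitmask. Below is a minimal Python sketch of decoding both, assuming the standard Linux sysfs formats. The function names are hypothetical and are not part of Quark or hwloc.

```python
def parse_cpu_list(text: str) -> list[int]:
    """Decode the sysfs 'cpulist' format, e.g. '0-3,8' -> [0, 1, 2, 3, 8]."""
    cpus: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def parse_cpu_mask(text: str) -> list[int]:
    """Decode the sysfs hex 'cpumask' format, e.g. '00000000,00000003' -> [0, 1]."""
    # Commas only separate 32-bit groups, most significant group first,
    # so dropping them leaves one hexadecimal number.
    value = int(text.strip().replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if value >> bit & 1]
```

Decoded this way, the `000003` value above means that CPUs 0 and 1 share a core, which matches the quark tree where PU#0 and PU#1 both sit under Core#0.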
QuarkContainer commented 5 months ago

@shrik3 I missed adding /sys/devices/system/cpu/cpu%d/topology/thread_siblings. After adding it, it works on my side.

So far, I have copied all the file values from my local host. Let's discuss how to implement it if it works on your side.
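
As one possible alternative to copying values from the host, the file contents could be generated for a layout where every vCPU is its own single-threaded core in one package. This is only a sketch under that assumption. The function name is hypothetical, the chosen file set is illustrative, and the real kernel zero-pads the masks to a fixed width.

```python
def one_thread_per_core_topology(num_cpus: int) -> dict[str, str]:
    """Return {path relative to /sys/devices/system/cpu: content}."""
    files: dict[str, str] = {}
    for cpu in range(num_cpus):
        # Hex bitmask with only this CPU's bit set: no sibling threads.
        mask = format(1 << cpu, "x")
        base = f"cpu{cpu}/topology"
        files[f"{base}/thread_siblings"] = mask + "\n"
        files[f"{base}/core_cpus"] = mask + "\n"
        files[f"{base}/core_id"] = f"{cpu}\n"
        files[f"{base}/physical_package_id"] = "0\n"
    # Top-level files use the cpulist format, e.g. "0-3".
    last = num_cpus - 1
    cpu_range = "0" if last == 0 else f"0-{last}"
    for name in ("online", "possible", "present"):
        files[name] = cpu_range + "\n"
    return files
```

With `num_cpus=4`, each `cpuN/topology/thread_siblings` holds a mask naming only CPU N, so a topology reader should see four cores with one PU each rather than one core with several PUs.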

shrik3 commented 5 months ago

> @shrik3 I missed adding /sys/devices/system/cpu/cpu%d/topology/thread_siblings. After adding it, it works on my side.
>
> So far, I have copied all the file values from my local host. Let's discuss how to implement it if it works on your side.

I'll test later.

btw, should we make the CPU reserved for the IO thread visible to userspace? If I understand correctly, no other tasks should be scheduled on that CPU.

QuarkContainer commented 5 months ago

Let's have a meeting to discuss the current vCPU thread allocation. Let me share more details with you.

QuarkContainer commented 5 months ago

@shrik3 I updated the PR to retrieve the CPU information from the host system, so the current CPU content should be correct. For the CPU count, let's discuss later.
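
The PR code is not shown in this thread, so the following is only an illustrative Python sketch of what "retrieve the CPU information from the host" can look like, plus a check for the permission problem reported earlier. Both function names and the `root` parameter are assumptions made for this example.

```python
import stat
from pathlib import Path


def snapshot_cpu_sysfs(root: str = "/sys/devices/system/cpu") -> dict[str, str]:
    """Read the top-level CPU lists and every cpuN/topology file under root."""
    base = Path(root)
    snapshot: dict[str, str] = {}
    candidates = [base / name for name in ("online", "possible", "present")]
    candidates += sorted(base.glob("cpu[0-9]*/topology/*"))
    for path in candidates:
        if not path.is_file():
            continue
        try:
            snapshot[str(path.relative_to(base))] = path.read_text().strip()
        except OSError:
            # Skip entries the current user cannot read.
            continue
    return snapshot


def not_world_readable(root: str = "/sys/devices/system/cpu") -> list[str]:
    """List top-level files whose mode lacks the 'other' read bit (e.g. 0400)."""
    base = Path(root)
    return sorted(
        entry.name
        for entry in base.iterdir()
        if entry.is_file() and not entry.stat().st_mode & stat.S_IROTH
    )
```

Running `not_world_readable()` inside the container would list files such as `online` if they are still exposed as `-r--------`, which is the condition under which an unprivileged hwloc cannot read them.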