Open cpj18234088063 opened 2 years ago
Hi, thanks for your request.

The problem is that `likwid-mpirun` does not know the topology of the remote nodes. It assumes that the local topology fits the remote nodes as well. The under-the-hood `likwid-perfctr` (or `likwid-pin`) calls are generated on the local node. Moreover, it resolves pinning expressions like `S0:0-9` and uses the resulting CPU list for further processing. So your request would be to resolve the CPU list internally but keep the original pinning strings for the under-the-hood calls. Do I get that right?

There might be problems later at evaluation time. `likwid-mpirun` reads the output files and relies on the calculations it has done beforehand (which node-hwthread pairs belong to which MPI rank). This result collection would have to be made more flexible, and all inputs required for evaluation would need to be provided by the output files.
Hello, when I use `likwid-mpirun` I encounter the following situation: I have two machines (A and B), each equipped with two CPUs, but machine A has 24 cores (12 × 2) and machine B has 20 cores (10 × 2). For symmetry, I want to run 20 threads on each of machines A and B. I use `S0:0-9@S1:0-9` to set the pinning, but `S0:0-9@S1:0-9` is parsed by `likwid-mpirun` into `-C 0,1,2,3,4,5,6,7,8,9,12,13,14,15,16,17,18,19,20,21`, which makes machine B report "CPU 20/21 not in domain n". In other words, `S0:0-9@S1:0-9` is parsed differently by `likwid-mpirun` and `likwid-perfctr`. Please consider making `likwid-mpirun` pass `S0:0-9@S1:0-9` through unresolved, for example at `table.insert(cmd, table.concat(cpuexprs[i], ","))` in the likwid-mpirun file.
Thanks to the author for providing such a great tool.
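To illustrate the mismatch described above, here is a minimal sketch (hypothetical code, not from LIKWID) of how a socket-relative pinning expression like `S0:0-9@S1:0-9` expands into a flat CPU list when resolved against a single node's topology. The function name `expand_pin_expr` and the assumption of consecutive core numbering per socket are mine, purely for illustration:

```python
# Hypothetical sketch: expand "S<socket>:<lo>-<hi>@..." into a flat list
# of hardware thread IDs, assuming consecutive per-socket numbering.
def expand_pin_expr(expr, cores_per_socket):
    cpus = []
    for part in expr.split("@"):
        domain, rng = part.split(":")
        socket = int(domain[1:])          # "S1" -> 1
        base = socket * cores_per_socket  # first CPU ID on that socket
        lo, hi = (int(x) for x in rng.split("-"))
        cpus.extend(base + i for i in range(lo, hi + 1))
    return cpus

# Resolved on machine A (12 cores per socket), socket 1 starts at CPU 12:
print(expand_pin_expr("S0:0-9@S1:0-9", 12))
# -> [0, ..., 9, 12, ..., 21]  (CPUs 20 and 21 do not exist on machine B)

# Resolved on machine B (10 cores per socket), socket 1 starts at CPU 10:
print(expand_pin_expr("S0:0-9@S1:0-9", 10))
# -> [0, 1, ..., 19]
```

This shows why a CPU list computed on machine A is invalid on machine B, and why forwarding the original expression to each node's `likwid-perfctr` would avoid the problem.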