Closed longmaosen closed 3 years ago
What's the hardware you are using?
PC1 worker: CPU AMD EPYC 7H12; 2048 GiB RAM; lotus version 1.2.1
I have the same problem. It seems the PC1 worker does not respect the taskset-specified CPUs!
I have the same problem. EPYC 7F32. When only one process is running, it is pinned to cores 0-3 and progresses multi-core, but when six processes are running, all of them also land on cores 0-3. The same thing happens even when taskset affinity is assigned.
EPYC 7272, 512 GiB RAM. Just started experimenting with multicore yesterday. I see a 20% drop in time, from a little over 5 hours to 4 hours flat, for up to two PC1 tasks on the same worker. If I add another worker and add a PC1 task, the system slows down.
All workers were started using taskset to specify CPU affinity. PID 8080 uses three cores (0-2) that it was not set to use, causing a slowdown.
Looking at `hwloc-ps` output showed this:

```
7059 PU:12 PU:13 lotus-worker                                        // add piece worker
7100 PU:0 PU:1 PU:2 PU:3 PU:4 PU:5 lotus-worker                      // PC1 worker #1
7144 PU:14 PU:15 lotus-worker                                        // PC2 worker
8080 PU:0 PU:1 PU:2 PU:6 PU:7 PU:8 PU:9 PU:10 PU:11 lotus-worker    // PC1 worker #2
```
So running two PC1 tasks on the first worker runs smoothly; when a task is added to the second worker, it slows down because that worker then tries to use cores 0-2, which are already busy and should not even be assigned to that PID.
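For anyone debugging this, a quick way to cross-check what `hwloc-ps` reports is to read each worker's allowed CPU set directly from the kernel. This is a minimal, Linux-only sketch (not lotus code) using Python's `os.sched_getaffinity`; the PID list is hypothetical, so substitute the real `lotus-worker` PIDs.

```python
import os

# PIDs of the lotus-worker processes (hypothetical; substitute the real
# PIDs from `pgrep lotus-worker` or the hwloc-ps output above).
# We use our own PID here just so the sketch runs as-is.
worker_pids = [os.getpid()]

for pid in worker_pids:
    # sched_getaffinity returns the set of CPUs the process may run on,
    # i.e. the same mask that `taskset -cp <pid>` prints.
    cpus = sorted(os.sched_getaffinity(pid))
    print(f"PID {pid}: allowed CPUs {cpus}")
```

If a PID's allowed set includes cores you never passed to taskset, the binding happened after startup, inside the process itself.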
lotus version:

```
Daemon: 1.4.0+git.e9989d0e4+api1.0.0
Local: lotus version 1.4.0+git.e9989d0e4
```
Oops, seems like we needed more information for this issue, please comment with more details or this issue will be closed in 24 hours.
This issue was closed because it is missing author input.
I have 3 PC1 workers on the same machine, and I set FIL_PROOFS_USE_MULTICORE_SDR=1 and started them with:

```
taskset -c 0,1,2,3 lotus-worker run
taskset -c 4,5,6,7 lotus-worker run
taskset -c 8,9,10,11 lotus-worker run
```

I expected each worker to take its own cpuset (worker 1: 0,1,2,3; worker 2: 4,5,6,7; worker 3: 8,9,10,11) and each worker to seal one layer in 20 minutes, but the result is that all three workers bind to core group 0 (cpuset 0,1,2,3), and each layer takes 40 minutes on average.
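A plausible explanation of the symptom (my reading, not confirmed against the source): the core-group "checkout" bookkeeping lives inside each worker process, so it is not shared between processes. Every worker therefore sees all groups as free and grabs group 0. A toy model of that failure mode (this is illustrative Python, not lotus code):

```python
# Toy model of per-process core-group checkout. Each worker process keeps
# its own "claimed groups" set, so the bookkeeping is not shared.

GROUP_SIZE = 4
NUM_CORES = 128
groups = [list(range(i, i + GROUP_SIZE)) for i in range(0, NUM_CORES, GROUP_SIZE)]

def checkout_first_free(claimed):
    """Return the index of the first group not marked used in *this*
    process's bookkeeping (the `claimed` set)."""
    for idx in range(len(groups)):
        if idx not in claimed:
            claimed.add(idx)
            return idx

# Three separate worker processes -> three separate bookkeeping sets.
workers = [set(), set(), set()]
picks = [checkout_first_free(claimed) for claimed in workers]
print(picks)  # every process independently picks group 0: [0, 0, 0]
```

That matches the log below, where each worker reports "checked out core group 0" even though taskset gave them disjoint cpusets.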
```
2020-11-29T23:28:10.477 INFO storage_proofs_porep::stacked::vanilla::proof > replicate_phase1
2020-11-29T23:28:10.477 INFO storage_proofs_porep::stacked::vanilla::graph > using parent_cache[2048 / 1073741824]
2020-11-29T23:28:10.477 INFO storage_proofs_porep::stacked::vanilla::cache > parent cache: opening /data/cpfs/PROOFS_PARENT/v28-sdr-parent-21981246c370f9d76c7a77ab273d94bde0ceb4e938292334960bce05585dc117.cache, verify enabled: false
2020-11-29T23:28:10.477 INFO storage_proofs_porep::stacked::vanilla::proof > multi core replication
2020-11-29T23:28:10.477 INFO storage_proofs_porep::stacked::vanilla::create_label::multi > create labels
2020-11-29T23:28:10.542 DEBUG storage_proofs_porep::stacked::vanilla::cores > Cores: 128, Shared Caches: 32, cores per cache (group_size): 4
2020-11-29T23:28:10.542 DEBUG storage_proofs_porep::stacked::vanilla::cores > checked out core group 0
2020-11-29T23:28:10.542 DEBUG storage_proofs_porep::stacked::vanilla::create_label::multi > binding core in main thread
2020-11-29T23:28:10.542 DEBUG storage_proofs_porep::stacked::vanilla::cores > allowed cpuset: 0
2020-11-29T23:28:10.542 DEBUG storage_proofs_porep::stacked::vanilla::cores > binding to 0
2020-11-29T23:28:10.559 INFO storage_proofs_porep::stacked::vanilla::memory_handling > initializing cache
2020-11-29T23:28:57.189 INFO storage_proofs_porep::stacked::vanilla::create_label::multi > Layer 1
2020-11-29T23:28:57.190 INFO storage_proofs_porep::stacked::vanilla::create_label::multi > Creating labels for layer 1
```
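The `cores` DEBUG line above pins down the grouping: 128 cores sharing 32 caches gives group_size = 128 / 32 = 4, i.e. 32 candidate core groups, yet this process checks out group 0 (cores 0-3). A quick sketch of that arithmetic, assuming the contiguous group layout the log suggests:

```python
# Values taken from the DEBUG line:
# "Cores: 128, Shared Caches: 32, cores per cache (group_size): 4"
cores = 128
shared_caches = 32

group_size = cores // shared_caches   # cores that share one cache
num_groups = cores // group_size      # candidate core groups
group0 = list(range(group_size))      # cores behind the first shared cache

print(group_size, num_groups, group0)  # 4 32 [0, 1, 2, 3]
```

With 32 groups available, three workers should never have to overlap; all of them landing on group 0 points at the checkout logic, not at a shortage of cores.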