Open ghostsea opened 10 months ago
I tried setting BAGUA_DIR to the PerFlow path, but running the Python script in example fails with:
python3 communication_pattern_analysis.py /usr1/PerFlow/build/example/comm_pattern_analysis/cg.B.x-64p-20231219-102348/static_data/cg.B.x.pag
0.50user 0.01system 0:00.10elapsed 489%CPU (0avgtext+0avgdata 80264maxresident)k
0inputs+744outputs (15major+17432minor)pagefaults 0swaps
original_GOMP_parallel = 0x7fcc94d268a0
SET sampling interval to 3100000 cycles
PAPI_add_events(EventSet, (int *)Events, NUM_EVENTS), ErrCode: Component containing event is disabled
PAPI_overflow(EventSet, PAPI_TOT_CYC, this->cyc_sample_count, 0, _papi_overflow_handler), ErrCode: Component Index isn't set
PAPI_start(EventSet), ErrCode: Component Index isn't set
srun: error: s_p_parse_file: unable to status file /etc/slurm-llnl/slurm.conf: No such file or directory, retrying in 1sec up to 60sec
What do I need to do here?
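I have already run cmake in the PerFlow directory and completed make in build. However, I have not found any instructions on how to run the examples and obtain analysis results. How should I proceed?
Also, the .sh file in the builtin tests contains the path GPERF_DIR=/mnt/home/jinyuyang/MY_PROJECT/BaguaTool/build/example/project/graph_perf. What is stored at this path?
The .py files in example contain proj_dir = os.environ['BAGUA_DIR']. What path is the BAGUA_DIR environment variable supposed to hold?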
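For context, a script that reads os.environ['BAGUA_DIR'] directly raises a bare KeyError when the variable is unset. Below is a minimal sketch of a more defensive lookup. The helper name resolve_proj_dir is hypothetical and not part of PerFlow, and nothing in this thread confirms which directory the variable is expected to point to.

```python
import os
from pathlib import Path


def resolve_proj_dir(env=None, var="BAGUA_DIR"):
    """Return the directory named by `var`, failing with a readable message.

    Hypothetical helper, not part of PerFlow: it only checks that the
    variable is set and that it names an existing directory.
    """
    env = os.environ if env is None else env
    value = env.get(var)
    if not value:
        raise RuntimeError(f"{var} is not set; export it before running the example")
    path = Path(value).expanduser()
    if not path.is_dir():
        raise RuntimeError(f"{var}={value!r} is not an existing directory")
    return path
```

Calling resolve_proj_dir() at the top of a script would surface a misconfigured environment before any analysis starts.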
Hello, and thanks for your interest:
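It looks like the static analysis succeeded here, but the PAPI events are not supported at runtime. Please check which events are available with papi_avail. You need to set the system parameter /proc/sys/kernel/perf_event_paranoid to 0 or 1; see https://ptools-perfapi.eecs.utk.narkive.com/zqC46WJG/make-test-failed-papi-tot-cyc-is-not-available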
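The current setting can be inspected before changing it. Below is a minimal sketch, assuming a Linux system that exposes /proc/sys/kernel/perf_event_paranoid. The helper names are hypothetical, and the threshold of 1 simply follows the advice above rather than anything documented by PerFlow.

```python
from pathlib import Path

PARANOID_PATH = "/proc/sys/kernel/perf_event_paranoid"


def read_paranoid_level(path=PARANOID_PATH):
    """Return the integer stored in the perf_event_paranoid file."""
    return int(Path(path).read_text().strip())


def allows_sampling(level, max_level=1):
    """True when the level is at or below the value suggested in this thread."""
    return level <= max_level
```

For example, allows_sampling(read_paranoid_level()) reports whether the current value is already 1 or lower. If it is not, the value can be changed as root, for instance with sudo sysctl -w kernel.perf_event_paranoid=1; a value set this way does not persist across reboots.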
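Thanks for the answer above. I have continued debugging in an environment where PAPI works.
However, when I run python3 pag_validation.py in the example directory, the static analysis appears to run normally, but the dynamic analysis fails with: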
Abort(1142789) on node 0: Fatal error in internal_Comm_size: Invalid communicator, error stack:
internal_Comm_size(30769): MPI_Comm_size(comm=0x0, size=0x564ce11c9760) failed
internal_Comm_size(30723): Invalid communicator
There are many groups of similar errors. At the end the output shows:
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 100756 RUNNING AT x
= EXIT CODE: 9
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
original_GOMP_parallel = 0x7f62d12508a0
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Killed (signal 9)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
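Because this step fails, none of the files under the dynamic analysis folder are generated, so I cannot continue. How should this problem be handled? My environment is an Ubuntu 20.04 virtual machine running under WSL on an x64 Windows platform.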
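Since many similar error groups are reported, it can help to check whether every failing call received the same communicator handle. Below is a minimal sketch that tallies the failed calls from a saved log. The function name summarize_failed_calls is hypothetical, the regular expression assumes the error-stack format shown above, and the tally does not by itself identify the cause.

```python
import re
from collections import Counter

# Matches lines such as:
#   internal_Comm_size(30769): MPI_Comm_size(comm=0x0, size=0x...) failed
FAILED_CALL = re.compile(r"(MPI_\w+)\(comm=(0x[0-9a-fA-F]+)")


def summarize_failed_calls(log_text):
    """Count (MPI function, communicator handle) pairs found in the log text."""
    return Counter(FAILED_CALL.findall(log_text))
```

If every entry shows the handle 0x0, as in the excerpt above, all failing calls were given the same null communicator value, which is a useful detail to include when reporting the problem.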
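Also, could you provide the steps for correctly running one sample from the repository after make? How do the test and example directories in the repository each work?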