tonyyxliu / CUHKSZ-CSC4005

Project Materials for CUHK(SZ) Course CSC4005: Parallel Programming
MIT License

[Proj 4] About the profiling #59

Closed: salixc closed this issue 12 months ago

salixc commented 12 months ago

As required by the project, I used "perf record" and "nsys" to profile the programs. However, the recorded data files for NN are quite large (429MB for sequential and 108MB for OpenACC), which makes them inconvenient to submit and download. Should we submit them directly, or is there another way? Thanks.

Here are the commands I used:

perf record -e cpu-cycles,cache-misses,page-faults -g -o ./xxx
nsys profile -t cuda,nvtx,osrt,openacc -o ./xxx

tonyyxliu commented 12 months ago

If the profiling files are too large, it is better to submit summarized information from perf stat and nsys instead. As for the .qdrep file, if you have installed Nsight Systems, you can include a screenshot from the software as proof that you have done the profiling.
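
For reference, a minimal sketch of how such summaries might be produced. This is one possible approach, not a required procedure. The program path, the input file names, and the output file names are hypothetical placeholders, and it assumes the installed Nsight Systems version provides the `nsys stats` subcommand. The small `run_if_available` helper is also just an illustration so the script does not abort on a machine without the tools.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own binary and profile files.
PROG="./your_program"
PERF_DATA="./perf.data"
NSYS_REPORT="./report.qdrep"

# Run a command only if its tool is installed; otherwise print a short note.
run_if_available() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@" || echo "failed: $*"
    else
        echo "skipped: $1 not found"
    fi
}

# Counter totals for the same events, written to a small text file (-o).
run_if_available perf stat -e cpu-cycles,cache-misses,page-faults \
    -o perf_stat.txt "$PROG"

# Plain-text summary of an existing "perf record" data file.
run_if_available perf report -i "$PERF_DATA" --stdio > perf_report.txt

# Default summary tables from an existing Nsight Systems report.
run_if_available nsys stats "$NSYS_REPORT" > nsys_stats.txt
```

The resulting text files are typically a few kilobytes, so they can be submitted in place of the raw recordings.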

Please do not directly submit such large files on BB, and thank you for your understanding!