Open Kyo-Choco opened 1 month ago
Hi,
Currently, the GPU version of Mess only supports integration with PAPI. I’ll do my best to guide you through the necessary steps and checks. I’ll also respond as quickly as possible to any follow-up questions.
Before running the benchmark with PAPI, work through the following steps:
Install CUDA version 12.2.
Install PAPI version 7.1.
Run the command papi_native_avail.
This command will output a list of all available hardware counters. Look for a section labeled:
===============================================================================
Native Events in Component: cuda
===============================================================================
Under this header, you’ll find the list of hardware counters supported by both your GPU and PAPI. Each GPU has unique names for these counters, so you'll need to review the list and identify counters related to DRAM (main memory) that measure bandwidth.
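If the list is long, a small filter can help narrow it down. The following is only a sketch: the function name is_cuda_dram_event is hypothetical, and the rule it applies (the line contains "cuda:::" and mentions "dram" in any letter case) is an assumption about how DRAM counters tend to be named, so it is still worth reading the full list by eye.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive substring search (portable stand-in for strcasestr). */
static const char *find_ci(const char *hay, const char *needle)
{
    size_t n = strlen(needle);
    if (n == 0) return hay;
    for (; *hay; ++hay) {
        size_t i = 0;
        while (i < n && hay[i] &&
               tolower((unsigned char)hay[i]) == tolower((unsigned char)needle[i]))
            ++i;
        if (i == n) return hay;
    }
    return NULL;
}

/* Hypothetical helper: returns 1 if a line of papi_native_avail output
   names a CUDA-component event that mentions DRAM, 0 otherwise.
   The "dram" substring rule is an assumption, not a PAPI guarantee. */
int is_cuda_dram_event(const char *line)
{
    assert(line != NULL);
    return strstr(line, "cuda:::") != NULL && find_ci(line, "dram") != NULL;
}
```

You could call this on each line of the saved papi_native_avail output and print only the lines for which it returns 1.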
Once you’ve found the relevant counters, update the counter names in your code. In the main.cu file, replace the existing counter names with those you identified. For example, if you’re using a K80 GPU, you would replace this code:
const char *EventName[] = {"cuda:::fbpa__dram_read_bytes.sum.per_second:device=0",
"cuda:::fbpa__dram_write_bytes.sum.per_second:device=0",
"cuda:::fbpa__dram_read_bytes.sum:device=0",
"cuda:::fbpa__dram_write_bytes.sum:device=0"};
with the following:
const char *EventName[] = {"cuda:::metric:gld_requested_throughput:device=0",
"cuda:::metric:gst_requested_throughput:device=0",
"cuda:::metric:dram_read_throughput:device=0",
"cuda:::metric:dram_write_throughput:device=0"};
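Both arrays hard-code the :device=0 suffix. If you need to target a different GPU, the suffix could be appended at run time instead. This is a minimal sketch, assuming the ":device=N" form shown above applies to your counters; make_device_event is a hypothetical helper and is not part of Mess or PAPI.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "<base>:device=<device>" into out.
   Returns 0 on success, -1 if base already carries a device suffix
   or if the result does not fit in out_size bytes. */
int make_device_event(char *out, size_t out_size, const char *base, int device)
{
    assert(out != NULL && base != NULL);
    if (strstr(base, ":device=") != NULL)
        return -1; /* avoid producing a doubled suffix */
    int n = snprintf(out, out_size, "%s:device=%d", base, device);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

The resulting string can then be used wherever one of the hard-coded names appears.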
Please let me know if you have any further questions!
Hello, thank you for your amazing work. I'm trying to run the GPU\NVIDIA\H100\src-mn5-h100\submit.bash script on an H100, but I've run into a problem with the following line in Main.cu:
validate(PAPI_event_name_to_code(EventName[i], &events[i]), PAPI_OK);
It returns:
-7 PAPI_ENOEVNT Hardware event does not exist
I'm using the 7.1.0 release, which is the version used in the project; I downloaded it from https://icl.utk.edu/projects/papi/ and compiled it myself. Since there is only limited information available about PAPI, I'd like to ask whether any other libraries need to be configured.
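One way to narrow down an error like this is to check every event name separately, so that it becomes clear which names are rejected rather than stopping at the first failure. The sketch below is an illustration only: probe_events, event_lookup_fn and fake_lookup are hypothetical names. The lookup is passed in as a callback so the logic can be tried without a GPU; in a real program the callback would presumably forward to PAPI_event_name_to_code after the library has been initialized.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Lookup callback: returns 0 on success, a negative error code otherwise.
   In real code this would wrap PAPI_event_name_to_code. */
typedef int (*event_lookup_fn)(const char *name, int *code);

/* Tries every name, stores each return value in status[i], and
   returns how many names failed to resolve. */
int probe_events(const char *names[], int n, event_lookup_fn lookup, int status[])
{
    assert(names != NULL && lookup != NULL && status != NULL);
    int failures = 0;
    for (int i = 0; i < n; ++i) {
        int code = 0;
        status[i] = lookup(names[i], &code);
        if (status[i] != 0)
            ++failures;
    }
    return failures;
}

/* Stand-in lookup for trying the logic without PAPI (an assumption for
   illustration): accepts only names containing "dram" and otherwise
   returns -7, the PAPI_ENOEVNT value reported above. */
int fake_lookup(const char *name, int *code)
{
    if (strstr(name, "dram") == NULL)
        return -7;
    *code = 1;
    return 0;
}
```

Printing each name next to its status value shows which entries of EventName need to be replaced with names from the papi_native_avail list.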