accel-sim / accel-sim-framework

This is the top-level repository for the Accel-Sim framework.
https://accel-sim.github.io

Add cuda api tracer tool to accel-sim framework #162

Open William-An opened 1 year ago

William-An commented 1 year ago

Add CUDA runtime API tracer. Able to track:

  1. CUDA memory allocation call
    1. Record pointer allocation sizes
    2. Store device pointers as symbolic names and reference in later calls
  2. CUDA memcpy call
    1. Identify memcpy direction
    2. Identify host and device (symbolic name) pointers
    3. Identify copy size
    4. Dump memcpy data to file
  3. CUDA kernel launch call
    1. Identify kernel name and mangled ptx name
    2. Identify kernel function pointer
    3. Identify grid and block size
    4. Identify shared bytes and stream
    5. Identify arguments and argument sizes
      1. If a device pointer, log the symbolic name
      2. If a constant, log the value
      3. Currently uses a naive approach that identifies only arguments of basic data types (int, float, double, and their 1D pointers)
        1. Waiting on the next release of NVBit to resolve a bug on retrieving argument size info
        2. NVBit issue: https://github.com/NVlabs/NVBit/issues/80
  4. CUDA memory free call
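
The symbolic-name bookkeeping in items 1 and 4 could be sketched roughly as below. This is only an illustration, not the code in this PR: the class `DevicePtrRegistry`, its methods, and the `dptr_N` naming scheme are all hypothetical, and the real tracer may store this differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical helper: tracks live device allocations by base address.
class DevicePtrRegistry {
 public:
  // Record a new allocation and return its symbolic name, e.g. "dptr_0".
  std::string onMalloc(std::uint64_t base, std::size_t size) {
    assert(allocs_.count(base) == 0 && "base address is already tracked");
    std::string name = "dptr_" + std::to_string(next_id_++);
    allocs_[base] = Alloc{size, name};
    return name;
  }

  // Resolve an address to "name" or "name+offset". Returns an empty string
  // if the address is not inside any tracked allocation, in which case the
  // caller can treat the value as a plain constant.
  std::string resolve(std::uint64_t addr) const {
    auto it = allocs_.upper_bound(addr);  // first base strictly above addr
    if (it == allocs_.begin()) return "";
    --it;  // candidate allocation with base <= addr
    const std::uint64_t offset = addr - it->first;
    if (offset >= it->second.size) return "";
    if (offset == 0) return it->second.name;
    return it->second.name + "+" + std::to_string(offset);
  }

  // Forget an allocation on free; returns false if the pointer was not tracked.
  bool onFree(std::uint64_t base) { return allocs_.erase(base) > 0; }

 private:
  struct Alloc {
    std::size_t size;
    std::string name;
  };
  std::map<std::uint64_t, Alloc> allocs_;
  unsigned next_id_ = 0;
};
```

With something like this, a memcpy or kernel-launch handler would call `resolve` on each pointer-sized value and log the symbolic name when one comes back. Keeping the table ordered by base address also lets interior pointers such as `d_a + 16` resolve to a name plus an offset.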

JRPan commented 1 year ago

One thing I'm going to complain about is identifying data types. Is this the only way? It will instantly fail if an application uses some custom data type.

William-An commented 1 year ago

Yeah that is fair. I guess there are two ways to do this:

  1. Wait for NVBit to fix the issue so that we can get the size of each argument.
  2. Or we could ask the user to provide some sort of mapping between data types that go into the kernels and the corresponding sizes.

But ultimately we still have to check whether each argument is a pointer or not, which could only be confirmed by checking the function signature?
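
A rough sketch of option 2 combined with the signature check, assuming the demangled kernel signature is available as a string such as `vecAdd(float*, float*, int)`. The names `ArgInfo`, `classifyArgs`, and `user_sizes` are made up for illustration. The pointer test is deliberately simple (the type ends in `*`), so qualifiers written after the `*` would not be recognized.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct ArgInfo {
  std::string type;  // parameter type as written in the signature
  bool is_pointer;   // true if the type ends in '*'
  std::size_t size;  // bytes in the argument buffer; 0 if unknown
};

// Strip leading and trailing spaces.
static std::string trim(const std::string& s) {
  const auto b = s.find_first_not_of(' ');
  if (b == std::string::npos) return "";
  const auto e = s.find_last_not_of(' ');
  return s.substr(b, e - b + 1);
}

// Classify each parameter of a signature such as "vecAdd(float*, float*, int)".
// `user_sizes` is the user-provided mapping from type name to size in bytes.
std::vector<ArgInfo> classifyArgs(
    const std::string& signature,
    const std::map<std::string, std::size_t>& user_sizes) {
  std::vector<ArgInfo> args;
  const auto open = signature.find('(');
  const auto close = signature.rfind(')');
  if (open == std::string::npos || close == std::string::npos || close < open) {
    return args;
  }

  std::string cur;
  int depth = 0;  // nesting inside <...> or (...), so template commas are kept
  auto flush = [&]() {
    const std::string t = trim(cur);
    cur.clear();
    if (t.empty()) return;
    ArgInfo a{t, t.back() == '*', 0};
    if (a.is_pointer) {
      a.size = sizeof(void*);  // assumes host and device pointers are both 64-bit
    } else {
      const auto it = user_sizes.find(t);
      if (it != user_sizes.end()) a.size = it->second;
    }
    args.push_back(a);
  };
  for (std::size_t i = open + 1; i < close; ++i) {
    const char c = signature[i];
    if (c == '<' || c == '(') ++depth;
    if (c == '>' || c == ')') --depth;
    if (c == ',' && depth == 0) {
      flush();
    } else {
      cur += c;
    }
  }
  assert(depth == 0 && "unbalanced brackets in signature");
  flush();
  return args;
}
```

A size of 0 marks a type that is neither a pointer nor in the user's table, so the tracer can report exactly which type needs an entry.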

JRPan commented 1 year ago

Can you add an else block at the end and just assert(0) in it? So if the data type is not enumerated then it will just break and the user would know where to look. And add a comment there telling the user to add their own data types if they reach the assert(0).
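
Something along these lines, as a minimal sketch. The function name `argSizeForType` and the string-based type check are placeholders, not the tracer's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Return the size in bytes of a kernel argument given its type name.
// Only the basic types from the PR description are enumerated.
std::size_t argSizeForType(const std::string& type) {
  if (!type.empty() && type.back() == '*') {
    return sizeof(void*);  // any pointer; assumes 64-bit host and device pointers
  } else if (type == "int") {
    return sizeof(int);
  } else if (type == "float") {
    return sizeof(float);
  } else if (type == "double") {
    return sizeof(double);
  } else {
    // Unknown data type: add an `else if` branch above for your own type
    // (for example a custom struct) and return its size.
    assert(0 && "unhandled kernel argument type, add it to argSizeForType");
    return 0;  // only reached when asserts are compiled out (NDEBUG)
  }
}
```

The trailing `return 0` matters only in release builds, where `assert` is a no-op and the function would otherwise fall off the end without returning a value.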

William-An commented 4 months ago

For the data type identification, we can use the PTX parser in gpgpu-sim to parse it, though it is somewhat hacky. Update: or we can do this: https://github.com/NVlabs/NVBit/issues/80#issuecomment-2039554395
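
To illustrate the PTX route: parameter sizes can be read from the `.param` declarations in a kernel's `.entry` header. The sketch below uses a small regex as a stand-in and is not the gpgpu-sim PTX parser. It assumes the kernel's PTX text is already available as a string, the function name `ptxParamSizes` is hypothetical, and it handles only scalar `.b/.u/.s/.f` types and byte arrays (not, for example, `.f16x2`). Note that PTX declares a pointer as a plain `.u64`, so this gives sizes but does not by itself separate pointers from 64-bit integers.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Return the byte size of each parameter of the first .entry in `ptx`.
std::vector<std::size_t> ptxParamSizes(const std::string& ptx) {
  std::vector<std::size_t> sizes;
  const auto entry = ptx.find(".entry");
  if (entry == std::string::npos) return sizes;
  const auto open = ptx.find('(', entry);
  if (open == std::string::npos) return sizes;
  // A kernel without parameters has no list before its body.
  const auto body = ptx.find('{', entry);
  if (body != std::string::npos && body < open) return sizes;
  const auto close = ptx.find(')', open);
  if (close == std::string::npos) return sizes;
  const std::string params = ptx.substr(open + 1, close - open - 1);

  // Matches ".param .u64 name" and ".param .align 4 .b8 name[12]".
  // Group 1 is the element width in bits, group 2 the optional array length.
  static const std::regex param_re(
      R"re(\.param\s+(?:\.align\s+\d+\s+)?\.[busf](\d+)[^,\[]*(?:\[(\d+)\])?)re");
  for (std::sregex_iterator it(params.begin(), params.end(), param_re), end;
       it != end; ++it) {
    const std::size_t bits = std::stoul((*it)[1].str());
    assert(bits % 8 == 0 && "expected a whole-byte PTX type");
    std::size_t bytes = bits / 8;
    if ((*it)[2].matched) bytes *= std::stoul((*it)[2].str());
    sizes.push_back(bytes);
  }
  return sizes;
}
```

A struct passed by value shows up in PTX as an aligned `.b8` array, so its total size comes out of the array length without the tracer needing to know the C++ type.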