ComputationalRadiationPhysics / student_project_python_bindings

The student project investigates the performance and memory handling of Python bindings for CUDA C++ code created with pybind11.
GNU General Public License v3.0

Add hip support to mem_ref #28

Closed SimeonEhrig closed 2 years ago

SimeonEhrig commented 2 years ago

Similar to the ENABLE_CUDA define, there is an ENABLE_HIP define that is set when HIP is enabled.

SimeonEhrig commented 2 years ago

ping @afif-ishamsyah

SimeonEhrig commented 2 years ago

I profiled main.py with only the function example_context_copy enabled (example_manual_copy was commented out).

I noticed that hipMemcpy was called 7 times, but only 5 calls were expected (2 for the function open_sync and one for the function compute). Can you please find out what is causing the 2 extra memory copies? At the end of this comment, you will find instructions for tracing the application with rocprof.

This also gave me the idea to add new context manager functions. Can you please rename sync_open to sync_open_rw, create the functions sync_open_r (only reads data) and sync_open_w (only writes data), and use them in main.py to reduce the number of memory operations?
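To illustrate the idea, here is a minimal sketch of how the three context managers could differ in the number of copies they trigger. It does not use the project's real classes: FakeDeviceBuffer, to_host, to_device and the copies counter are hypothetical stand-ins, and plain Python lists replace device memory so the example runs without a GPU.

```python
from contextlib import contextmanager


class FakeDeviceBuffer:
    """Hypothetical stand-in for a device-memory object.

    It counts copies instead of calling hipMemcpy.
    """

    def __init__(self, size):
        self.host = [0.0] * size
        self.device = [0.0] * size
        self.copies = 0

    def to_device(self):
        # Models a host-to-device hipMemcpy.
        self.device = list(self.host)
        self.copies += 1

    def to_host(self):
        # Models a device-to-host hipMemcpy.
        self.host = list(self.device)
        self.copies += 1


@contextmanager
def sync_open_rw(buf):
    """Read and write: download before the block, upload after it (2 copies)."""
    buf.to_host()
    yield buf.host
    buf.to_device()


@contextmanager
def sync_open_r(buf):
    """Read only: download before the block, never upload (1 copy)."""
    buf.to_host()
    yield tuple(buf.host)  # immutable view, changes cannot leak back


@contextmanager
def sync_open_w(buf):
    """Write only: skip the download, upload after the block (1 copy)."""
    yield buf.host
    buf.to_device()


buf = FakeDeviceBuffer(4)

with sync_open_w(buf) as data:
    data[:] = [1.0, 2.0, 3.0, 4.0]

with sync_open_r(buf) as data:
    total = sum(data)

print(total, buf.copies)
```

In this sketch the write followed by the read costs 2 copies in total, whereas doing both steps with sync_open_rw would cost 4. The upload after the block is skipped if the block raises an exception; whether that is the desired behavior for the real bindings is an open design choice.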

Using rocprof

  1. Comment out the function example_manual_copy.
  2. Build your application.
  3. Go to the build folder and create the trace with rocprof --hip-trace -o binding_trace.csv python src/main.py

Check how many times hipMemcpy was executed.

Check when hipMemcpy was executed.
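Counting the calls can also be scripted. The following is only a rough sketch: the column names (Name, BeginNs) and the sample rows are made up for illustration and are not taken from a real rocprof run, so check the header of the CSV file that rocprof actually wrote and adjust the arguments accordingly.

```python
import csv
import io


def find_api_calls(csv_text, api_prefix="hipMemcpy",
                   name_column="Name", time_column="BeginNs"):
    """Return the start times of all rows whose API name starts with api_prefix.

    name_column and time_column are assumptions about the trace layout.
    Note that the prefix match also counts variants such as hipMemcpyAsync.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row.get(time_column)
        for row in reader
        if (row.get(name_column) or "").startswith(api_prefix)
    ]


# Hypothetical sample data, NOT real rocprof output.
sample = """Index,Name,BeginNs,EndNs
0,hipMalloc,100,200
1,hipMemcpy,300,400
2,hipLaunchKernel,500,600
3,hipMemcpy,700,800
"""

calls = find_api_calls(sample)
print(len(calls), calls)
```

For a real trace, read the file with open(...).read() and pass its content to find_api_calls. The length of the returned list answers how often hipMemcpy ran, and the listed start times show when.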

afif-ishamsyah commented 2 years ago

The 2 extra hipMemcpy calls come from the algo.initialize_array function. It creates 2 hip_mem objects, each containing an array of zeros. One hip_mem is used as the input object, the other as the output object. Creating a hip_mem object automatically runs a hipMemcpy. This happens in algo.hpp, line 166.