oneapi-src / level-zero

oneAPI Level Zero Specification Headers and Loader
https://spec.oneapi.com/versions/latest/elements/l0/source/index.html
MIT License

Sysman API is not thread-safe when the handle is same #68

Open ywang82 opened 2 years ago

ywang82 commented 2 years ago

I just want to confirm that this is the expected API behavior: when a Sysman API function (e.g., zesMemoryGetState) is called concurrently with the same handle, the calls are not thread-safe.

Because the documentation for almost every Sysman API function (e.g., the zesMemoryGetState doc) says "The application may call this function from simultaneous threads.", I got the impression that all these functions are thread-safe in all cases. But in fact, when calling one from concurrent threads using the same handle, I get SIGABRT due to memory corruption. So I guess that "The application may call this function from simultaneous threads." requires different handles, as described in the introduction section of the Level Zero spec.

eero-t commented 2 years ago

For clarity, the documentation would need to state e.g. that "the application may call this function simultaneously from multiple threads using the same handle", or explicitly state that it is thread-safe, if it is supposed to be so.

Otherwise, it would be better if the introduction section of the spec explicitly stated how the "application may call this function from simultaneous threads" clauses elsewhere in the documentation are supposed to be interpreted.

If the API is supposed to be thread-safe, the corruption issue should be filed against the backend, which is not working according to the spec. In Intel's case, the Level Zero backend is implemented by compute-runtime: https://github.com/intel/compute-runtime/

ywang82 commented 2 years ago

@eero-t - yes, the corruption happened in the backend (i.e., libze_intel_gpu.so). I am just filing this issue to clarify the API doc on thread-safety and to confirm the right way to use these APIs in multi-threaded programs.
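
Until the spec wording is clarified, one conservative way to use these APIs from multiple threads is to serialize calls that share a handle. The following is only a minimal sketch under that assumption, not an official recommendation: `HandleLocks`, `call_serialized`, `FakeMemoryHandle`, `fake_get_state`, and `run_demo` are hypothetical names made up for illustration, and `fake_get_state` is a stand-in for a driver call such as zesMemoryGetState so that the example compiles without the Level Zero headers.

```cpp
#include <mutex>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Level Zero): hands out one mutex per
// handle value, so calls on the same handle are serialized while calls on
// different handles can still run in parallel.
class HandleLocks {
public:
    std::mutex& for_handle(const void* handle) {
        std::lock_guard<std::mutex> guard(table_mutex_);
        // operator[] default-constructs the mutex on first use. References
        // to unordered_map elements stay valid when the table rehashes.
        return locks_[handle];
    }

private:
    std::mutex table_mutex_;
    std::unordered_map<const void*, std::mutex> locks_;
};

// Runs fn(handle) while holding the lock that belongs to that handle.
// Handle is expected to be a pointer type, as Sysman handles are.
template <typename Handle, typename Fn>
auto call_serialized(HandleLocks& locks, Handle handle, Fn&& fn) {
    std::lock_guard<std::mutex> guard(locks.for_handle(handle));
    return std::forward<Fn>(fn)(handle);
}

// Stand-in for a backend object whose state is not protected internally.
struct FakeMemoryHandle {
    long calls = 0;
};

// Stand-in for a driver call that is unsafe on a shared handle:
// an unsynchronized read-modify-write of the handle's state.
void fake_get_state(FakeMemoryHandle* handle) {
    ++handle->calls;
}

// Several threads hit the same handle, but only through call_serialized.
long run_demo(int thread_count, int calls_per_thread) {
    HandleLocks locks;
    FakeMemoryHandle memory;
    std::vector<std::thread> workers;
    for (int t = 0; t < thread_count; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < calls_per_thread; ++i) {
                call_serialized(locks, &memory, fake_get_state);
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return memory.calls;
}
```

In a real program the lambda passed to `call_serialized` would wrap the actual Sysman call and its output struct. The per-handle lock keeps unrelated handles from contending with each other, at the cost of one short lookup under a global table lock per call.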