libocca / occa

Portable and vendor neutral framework for parallel programming on heterogeneous platforms.
https://libocca.org
MIT License
382 stars 81 forks source link

Add a function for stream to wait for a tag #746

Closed deukhyun-cha closed 2 months ago

deukhyun-cha commented 3 months ago

This is to implement a feature described in Add support a stream to synchronize for a specific event #512.

deukhyun-cha commented 3 months ago

I would appreciate some help on testing especially for dpcpp and metal backends if possible.

codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 0% with 8 lines in your changes are missing coverage. Please review.

Project coverage is 74.94%. Comparing base (39908af) to head (c04baf6).

Additional details and impacted files [![Impacted file tree graph](https://app.codecov.io/gh/libocca/occa/pull/746/graphs/tree.svg?width=650&height=150&src=pr&token=doaG4c84wZ&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=libocca)](https://app.codecov.io/gh/libocca/occa/pull/746?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=libocca) ```diff @@ Coverage Diff @@ ## development #746 +/- ## =============================================== - Coverage 74.97% 74.94% -0.03% =============================================== Files 298 298 Lines 19487 19497 +10 =============================================== + Hits 14610 14613 +3 - Misses 4877 4884 +7 ``` | [Files](https://app.codecov.io/gh/libocca/occa/pull/746?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=libocca) | Coverage Δ | | |---|---|---| | [src/occa/internal/api/metal/commandQueue.hpp](https://app.codecov.io/gh/libocca/occa/pull/746?src=pr&el=tree&filepath=src%2Focca%2Finternal%2Fapi%2Fmetal%2FcommandQueue.hpp&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=libocca#diff-c3JjL29jY2EvaW50ZXJuYWwvYXBpL21ldGFsL2NvbW1hbmRRdWV1ZS5ocHA=) | `0.00% <ø> (ø)` | | | [src/occa/internal/api/metal/polyfill.cpp](https://app.codecov.io/gh/libocca/occa/pull/746?src=pr&el=tree&filepath=src%2Focca%2Finternal%2Fapi%2Fmetal%2Fpolyfill.cpp&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=libocca#diff-c3JjL29jY2EvaW50ZXJuYWwvYXBpL21ldGFsL3BvbHlmaWxsLmNwcA==) | `0.00% <0.00%> (ø)` | | | [src/occa/internal/modes/serial/stream.cpp](https://app.codecov.io/gh/libocca/occa/pull/746?src=pr&el=tree&filepath=src%2Focca%2Finternal%2Fmodes%2Fserial%2Fstream.cpp&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=libocca#diff-c3JjL29jY2EvaW50ZXJuYWwvbW9kZXMvc2VyaWFsL3N0cmVhbS5jcHA=) | `75.00% <0.00%> (-10.72%)` | :arrow_down: | | [src/c/base.cpp](https://app.codecov.io/gh/libocca/occa/pull/746?src=pr&el=tree&filepath=src%2Fc%2Fbase.cpp&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=libocca#diff-c3JjL2MvYmFzZS5jcHA=) | `84.00% <0.00%> (-1.72%)` | :arrow_down: | | [src/core/base.cpp](https://app.codecov.io/gh/libocca/occa/pull/746?src=pr&el=tree&filepath=src%2Fcore%2Fbase.cpp&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=libocca#diff-c3JjL2NvcmUvYmFzZS5jcHA=) | `71.27% <0.00%> (-1.55%)` | :arrow_down: | | [src/core/stream.cpp](https://app.codecov.io/gh/libocca/occa/pull/746?src=pr&el=tree&filepath=src%2Fcore%2Fstream.cpp&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=libocca#diff-c3JjL2NvcmUvc3RyZWFtLmNwcA==) | `77.04% <0.00%> (-2.62%)` | :arrow_down: | ... and [2 files with indirect coverage changes](https://app.codecov.io/gh/libocca/occa/pull/746/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=libocca)
deukhyun-cha commented 3 months ago

I would appreciate some help on testing especially for dpcpp and metal backends if possible.

I was able to run the example added here for serial, cuda, dpcpp, and openmp backends.

deukhyun-cha commented 3 months ago

I was going through this change and feeling the interface should be done through the device like other stream related functions. If so, should it also work with current stream on the device (instead of taking the stream as an argument)? I would appreciate advice.

kris-rowe commented 3 months ago

I was going through this change and feeling the interface should be done through the device like other stream related functions. If so, should it also work with current stream on the device (instead of taking the stream as an argument)? I would appreciate advice.

Since it is complete, I would leave the existing implementation in the PR as is.

There is already a waitFor function in device that waits on a given streamTag. Since streamTags are associated with the stream which was active when they were created, the existing implementation of device::waitFor is problematic in the case where the active stream has been changed by the user. To avoid this ambiguity, device::waitFor should be reimplemented later using the function added in this PR to synchronize the current stream. A separate API for waiting on a specific streamTag could be added to that class afterwards, if needed.

deukhyun-cha commented 3 months ago

I was going through this change and feeling the interface should be done through the device like other stream related functions. If so, should it also work with current stream on the device (instead of taking the stream as an argument)? I would appreciate advice.

Since it is complete, I would leave the existing implementation in the PR as is.

There is already a waitFor function in device that waits on a given streamTag. Since streamTags are associated with the stream which was active when they were created, the existing implementation of device::waitFor is problematic in the case where the active stream has been changed by the user. To avoid this ambiguity, device::waitFor should be reimplemented later using the function added in this PR to synchronize the current stream. A separate API for waiting on a specific streamTag could be added to that class afterwards, if needed.

Thanks @kris-rowe, appreciate your comment.

deukhyun-cha commented 3 months ago

I was able to build this for Metal backend and run stream examples (after updating for some fixes) on a Mac. The branch is updated.

deukhyun-cha commented 2 months ago

@kris-rowe just wanted to check if you have any general comments for me to consider on this PR before the detailed review?