oneapi-src / oneDPL

oneAPI DPC++ Library (oneDPL) https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/dpc-library.html
Apache License 2.0
720 stars 114 forks source link

Fix an error in the oneDPL `__future` implementation #1747

Closed SergeyKopienko closed 3 weeks ago

SergeyKopienko commented 1 month ago

In this PR we fix an error in the oneDPL __future implementation.

Let's take look to current implementation of __future::wait() function :

    wait()
    {
#if !ONEDPL_ALLOW_DEFERRED_WAITING
        __my_event.wait_and_throw();
#endif
    }

This mean that real call of wait_and_throw() will depends on the state of ONEDPL_ALLOW_DEFERRED_WAITING macro (what is this and for that - described at https://oneapi-src.github.io/oneDPL/macros.html).

Now let's take a look to implementation of __future::__wait_and_get_value() function :

    template <typename _T>
    constexpr auto
    __wait_and_get_value(_T& __val)
    {
        wait();
        return __val;
    }

So at first - we waits and then return value. But this mean that real call of wait_and_throw() will depends on the state of ONEDPL_ALLOW_DEFERRED_WAITING macro too and we may return value without real wait_and_throw() call.

julianmi commented 1 month ago

I'm wondering if this macro is still used by anyone, and if we still need / want to support it. It seems like the more appropriate venue for something like this is to use the async API.

I agree with this.

If we want to keep it, I think we should improve the documentation for ONEDPL_ALLOW_DEFERRED_WAITING in https://github.com/oneapi-src/oneDPL/blob/main/documentation/library_guide/macros.rst. It is currently very unspecific especially regarding the need for synchronization and potential data issues: "This macro allows waiting for completion of certain algorithms executed with device policies to be deferred. (Disabled by default.)"

akukanov commented 1 month ago

I'm wondering if this macro is still used by anyone, and if we still need / want to support it. It seems like the more appropriate venue for something like this is to use the async API.

@danhoeflinger, you are right of course. However, the async API is still experimental, and the macro is public so formally needs deprecation.

If we want to keep it, I think we should improve the documentation for ONEDPL_ALLOW_DEFERRED_WAITING in https://github.com/oneapi-src/oneDPL/blob/main/documentation/library_guide/macros.rst. It is currently very unspecific especially regarding the need for synchronization and potential data issues.

@julianmi yes, something might be added along the lines that callers are responsible for waiting for completion on the SYCL queue associated with the execution policy before accessing any data modified by deferred algorithms.

danhoeflinger commented 1 month ago

@danhoeflinger, you are right of course. However, the async API is still experimental, and the macro is public so formally needs deprecation.

Yes, you are right, it would require deprecation. Should we consider adding deprecation for the next release? In my opinion the experimental async API is already a much better supported and better documented version of what this macro provides.

akukanov commented 1 month ago

I think we should first have a plan for async API productization, then talk about deprecation.

I added to this patch some extra explanation of the macro in the documentation.

akukanov commented 4 weeks ago

After investigating a handful of these individual patterns, it makes me question if this is the right strategy. It should be said that we are not adding errors where things were working before. This is already broken when the macro is enabled. However, if we are looking at improving this, we should try to fix these sorts of errors and "traps" for future bugs.

@danhoeflinger @julianmi While this patch might still leave some problems unresolved, I think at this time it's an improvement - first, due to limiting the number of places where waiting can be deferred and making these visible and searchable in the code, and second, due to improving the documentation for the macro. Therefore I would still propose to commit it, maybe with adding some TODOs and/or an issue to revisit later.

danhoeflinger commented 3 weeks ago

@danhoeflinger @julianmi While this patch might still leave some problems unresolved, I think at this time it's an improvement - first, due to limiting the number of places where waiting can be deferred and making these visible and searchable in the code, and second, due to improving the documentation for the macro. Therefore I would still propose to commit it, maybe with adding some TODOs and/or an issue to revisit later.

OK, do we want to try to resolve issues with the individual usages like https://github.com/oneapi-src/oneDPL/pull/1747#discussion_r1710009584 in this PR, or leave those as TODO?
If we want to try to resolve them, I can keep looking in detail through the rest of the patterns.