intel / llvm

Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.
Other
1.21k stars 725 forks source link

Windows urAdapterRelease() occurs before releaseDefaultContext() #14768

Open kbenzie opened 1 month ago

kbenzie commented 1 month ago

Affected tests:

kbenzie commented 1 month ago

This test is checking that the order of object releases happen in a specific order. However, on Windows, the destruction logic is different to Linux. There is not actually any change in destruction order in the PI2UR-main branch but there is a difference between piTearDown() (of an adapter) and urAdapterRelease() where the adapter ref count reaches zero. In the latter case the adapter library is actually unloaded, but with piTearDown() objects apparently still persist.

Here is the destruction trace with PI:

---> DLL_PROCESS_DETACH syclx.dll

---> piTearDown(
        <unknown> : 000002457CEF7D10
) --->  pi_result : PI_SUCCESS

---> piTearDown(
        <unknown> : 000002457CEF70B0
) --->  pi_result : PI_SUCCESS

---> piTearDown(
        <unknown> : 000002457CEF7C80
) --->  pi_result : PI_SUCCESS

P:\intel\llvm\.local\src\sycl\source\detail\global_handler.cpp:245: sycl::_V1::detail::EarlyShutdownHandler::~EarlyShutdownHandler
---> piContextRelease(
        <unknown> : 0000024502D42100
) ---> API Called After Plugin Teardown, Functon Call ignored.
---> piKernelRelease(
        <unknown> : 0000024502DC1330
) ---> API Called After Plugin Teardown, Functon Call ignored.
---> piProgramRelease(
        <unknown> : 0000024502D881E0
) ---> API Called After Plugin Teardown, Functon Call ignored.
---> piDeviceRelease(
        <unknown> : 0000024502D467B0
) ---> API Called After Plugin Teardown, Functon Call ignored.
---> DLL_PROCESS_DETACH pi_level_zero.dll

---> DLL_PROCESS_DETACH pi_opencl.dll

---> DLL_PROCESS_DETACH pi_win_proxy_loader.dll

Note how piTearDown() is called before piContextRelease(), this is not what I would have expected.

This is the UR trace, as you can see when urAdapterRelease() is called that triggers the UR loader to unload those adapter DLL's. After this point, the SYCL RT sets a flag to track that the adapters have been unloaded and will not call into UR. This means the calls the default context does not get released and all the resulting objects, kernel, program, device, etc.

---> DLL_PROCESS_DETACH syclx.dll

---> urAdapterRelease(.hAdapter = 0000020A4C199EC0) -> UR_RESULT_SUCCESS;
---> urAdapterRelease(.hAdapter = 0000020A4C199F20) -> UR_RESULT_SUCCESS;
<LOADER>[INFO]: unloaded adapter 0x00007FFBA2290000
<LOADER>[INFO]: unloaded adapter 0x00007FFBF2070000

I believe this is a problem in the global destruction logic. The default context destruction should be occuring before the piTearDown()/urAdapterRelease(). What I'm not clear about is why the Windows path differs, so I am warey of alinging those as part of https://github.com/intel/llvm/pull/14145.