acts-project / traccc

Demonstrator tracking chain on accelerators
Mozilla Public License 2.0
29 stars 50 forks source link

traccc::sycl::full_chain_algorithm Fixes, main branch (2024.11.12.) #769

Closed krasznaa closed 2 weeks ago

krasznaa commented 2 weeks ago

As reported by @flg, the SYCL throughput applications have been crashing since some time. 😦

With SYCL it is paramount that the underlying ::sycl::queue object would be deleted after all objects have stopped using it. (Most importantly, after all allocated memory has been freed.) So now I made this super explicit.

This reminded me that the clusterization algorithm also manages some memory itself. So that algorithm needs to be very manually deleted as well. 🤔

At the same time did some general maintenance as well, to make the code just a little bit better.

flg commented 2 weeks ago

Hi Attila,

Since you've also done some general maintenance I have a small question/suggestion.

If I understand correctly a local sycl::queue is created in full_chain_algorithm constructor that is never used: https://github.com/acts-project/traccc/blob/29ca228cc30a213f9f283f8bfa211e41c297da96/examples/run/sycl/full_chain_algorithm.sycl#L89

If I'm not wrong it should be removed and more importantly the following should wait on the m_data->m_queue not this local one since m_copy will use m_data->m_queue: https://github.com/acts-project/traccc/blob/29ca228cc30a213f9f283f8bfa211e41c297da96/examples/run/sycl/full_chain_algorithm.sycl#L97

Or is this local queue really used for something?

krasznaa commented 2 weeks ago

The code could indeed be simplified. 🤔

In a separate PR I'll add unit tests that would run some of our example applications. As we should really test it in our CI that the applications work to some basic level at least... 🤔

krasznaa commented 2 weeks ago

Hi Attila,

Since you've also done some general maintenance I have a small question/suggestion.

If I understand correctly a local sycl::queue is created in full_chain_algorithm constructor that is never used:

https://github.com/acts-project/traccc/blob/29ca228cc30a213f9f283f8bfa211e41c297da96/examples/run/sycl/full_chain_algorithm.sycl#L89

If I'm not wrong it should be removed and more importantly the following should wait on the m_data->m_queue not this local one since m_copy will use m_data->m_queue:

https://github.com/acts-project/traccc/blob/29ca228cc30a213f9f283f8bfa211e41c297da96/examples/run/sycl/full_chain_algorithm.sycl#L97

Or is this local queue really used for something?

Yepp, I saw that too. Didn't search the history for how that code happened, but indeed it was doing some silly things. The PR now removes the erroneous lines.

sonarcloud[bot] commented 2 weeks ago

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud