cms-patatrack / pixeltrack-standalone

Standalone Patatrack pixel tracking
Apache License 2.0

[kokkos] serial backend performance fix #338

Closed markdewing closed 2 years ago

markdewing commented 2 years ago

The Kokkos serial backend is slower than the plain serial code - see #297 for more details.

One cause is addressed here: the Kokkos backend loops over the entire array (size 1024) rather than over the number of clusters (about 10). The loop also sets ok and newclusId to zero; both are assigned later, so these assignments can be removed.
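To illustrate the kind of change described above, here is a minimal plain C++ sketch. It is not the code from this PR: the struct `ClusterScratch`, the field `charge`, the constant `kMaxClusters`, and both function names are hypothetical, and Kokkos itself is left out so the snippet stays self-contained. It only contrasts iterating over the full capacity with iterating over the slots actually in use.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical capacity, chosen to match the array size of 1024 mentioned above.
constexpr int kMaxClusters = 1024;

// Hypothetical per-module scratch storage; only the names `ok` and
// `newclusId` come from the PR description, the layout is an assumption.
struct ClusterScratch {
  std::array<int32_t, kMaxClusters> charge{};
  std::array<uint8_t, kMaxClusters> ok{};
  std::array<uint16_t, kMaxClusters> newclusId{};
};

// "Before" pattern: visit every slot and zero fields that are assigned later.
// Returns the number of iterations so the two variants can be compared.
inline int resetAllSlots(ClusterScratch& s) {
  int iterations = 0;
  for (int i = 0; i < kMaxClusters; ++i) {
    s.charge[i] = 0;
    s.ok[i] = 0;         // redundant if overwritten before being read
    s.newclusId[i] = 0;  // redundant if overwritten before being read
    ++iterations;
  }
  return iterations;
}

// "After" pattern: visit only the first nclus slots and drop the redundant writes.
inline int resetUsedSlots(ClusterScratch& s, int nclus) {
  assert(nclus >= 0 && nclus <= kMaxClusters);
  int iterations = 0;
  for (int i = 0; i < nclus; ++i) {
    s.charge[i] = 0;
    ++iterations;
  }
  return iterations;
}
```

With about 10 clusters per module, the second variant does roughly 1% of the iterations of the first and a third of the stores per iteration. In a serial backend that work is not spread across threads, so the saving shows up directly in the run time.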

makortel commented 2 years ago

On Xeon Gold 5220 I'm seeing ~4 % improvement with one kokkos --serial process. I first saw an opposite effect on Cori, but found a mistake in my testing setup. Once I get consistent results there I'll merge this PR.

makortel commented 2 years ago

On Cori I see ~5 % improvement with kokkos --serial when running one process on an otherwise empty node.

[Plot: kokkos_serial_throughput]

The 1-thread case is now within 4 % of the serial program.

On a fully loaded socket the improvement is smaller but still clearly visible.

[Plot: fullsocket_kokkos_serial_throughput]