IntelPython / dpctl

Python SYCL bindings and SYCL-based Python Array API library
https://intelpython.github.io/dpctl/
Apache License 2.0

Fix for crash reported in gh-1654 #1676

Status: Closed (oleksandr-pavlyk closed 1 month ago)

oleksandr-pavlyk commented 1 month ago

Closes gh-1654

The crash was caused by an out-of-bounds access to a shared local memory accessor.

This also fixes a crash in dpt.sort on a CUDA device when sorting 256 floating-point elements.

github-actions[bot] commented 1 month ago

Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. :crossed_fingers:

github-actions[bot] commented 1 month ago

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_336 ran successfully. Passed: 888 Failed: 17 Skipped: 91

coveralls commented 1 month ago

Coverage: 87.948% (remained the same) when pulling 58dfb5c6fcbcc75b4b311e2294b1f891bc56d397 on fix-for-crash-gh-1654 into 8b42313b5d496fc1cf2fd603139aead8b1924643 on master.

oleksandr-pavlyk commented 1 month ago

The example that was crashing on a CUDA device:

import dpctl.tensor as dpt
x = dpt.arange(256, dtype="i4", device="cuda:gpu")
dpt.sort(x, descending=True)  # crashed before this fix
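
A reproducer like the one above can be turned into a small check that compares the sorted result against a reference computed in plain Python. The following is only a sketch, not the test added in this PR: the helper names `expected_sorted` and `check_sort` are hypothetical, and the check returns None when dpctl is not installed. It assumes that `device=None` selects the default device and that a filter string such as "cuda:gpu" can be passed instead.

```python
try:
    import dpctl.tensor as dpt
except ImportError:
    # dpctl is not installed; check_sort() will report None
    dpt = None


def expected_sorted(n, descending):
    """Reference result for sorting the values 0..n-1 (hypothetical helper)."""
    return sorted(range(n), reverse=descending)


def check_sort(n, dtype, descending=False, device=None):
    """Sort arange(n) with dpt.sort and compare against the reference.

    Returns True or False for the comparison, or None if dpctl is unavailable.
    This is an illustrative helper, not part of dpctl.
    """
    if dpt is None:
        return None
    x = dpt.arange(n, dtype=dtype, device=device)
    y = dpt.sort(x, descending=descending)
    # Copy the result to the host and compare element by element
    return dpt.asnumpy(y).tolist() == expected_sorted(n, descending)
```

Calling `check_sort(256, "i4", descending=True, device="cuda:gpu")` would exercise the case from the example. Trying neighbouring sizes such as 255 and 257, and a floating-point dtype such as "f4", may help catch size-dependent indexing errors of the kind described in this PR.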