oneapi-src / distributed-ranges

Distributed ranges is a generalization of C++ ranges for distributed data structures.

non-aligned inclusive_scan fails #121

Open lslusarczyk opened 1 year ago

rscohn2 commented 1 year ago

Disabled in CI with #57

lslusarczyk commented 1 year ago

Enabled in https://github.com/oneapi-src/distributed-ranges/pull/197. Fails on commit 8952468 when run on more than one GPU, with failures like:

Expected equality of these values:
  lv[i]
    Which is: 5545
  o[i]
    Which is: -15727121
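For context, a minimal host-only sketch of the kind of check that produces the output above. It assumes the test compares the scan result `o` against a serially computed reference `lv`. The helper names `segmented_inclusive_scan` and `first_mismatch` and the `segment_size` parameter are hypothetical and are not part of distributed-ranges. The segment loop only imitates how a multi-device scan might split the work and is not the library's implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical helper: inclusive scan computed one segment at a time,
// carrying the running total from each segment into the next.
std::vector<int> segmented_inclusive_scan(const std::vector<int>& in,
                                          std::size_t segment_size) {
  if (segment_size == 0) {
    throw std::invalid_argument("segment_size must be positive");
  }
  std::vector<int> out(in.size());
  int carry = 0;  // sum of all elements in earlier segments
  for (std::size_t begin = 0; begin < in.size(); begin += segment_size) {
    const std::size_t end = std::min(begin + segment_size, in.size());
    // Scan this segment, seeded with the carry from earlier segments.
    std::inclusive_scan(in.begin() + begin, in.begin() + end,
                        out.begin() + begin, std::plus<>{}, carry);
    carry = out[end - 1];
  }
  return out;
}

// Hypothetical helper: index of the first element where `actual` differs
// from the serial reference scan of `in`, or -1 if every element matches.
std::ptrdiff_t first_mismatch(const std::vector<int>& in,
                              const std::vector<int>& actual) {
  std::vector<int> reference(in.size());
  std::inclusive_scan(in.begin(), in.end(), reference.begin());
  const std::size_t n = std::min(reference.size(), actual.size());
  for (std::size_t i = 0; i < n; ++i) {
    if (reference[i] != actual[i]) {
      return static_cast<std::ptrdiff_t>(i);
    }
  }
  // A size difference counts as a mismatch at the first missing index.
  return reference.size() == actual.size() ? -1
                                           : static_cast<std::ptrdiff_t>(n);
}
```

Passing the output of `segmented_inclusive_scan` to `first_mismatch` together with the same input should report no mismatch for any positive segment size. A value such as the -15727121 above would be reported at the first index where it appears.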

To reproduce on ortc, run one of:

srun -p QZ1B-SPR-4oam-PVC build-icpx/test/gtest/shp/shp-tests
srun -p QZ1J-SPR-PVC-2C build-icpx/test/gtest/shp/shp-tests

BenBrock commented 1 year ago

Thanks for catching this. We did fix the problems with inclusive_scan throwing an exception. However, the non-aligned inclusive_scan tests still fail on PVC due to GSD-3893, so they should remain disabled until GSD-3893 is resolved.

I'm closing this issue now, since the original failures have been fixed with #193.

lslusarczyk commented 1 year ago

IMO we should not close issues in such cases. If our algorithm does not work on a platform we aim to support, then regardless of whether the root cause is in our code, I think the bug should stay open until we work around it, disable the non-working code (not only the test), or get a fix from the other component and confirm that our code works.

We could add a label to such issues, such as external-bug.