oneapi-src / distributed-ranges

Distributed ranges is a generalization of C++ ranges for distributed data structures.

shp benchmarks in borealis failed #593

Open lslusarczyk opened 11 months ago

lslusarczyk commented 11 months ago

https://github.com/intel-sandbox/libraries.runtimes.hpc.dds.dr-ci/actions/runs/6637463788

Check it. If the newly enabled benchmarks revealed a problem, fix it if the fix is easy, or at least comment out the affected benchmark in shp with a comment pointing to this issue.

lslusarczyk commented 11 months ago

Analysing the failure. Currently shp-bench times out on Borealis, my devcloud account has expired, and a local run goes out of memory (it seems shp-bench ignores vector-size in some cases; fixing it...).

in progress...

lslusarczyk commented 10 months ago

The ExclusiveScan benchmark in shp fails. See: https://github.com/intel-sandbox/libraries.runtimes.hpc.dds.dr-ci/actions/runs/6703645628

Exact command:

ONEAPI_DEVICE_SELECTOR='level_zero:gpu;ext_oneapi_cuda:gpu' \
KMP_AFFINITY=compact shp/shp-bench --vector-size 2000000000 --reps 50\
 --benchmark_out_format=json --context device:GPU --context model:SHP --context runtime:SYCL\
 --context target:SHP_SYCL_GPU --v=3 --benchmark_out=dr-bench-adc021a8e9a64a6c86da243e79fcb338.json\
 --benchmark_filter=.*Sort_DR\|Gemm_DR\|^DotProduct_DR\|^Exclusive_Scan_DR\|^Inclusive_Scan_DR\|^Reduce_DR --num-devices 6

output with failure:

- LOG(2): Running Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time for 1
free(): invalid next size (fast)
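
For context: `free(): invalid next size (fast)` is glibc's allocator aborting because heap chunk metadata was corrupted, which most often means something wrote past the end of a heap buffer. The log does not show where that happens. As a baseline for checking the benchmark's result at small sizes, below is a minimal serial sketch of what an exclusive scan should produce. It uses `std::exclusive_scan` from the C++17 standard library, not the shp implementation, and the helper name `reference_exclusive_scan` is hypothetical.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper, not part of distributed-ranges.
// Serial reference exclusive scan:
//   out[0] = init, out[i] = init + in[0] + ... + in[i-1]
std::vector<int> reference_exclusive_scan(const std::vector<int>& in,
                                          int init = 0) {
  // The output is sized to match the input exactly, so the scan
  // never writes past the end of the buffer.
  std::vector<int> out(in.size());
  std::exclusive_scan(in.begin(), in.end(), out.begin(), init);
  return out;
}
```

For example, `reference_exclusive_scan({1, 2, 3, 4})` yields `{0, 1, 3, 6}`; comparing the shp result against this at a small vector size might help narrow down whether the output itself is wrong or only the buffer handling.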