NVIDIA / spark-rapids

Spark RAPIDS plugin - accelerate Apache Spark with GPUs
https://nvidia.github.io/spark-rapids
Apache License 2.0

[BUG] pmod for large negative decimal values can overflow. #6336

Open revans2 opened 2 years ago

revans2 commented 2 years ago

Describe the bug

pmod on decimal values with a precision of 38 can produce incorrect answers if the numbers are negative and large.

Steps/Code to reproduce bug

    a = Decimal('-9417536006095259414705321248.3563971038')
    b = Decimal('-9899024391274969668960277978.7286916957')

CPU answer is pmod(a, b) = Decimal('-9417536006095259414705321248.3563971038'), GPU answer is pmod(a, b) = Decimal('4812651903448647593711583537.3630406504').

I think this is because pmod for Spark is defined as

    val r = a % n
    if (r != null && r.compare(Decimal.ZERO) < 0) {(r + n) % n} else r

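So if r is negative, then the result is (r + n) % n. But the intermediate sum r + n can overflow: with the values above it needs 39 digits, one more than the maximum decimal precision of 38. Not sure if we should fall back to the CPU in cases where this is possible, or if there is a good way to fix this.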
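To make the overflow concrete, here is a rough Python sketch that works on the unscaled integer values (scale 10). It rests on two assumptions that are not confirmed anywhere in this issue: that the GPU stores the decimal in a 128-bit two's-complement integer that wraps silently, and that its remainder takes the sign of the dividend. All function names are made up for illustration. Under those assumptions the wrapped computation lands on the GPU value reported above. The last function shows one possible way to avoid the `r + n` addition when `n` is negative; it is not the plugin's fix.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60  # wide enough that Python never rounds these values
SCALE = 10              # both operands have 10 fractional digits


def to_unscaled(d: Decimal) -> int:
    """Decimal -> unscaled integer at SCALE fractional digits."""
    return int(d.scaleb(SCALE))


def to_decimal(u: int) -> Decimal:
    """Unscaled integer -> Decimal at SCALE fractional digits."""
    return Decimal(u).scaleb(-SCALE)


def trunc_rem(x: int, n: int) -> int:
    """Remainder that takes the sign of the dividend (JVM-style %)."""
    r = abs(x) % abs(n)
    return -r if x < 0 else r


def wrap_int128(x: int) -> int:
    """ASSUMPTION: silent 128-bit two's-complement wraparound."""
    x &= (1 << 128) - 1
    return x - (1 << 128) if x >= (1 << 127) else x


def pmod_unscaled(a: int, n: int, wrap: bool) -> int:
    """pmod as defined in the issue; optionally wrap the r + n sum."""
    r = trunc_rem(a, n)
    if r < 0:
        s = r + n
        if wrap:
            s = wrap_int128(s)
        return trunc_rem(s, n)
    return r


def pmod_no_add_overflow(a: int, n: int) -> int:
    """Hypothetical variant: only add n when it is positive.

    With |r| < |n| and r < 0: if n > 0 then 0 < r + n < n, so the sum
    cannot overflow and is already the answer; if n < 0 then
    (r + n) % n is just r, so no addition is needed.
    """
    r = trunc_rem(a, n)
    return r + n if (r < 0 and n > 0) else r


A = to_unscaled(Decimal('-9417536006095259414705321248.3563971038'))
B = to_unscaled(Decimal('-9899024391274969668960277978.7286916957'))

cpu_like = to_decimal(pmod_unscaled(A, B, wrap=False))
gpu_like = to_decimal(pmod_unscaled(A, B, wrap=True))
safe = to_decimal(pmod_no_add_overflow(A, B))
print(cpu_like, gpu_like, safe)
```

Without wraparound this gives -9417536006095259414705321248.3563971038 (the CPU answer), and with the assumed 128-bit wraparound it gives 4812651903448647593711583537.3630406504 (the GPU answer). The variant that skips the addition for negative `n` matches the CPU answer for this input.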

revans2 commented 2 years ago

Oh I forgot to add that for some reason our DecimalGen does not currently generate negative numbers. I tried to fix this, but ran into this error.

sameerz commented 2 years ago

Fall back to the CPU for the short term.

jlowe commented 2 years ago

We fall back to the CPU for max width decimals in #6398. Lowering priority as the plugin now produces correct values for pmod, albeit by falling back to the CPU for this corner case.

sameerz commented 1 year ago

Removing from 22.10