diku-dk / futhark

:boom::computer::boom: A data-parallel functional programming language
http://futhark-lang.org
ISC License
2.36k stars 165 forks source link

Platform inconsistency in power calculation #2101

Closed Woogachaka closed 5 months ago

Woogachaka commented 5 months ago

This is probably less of a bug and more of a curiosity, but in trying to troubleshoot an issue in one of our projects, my colleagues and I found an inconsistency between the M1 Mac execution and on Ubuntu in WSL.

it can be exposed using the following code:

def linspace (start:f64) (end:f64) (numPoints:i64): [numPoints]f64 = 
    let dx = (end - start) / (f64.i64 (numPoints-1))
    in map (\v -> start + dx*(f64.i64 v)) (0..<numPoints)

let N = 11i64
let p = N*N :> i64

let diameter = 1.0f64
let tol = 1e-16f64
let tol2 = 0.0f64
let x = linspace (-0.5*diameter) (0.5*diameter) N |> replicate N |> flatten :> [p]f64
let y = linspace (-0.5*diameter) (0.5*diameter) N |> replicate N |> transpose |> flatten :> [p]f64
let xyf = unzip (filter(\(xi,yi) -> (xi**2+yi**2)**0.5 <= ((0.5 + tol)*diameter)) (zip x y))
let xyf2 = unzip (filter(\(xi,yi) -> (xi**2+yi**2)**0.5 <= ((0.5 + tol2)*diameter)) (zip x y))

let xyfSQ = unzip (filter(\(xi,yi) -> f64.sqrt (xi**2+yi**2) <= ((0.5 + tol)*diameter)) (zip x y))
let xyf2SQ = unzip (filter(\(xi,yi) -> f64.sqrt (xi**2+yi**2) <= ((0.5 + tol2)*diameter)) (zip x y))

let xf = xyf.0
let yf = xyf.1
let xf2 = xyf2.0
let yf2 = xyf2.1

let xfSQ = xyfSQ.0
let yfSQ = xyfSQ.1
let xf2SQ = xyf2SQ.0
let yf2SQ = xyf2SQ.1

-- on ubuntu the length of xf2 is 77  when all the other variants of XF are of length 81. On an M1 Mac, all variants of XF match at length 81 
athas commented 5 months ago

Are you sure it's a discrepancy in ** and not the other operations (which I guess would have to be the square root)? In any case, Futhark compiles those to pow in the platform C library, which I suppose can vary between platforms.

Woogachaka commented 5 months ago

I would have to assume its in ** as that is the only difference in calculations between xyf2 and xyf2SQ

athas commented 5 months ago

What happens if you say xi*xi+yi*yi instead of xi**2+yi**2, and use f64.sqrt instead of **0.5?

athas commented 5 months ago

I really think it's just a tiny variance in the precision of pow() across different implementation (and I don't know which one is most accurate). Computing

map(\(xi,yi) -> ((xi**2+yi**2)**0.5-((0.5 + tol2)*diameter))) (zip x y)

I see some numbers very close to zero, and it wouldn't take much nudging to push them over the line.

Woogachaka commented 5 months ago

I suspect you're correct in that. When testing your request, I see consistency with the other sets that seem to match. It appears that the failure case (the inconsistency with xf2) is an accumulated error/inconsistency in pow() in ubuntu 22.04

(venv) zerbeba1@TALOS:~/futhark-tracer/examples$ gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
athas commented 5 months ago

I get the same result on NixOS, so it's probably just how glibc is implemented. I also get that result in the Futhark interpreter, which uses Haskell's pow (but which is probably implemented via the C library). macOS uses a C library developed as part of the LLVM project. I have no reason to know which is more accurate.

Anyway, Futhark intentionally doesn't implement any of these numerical primitives itself, but just dispatches to whatever the platform does. That is overall a good idea, but sometimes this leads to small inconsistencies. You'll see even larger differences once you start running this on GPUs - they really push the allowable bounds on accuracy.

I'll close the issue as I don't think Futhark is doing anything wrong.