Closed giordano closed 1 year ago
Ok, after fixing the literals 1 and 4, I now get
$ ./run.sh
rm -f codelets.gp codelets.ll codelets.S pi
g++ -O3 -Wall --std=c++11 pi.cpp -o pi -lpoplar
Using HW device ID: 0
Calculating PI using:
4294966272 slices
1472 IPU tiles
Obtained value of PI: 0.146298
Time taken: 0.197444 seconds (262599924 cycles at 1.33 GHz)
The result is still way off because single precision is a poor fit for this algorithm at this many slices, but it is what it is. With fewer slices one gets a much more accurate result:
$ ./run.sh
rm -f codelets.gp codelets.ll codelets.S pi
g++ -O3 -Wall --std=c++11 pi.cpp -o pi -lpoplar
Using HW device ID: 0
Calculating PI using:
42948544 slices
1472 IPU tiles
Obtained value of PI: 3.1416
Time taken: 0.00197445 seconds (2626014 cycles at 1.33 GHz)
In fun news, I'm getting pi ~6.something with this and poplar 3.3.0...
[uccaoke@mandelbrot cpp_ipu_pi_dir]$ ./pi
Using HW device ID: 0
Calculating PI using:
42948544 slices
1472 IPU tiles
Obtained value of PI: 6.25758
Time taken: 0.141946 seconds (262599924 cycles at 1.85 GHz)
Wait, actually I'm an idiot.
Output:
but hold off on merging, because I clearly did something stupid in the kernel that computes pi.