stillwater-sc / universal

Large collection of number systems providing custom arithmetic and mixed-precision algorithms for AI, Machine Learning, Computer Vision, Signal Processing, CAE, EDA, control, optimization, estimation, and approximation.
MIT License

Switched to use posito when testing specialized posit<16,2> #409

Closed. davidmallasen closed this 2 months ago

davidmallasen commented 5 months ago

Following issue #343. Everything seems to be working fine, but I still get different results in my application. Maybe the problem is not in the basic operations but in some other detail.

Ravenwater commented 5 months ago

@davidmallasen I have the new API in branch v3.76

But as I was prepping for the merge, I had a new thought:
1. randoms are unlikely to catch corner cases
2. we tested the arithmetic operators exhaustively
3. the regression test shows float conversion failures

From those observations, it appears that our RCA of the behavioral differences between fast posit<16,2> and posito<16,2> should first resolve the float conversion failures.

Ravenwater commented 5 months ago

@davidmallasen the concept of an oracle as a reference type in the regression suites needs more thought. What is the 'right' Oracle type? Will such Oracle types depend on the nature of the type under test?

To research those questions, I will continue to operate under posito<> before jumping into a GoldenType template parameter.

Ravenwater commented 5 months ago

And the saga continues:

posit conversion validation: results only
posit<5,2>                                                   conversion PASS
posit<6,2>                                                   conversion PASS
posit<7,2>                                                   conversion PASS
posit<8,2>                                                   conversion FAIL 484 failed test cases
posito<8,2>                                                  conversion PASS
posit<9,2>                                                   conversion PASS
FAIL = 0.06251519918             did not convert to 0.06253051758             instead it yielded  0.0625                     raw 0b0.01.00.00000000000
FAIL = 0.999878943               did not convert to 0.9997558594              instead it yielded  1                          raw 0b0.10.00.00000000000
posit<16,2>                                                  conversion FAIL 2 failed test cases
FAIL = 0.06251519918             did not convert to 0.06253051758             instead it yielded  0.0625                     raw 0b0.01.00.00000000000
FAIL = 0.999878943               did not convert to 0.9997558594              instead it yielded  1                          raw 0b0.10.00.00000000000
posito<16,2>                                                 conversion FAIL 2 failed test cases
posit conversion validation: FAIL

Looks like there is a bad cast assumption going on in the stack.

davidmallasen commented 5 months ago

Hello @Ravenwater. For a generic Oracle type, I don't think the Oracle types should depend on the type under test. For this I would use something like MPFR, although I don't know how complicated it would be to include in Universal. However, in the context of the specialized posit, we do need some way to avoid what happened between posit<16,1> and posit<16,2>, so comparing the bit strings with the stable posit bitblock implementation would still be the way to go.

Thanks for your time on this, and great that you found where the problem is! I don't know why, but I didn't get those errors the previous time I ran the tests. That could explain what I saw, where the results were not exactly the same but differed slightly between the specialized and non-specialized versions. Could this be an error in https://github.com/stillwater-sc/universal/blob/60fe91ca69edcbbd4386542e80fbf62e623ad09e/include/universal/number/posit/specialized/posit_16_2.hpp#L696 ?

Ravenwater commented 5 months ago

@davidmallasen :-) Nothing to do with either rounding or casting in the posit implementation: it was a bug in the regression test:

posit conversion validation: results only
posit<5,2>                                                   conversion PASS
posit<6,2>                                                   conversion PASS
posit<7,2>                                                   conversion PASS
posit<8,2>                                                   conversion FAIL 484 failed test cases
posito<8,2>                                                  conversion PASS
posit<9,2>                                                   conversion PASS
posit<16,2>                                                  conversion PASS
posito<16,2>                                                 conversion PASS

The specialized posit<8,2> does use a different conversion algorithm for the sake of speed, and that still needs some TLC.

But that still leaves a hole in our theory of what to test to RCA the difference between posit<16,2> and the specialized posit<16,2> in your application.

Any other ideas? Can you describe the computational path you are seeing differences in?

Ravenwater commented 5 months ago

@davidmallasen and concerning an Oracle type, we are working on a fast Priest arbitrary precision type a la MPFR that could be utilized across Universal. One of the requirements of Universal is that it has no external dependencies, and thus linking in a static library like MPFR is a non-starter.

davidmallasen commented 5 months ago

@Ravenwater I'll have to try to pinpoint where the two operations diverge. When I get some time I'll try to narrow it down a bit so I can share where the discrepancy is. I imagined adding MPFR would be a no-go 👍🏽