Closed by Ravenwater 6 years ago
Deeper analysis shows that the reference calculations need many more bits than IEEE floating-point hardware provides.
A posit<48,2> has at most 44 bits of fraction, including the hidden bit. A multiply will generate 88 fraction bits. Even an 80-bit x87 long double only returns 64 bits of fraction. So rounding errors will occur, and the test suite is potentially generating incorrect reference values.
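This analysis shows that even smaller posits, like posit<32,2>, might have these issues: with at most 28 bits of fraction including the hidden bit, a multiply generates 56 bits, which is more than the 53 bits a double provides.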
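The bit counts above can be checked with a small compile-time sketch. The helper names `max_significand_bits` and `product_bits` are made up for this illustration and are not part of the posit library; the sketch assumes the shortest posit encoding uses 1 sign bit, a 2-bit regime, and `es` exponent bits, and that `nbits` is larger than `es + 3`.

```cpp
#include <cstddef>

// Hypothetical helpers for illustration only; not part of the posit library.

// Maximum significand width of a posit<nbits, es>, counting the hidden bit.
// Assumes the shortest encoding: 1 sign bit, 2 regime bits, es exponent bits.
constexpr std::size_t max_significand_bits(std::size_t nbits, std::size_t es) {
    return nbits - 1 - 2 - es + 1;
}

// An exact product of two p-bit significands can need up to 2p bits.
constexpr std::size_t product_bits(std::size_t nbits, std::size_t es) {
    return 2 * max_significand_bits(nbits, es);
}

// Significand widths of the hardware formats discussed above.
constexpr std::size_t double_significand_bits = 53;  // IEEE-754 binary64
constexpr std::size_t x87_significand_bits    = 64;  // x87 80-bit extended

static_assert(max_significand_bits(48, 2) == 44, "posit<48,2> significand");
static_assert(product_bits(48, 2) == 88, "posit<48,2> exact product");
static_assert(product_bits(48, 2) > x87_significand_bits,
              "product does not fit an x87 long double");
static_assert(product_bits(32, 2) == 56, "posit<32,2> exact product");
static_assert(product_bits(32, 2) > double_significand_bits,
              "product does not fit a double");
```

If the assumptions hold, the file compiles cleanly, which confirms that neither a double nor an x87 long double can hold the exact product for these two configurations.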
The failures currently recorded are caused by the test bench not being able to compute at 90 bits of precision. Fixing this will require enhancing the test suite with a multiprecision reference float implementation.
The reference verification library will be split off from the functional library and will integrate an arbitrary-precision reference.
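As a rough sketch of what an exact reference multiply could look like, the significand product can be computed in integer arithmetic instead of hardware floating point. This is not the library's reference implementation: the type `U128` and the functions `mul_exact`, `bit`, and `significant_bits` are hypothetical names for this example, and only the significand product is modeled (no sign, scale, or rounding to a posit).

```cpp
#include <cstdint>

// Hypothetical 128-bit unsigned value built from two 64-bit limbs.
struct U128 {
    std::uint64_t hi;
    std::uint64_t lo;
};

// Exact 64x64 -> 128-bit product using 32-bit partial products,
// so no bits of the result are lost to rounding.
inline U128 mul_exact(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t mask = 0xFFFFFFFFull;
    const std::uint64_t a_lo = a & mask, a_hi = a >> 32;
    const std::uint64_t b_lo = b & mask, b_hi = b >> 32;

    const std::uint64_t p0 = a_lo * b_lo;
    const std::uint64_t p1 = a_lo * b_hi;
    const std::uint64_t p2 = a_hi * b_lo;
    const std::uint64_t p3 = a_hi * b_hi;

    // Sum the middle column; it cannot overflow 64 bits.
    const std::uint64_t mid = (p0 >> 32) + (p1 & mask) + (p2 & mask);

    U128 r;
    r.lo = (mid << 32) | (p0 & mask);
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}

// Returns bit i (0 = least significant) of v.
inline bool bit(const U128& v, int i) {
    return i < 64 ? ((v.lo >> i) & 1u) != 0 : ((v.hi >> (i - 64)) & 1u) != 0;
}

// Number of bits from the highest set bit down to the lowest set bit,
// i.e. the significand width needed to represent v exactly.
inline int significant_bits(const U128& v) {
    if (v.hi == 0 && v.lo == 0) return 0;
    int msb = 127;
    while (!bit(v, msb)) --msb;
    int lsb = 0;
    while (!bit(v, lsb)) ++lsb;
    return msb - lsb + 1;
}
```

For example, calling `mul_exact` on two all-ones 44-bit significands (`(1ull << 44) - 1`) yields a value for which `significant_bits` reports 88, more than the 64 bits an x87 long double can carry. A real multiprecision reference would also have to track sign and scale and round to the target posit.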
Likely related to the failures of 64bit_posit, but it is unclear where these failures come from, as the fraction size of a posit<48,2> is covered by a regular double.