Open 3f18db19-85d0-42b5-b58f-dbfbd8cbce51 opened 10 years ago
I'd again suggest using the __float80 type to solve the problem instead. The extra precision isn't needed, but the other option is some even uglier SSE code to handle it.
I'd suggest using SSE. The problem is the compiler's choice of floating-point width, and SSE solves that. 80-bit isn't reasonable, although float32 or float64 is an option.
http://cottonvibes.blogspot.com/2010/09/using-volatile-keyword-to-prevent-float.html claims that the volatile reload forces rounding to 32 bits. This seems fishy from my memory of x87, and osdev claims the opposite.
http://wiki.osdev.org/FPU says instead:
"Sending data to, and pulling data from the 8 FPU registers, ST(0) through ST(7), must be performed using system memory. It is not possible to directly copy values from a CPU register to an FPU register. The FPU can copy data from/to system memory in the following formats: 16-Bit Integer, 32-Bit Integer, 32-Bit Float (single), 64-Bit Float (double), and 80-bit Float (extended double). The FPU also supports reading and writing a 80-bit Binary Coded Decimal (BCD) format, which contains a single "sign" bit, 7 reserved bits, and 18 four-bit hexadecimal "characters".
When reading values from system memory, the extended double format is copied directly into the FPU register, while the other formats are converted to the 80-Bit extended double format before being stored in the FPU register. When writing values to system memory, the 80-bit value is copied directly when storing the extended double format, and is converted to the appropriate structure for the other formats. This conversion includes rounding the value based on the current rounding settings in the FPU Control Register. "
It seems that storing values to the stack in memory as 80-bit floats on x86 removes the rounding that otherwise occurs. A similar situation would exist on ARM and most other processors, so this may be a solution.
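To make the rounding-on-store behaviour in the quote concrete, here is a minimal C++ sketch. It only shows that converting a wide value to a narrower format rounds it; it does not inspect x87 state. The names storeAsSingle, storeAsDouble and ExtendedBits are made up for illustration, and the assumption that long double maps to the 80-bit x87 format holds for GCC/Clang on x86 but not for every compiler.

```cpp
#include <limits>

// Hypothetical helpers: model "store to memory in a narrower format".
// The conversion rounds to the destination's significand width under the
// current rounding mode (round-to-nearest-even by default).
inline float storeAsSingle(long double Wide) {
  return static_cast<float>(Wide);   // 24 significand bits
}

inline double storeAsDouble(long double Wide) {
  return static_cast<double>(Wide);  // 53 significand bits
}

// Significand width of long double. Assumption: 64 on x86 with GCC/Clang
// (x87 extended format); other compilers and targets may differ.
constexpr int ExtendedBits = std::numeric_limits<long double>::digits;
```

With these, 2^24 + 1 (16777217) survives a store as double but not as float, since it needs 25 significand bits and rounds to 16777216.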
assigned to @dexonsmith
Extended Description
In r206765 I added a volatile float to work around buildbot failures from hidden precision in x87.
This is a horrible hack and should be removed somehow, ideally by removing the use of float for spill weights entirely.
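For readers unfamiliar with the trick, the kind of workaround described above can be sketched roughly as follows. This is an illustration only, not the code from r206765; storeAsFloat and weightLess are hypothetical names, and the weight formula (frequency times scale) is an assumption made for the example.

```cpp
// Hypothetical helper: force a value through a 32-bit memory slot.
// The volatile qualifier prevents the compiler from keeping the value
// in a wider x87 register, so any excess precision is rounded away.
inline float storeAsFloat(float Value) {
  volatile float Slot = Value;  // real store to a 32-bit location
  return Slot;                  // reload the rounded value
}

// Hypothetical comparison of two weights. Without the forced store, an
// x87 build could compare one operand at 80-bit precision against another
// that had already been spilled and rounded, giving different results on
// different hosts.
inline bool weightLess(float FreqA, float ScaleA, float FreqB, float ScaleB) {
  float A = storeAsFloat(FreqA * ScaleA);
  float B = storeAsFloat(FreqB * ScaleB);
  return A < B;
}
```

The cost is an extra store and load per value, which is part of why it reads as a hack rather than a fix.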