Non-normalized floats in replication

rrrlasse commented 11 years ago

Since I don't know how to write comments in a non-pullrequested branch, I'm writing it here.

inline char* Replication::encode_float(char* ptr, float value) { TIGHTDB_STATIC_ASSERT(std::numeric_limits::is_iec559 &&

^ floats and doubles can be represented in non-ieee format if they are non-normalized, so remove this assert.

kspangsege commented 11 years ago

I don't agree (or I don't understand what you mean).

Without the assert that you mention, the float representation could be anything. If two devices uses two different representations, they cannot exchange floats in their native representation. The assert is an easy (but rather restrictive) way of ensuring that the representation is the same on both sides.

On Mon, Apr 29, 2013 at 6:04 PM, Lasse Reinhold notifications@github.comwrote:

Since I don't know how to write comments in a non-pullrequested branch, I'm writing it here.

inline char* Replication::encode_float(char* ptr, float value) { TIGHTDB_STATIC_ASSERT(std::numeric_limits::is_iec559 &&

^ floats and doubles can be represented in non-ieee format if they are non-normalized, so remove this assert.

— Reply to this email directly or view it on GitHubhttps://github.com/Tightdb/tightdb/issues/87 .

rrrlasse commented 11 years ago

std::numeric_limits::is_iec559 is not reliable - we saw that when we investigated floats in mixed. It does not account for changing normalization mode during runtime, and it does not respect --fast-math flags given to GCC and does not even respect Intel C++ compilers' own default setting (i.e. no special compiler flags) which is to enable non-normalized math.

So I fail to see the point of that assert.. Also, how would it ensure anything if it doesn't perform any runtime check on 'value', if the float comes from the outside world, outside the library?

kspangsege commented 11 years ago

Strictly speaking, all the guarantees of the standrd are void as soon as you do something that has an undefined effect according to the standard.

In particular, changing the "hardware" representation of floats, or storing a float value using a foreign representation are both examples of something that has undefined behaviour. So in an application that does such things, the value of numeric_limits::is_iec559 is useless.

The reasoning that I did is as follows:

Most relevant platforms support IEEE. Most compilers/runtimes on those latforms set numeric_limits::is_iec559 to true. Most applications do not render the value of numeric_limits::is_iec559 unreliable.

The worst that could happen is that some platform sets numeric_limits::is_iec559 to false while still supporting IEEE. In that case our library will fail to compile for no good reason. If that ever happens, I think the best approach will be to detect that specific platform with #ifdef and provide a work-around.

I'm much less concerned about cases where people have deliberately violated the apparent float representation, because in those cases one would expect all sorts of issues.

On Tue, Apr 30, 2013 at 1:49 AM, Lasse Reinhold notifications@github.comwrote:

std::numeric_limits::is_iec559 is not reliable - we saw that when we investigated floats in mixed. It does not account for changing normalization mode during runtime, and it does not respect --fast-math flags given to GCC and does not even respect Intel C++ compilers' own default\ setting (i.e. no special compiler flags) which is to enable non-normalized math.

So I fail to see the point of that assert.. Also, how would it ensure anything if it doesn't perform any runtime check on 'value', if the float comes from the outside world, outside the library?

— Reply to this email directly or view it on GitHubhttps://github.com/Tightdb/tightdb/issues/87#issuecomment-17201944 .

rrrlasse commented 11 years ago

I completely fail to see the use case you are trying to avoid/detect. Use cases:

1) User compiles TightDB + Application with Intel C++. This will store, load and query on denormalized floats fine and well-defined even though the floats do not conform to iec559. We should not try to prevent this because denormalized floats are very popular due to performance (factor of 100 in some cases). But because we don't yet know the exact meaning of is_iec559 (we never found out back then), the assert is not guaranteed to pass.

2) User compiles TightDB + server-application with GCC and uses this to synchronize data between mobile phone applications using denormalized floats (denorm is default in the ARM world, afaik). Load and stores are well-defined on the server (bits are passed unchanged through regardless of invalid content) but queries are not. A server side test for iec559 makes no sense here.

Again, please tell if there are other use cases where a check would make sense.

kspangsege commented 11 years ago

I see two use cases (or rather anti-use cases):

1) When somebody takes our source to some alien platform that uses an alien float representation, the compile time error alerts him to the fact that we don't support it.

2) When in the future we choose to port our library to some new platform, and we get a compile time error due to that assert, it alerts us to the fact that we have some work to do on float representation.

This is just an instance of the following general idea:

For each assumption that is made during development, if at all possible, arrange the code in such a way that a visible error occurs if the assumption is violated. When the assumption is violated, the error should occur as soon as possible, preferably at compile time.

kspangsege commented 11 years ago

I agree that if ARM and Intel use incompatible float representations by default, then we have a problem regardless of what numeric_limits::is_iec559 says.

Also, your mention of ARM made me realize that I probably forgot to consider byte ordering in my implementation.

So in any case, we probably have to look into other ways of transferring floating point data.

How would you do it?

I have been considering to use frexp() to achieve platform independence, but of course it is not going to be as efficient.

On Tue, Apr 30, 2013 at 10:55 AM, Kristian Spangsege ks@tightdb.com wrote:

I see two use cases (or rather anti-use cases):

1) When somebody takes our source to some alien platform that uses an alien float representation, the compile time error alerts him to the fact that we don't support it.

2) When in the future we choose to port our library to some new platform, and we get a compile time error due to that assert, it alerts us to the fact that we have some work to do on float representation.

This is just an instance of the following general idea:

For each assumption that is made during development, if at all possible, arrange the code in such a way that a visible error occurs if the assumption is violated. When the assumption is violated, the error should occur as soon as possible, preferably at compile time.

rrrlasse commented 11 years ago

But Intel C++ Compiler is not an alien platform, it's a very popular C++ compiler which uses denormalized floats as its default setting (i.e. no special compiler flags). GCC using -fast-math is not alien either and makes it use denorms. ARM uses denorms alot and is not alien. Google for DAZ + FTZ to see more cases where denorms are used which do not conform to ieee.

Please answer: TightDB should compile in all above cases, right?

Now, as we tested a while ago, is_iec559() seemed to return always true regardless if denorms were enabled or not, at least in the quick test I made with ICC, GCC and VS on Windows. We wondered alot about this and I cannot guarantee that in other environments it would return false, hence blocking above use cases.

kspangsege commented 11 years ago

I would consider the Intel C++ Compiler a niche compiler. But I would love for TightDB to compile and work well in all those cases. It's all a matter of how hard it is to support vs how great the demand is. Also, there is no shame in lacking support for an important platform at the moment as long as there is an ambition and a plan to achieve support.

rrrlasse commented 11 years ago

http://en.wikipedia.org/wiki/Denormal_number : ... clang and gcc have varying default states depending on platform and optimization level. ...

So you don't want support for gcc/clang either?

kspangsege commented 11 years ago

I do want support for GCC and Clang. As said above, due to the byte ordering issue, we probably have to devise a whole new way of transferring floating point data anyway.

kspangsege commented 11 years ago

What do you think about using frexp()? That gives the exponent as an integer. The mantissa can then be converted to an integer by multiplying by numeric_limits<int64_t>::max() or some other large integer. The two integers can then be transferred the same way as the other integers.

astigsen commented 11 years ago

It look like protocol buffers just assume ieee:

http://stackoverflow.com/questions/7248950/how-cross-platform-is-googles-protocol-buffers-handling-of-floating-point-type

It is probably the most well tested and widely used binary serialization format out there, so if it works for them, then maybe it will be enough for us as well?

On 4/30/13, Kristian Spangsege notifications@github.com wrote:

What do you think about using frexp()? That gives the exponent as an integer. The mantissa can then be converted to an integer by multiplying my numeric_limits::max() or some other large integer. The two integers can then be transferred the same way as the other integers.

Reply to this email directly or view it on GitHub: https://github.com/Tightdb/tightdb/issues/87#issuecomment-17219778

kspangsege commented 11 years ago

Nice link. Lots of useful info.

On Wed, May 1, 2013 at 1:27 AM, astigsen notifications@github.com wrote:

It look like protocol buffers just assume ieee:

http://stackoverflow.com/questions/7248950/how-cross-platform-is-googles-protocol-buffers-handling-of-floating-point-type

It is probably the most well tested and widely used binary serialization format out there, so if it works for them, then maybe it will be enough for us as well?

On 4/30/13, Kristian Spangsege notifications@github.com wrote:

What do you think about using frexp()? That gives the exponent as an integer. The mantissa can then be converted to an integer by multiplying my numeric_limits::max() or some other large integer. The two integers can then be transferred the same way as the other integers.

Reply to this email directly or view it on GitHub: https://github.com/Tightdb/tightdb/issues/87#issuecomment-17219778

— Reply to this email directly or view it on GitHubhttps://github.com/Tightdb/tightdb/issues/87#issuecomment-17260643 .

realm / realm-core

Non-normalized floats in replication #87