chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

Implicit conversion from single to double precision is wrong. #18888

Open skaller opened 2 years ago

skaller commented 2 years ago

Chapel currently allows an implicit conversion from real(s) to real(t) if s<=t. This is backwards. I recommend disallowing all implicit conversions between real (and hence complex) types. It is a common mistake.
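For concreteness, here is a minimal sketch of the rule as it stands today (the function and variable names are mine): a real(32) is accepted where a real(64) is expected, but not the reverse.

proc takesR64(x: real(64)) { writeln(x.type:string); }

var s = 1.5:real(32);
takesR64(s);             // accepted today: real(32) converts implicitly to real(64)

var w = 2.5;             // default real, i.e. real(64)
// var n: real(32) = w;  // not implicit; would need an explicit cast: w:real(32)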

A floating point number basically represents the range between the next lower and next higher representable numbers, so a floating point format can be thought of as partitioning the real number line into subranges. A higher precision format has more equivalence classes and thus a finer partition. The smaller ranges of the higher precision format nest inside the larger ranges of the lower precision one, so in OO terms we can say each of these smaller ranges "isA" member of the larger range.

Therefore the correct rule is actually t<=s. The usual intuition is that more bits means more precision, so a lower precision value can be converted to a higher precision without losing information. But this intuition is completely wrong. Consider an approximation process in which we get better and better approximations by iteration, for example the usual way to solve the eigenvalue problem. Now, the larger the error tolerated in the result, the faster we converge. So again, the longer running processes are actually embedded in the shorter running ones; refinements of solutions can be embedded in the solution they're refining.
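As a small illustration of the point (the values here are mine): widening picks a single exact representative of the real(32) interval and treats it as precise.

var third32 = (1.0/3.0):real(32);
var third64: real(64) = third32;   // implicit widening under today's rule
writeln(third64 == 1.0/3.0);       // false: the widened value is not the real(64) third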

I apologise for not citing an academic reference (I know they exist)

damianmoz commented 2 years ago

I question all implicit conversions. I am unsure that wrong is the correct adjective.

I would like a compiler flag to be able to turn off all implicit conversions. Not now. But sooner rather than later.

The implicit conversion, within a 32-bit floating point expression, of a 32-bit integer (that goes nowhere near 2^31 - 1) to a real(64) totally screws up my algebra. It has been the only time I have sworn at Chapel under my breath in 10 years. At best, there needs to be a warning. Certainly for people converting codes from Fortran, especially old Fortran, the Chapel implicit conversions are a huge minefield. If I am using 32-bit arithmetic, my error analyses rely on exactly that happening.

When I declare something to be int(32), 90-95% of the time it will never exceed 2^23 - 1, i.e. it fits in 24 bits. I will never have the need to use 64-bit arithmetic when mixing 32-bit integers and 32-bit floating point numbers. If I am using the latter, I will demand they stay the latter unless I explicitly upgrade the precision. For me, the default Chapel floating point conversion is at best irrelevant and at worst burdensome or crippling.
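A minimal sketch of the situation being described (variable names are mine); per the complaint above, the int(32) operand drags the whole expression up to real(64):

var a = 1.5:real(32);
var n: int(32) = 3;
var r = a * n;
writeln(r.type:string);   // per the behaviour described above, prints real(64), not real(32)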

That said, there are lots of people who program for convenience and they need implicit conversions to be productive. So they might use an option

chpl --implconv=lazy ......

They do need to be careful.

Somebody like me wants

chpl --implconv=none

while others might want

chpl --implconv=f77

That said, anybody who uses any form of implicit conversion should at some stage turn off all implicit conversions and look at what warnings they get, just in case they are doing something silly. Or maybe that task goes into a static analysis tool for Chapel that somebody is yet to write.

skaller commented 2 years ago

When I first did Felix I had NO implicit conversions. However, in Felix a literal is typed; for example 1 is an int, and 1ul is an unsigned long. A lot of loops written like for(uint32 i = 1; i < 100; ++i) failed to typecheck. Also in Felix, every C int type, including typedefs, is a distinct type. This ensures the semantics are platform independent. Many have specific literal suffixes but I can never remember them and have to look them up in the code.

In the end, I added subtyping to the type system for other reasons, and decided it was reasonable to convert signed integers from smaller to larger numbers of bits, because they're intuitively all algebraic integers provided computations don't overflow. Subtype coercions are always implicit in my system.

supertype vlong : long  = "(long long)$1";
supertype long  : int   = "(long)$1";
supertype int   : short = "(int)$1";
supertype short : tiny  = "(short)$1";

supertype int64 : int32 = "(int64_t)$1";
supertype int32 : int16 = "(int32_t)$1";
supertype int16 : int8  = "(int16_t)$1";

supertype int128 (x: int64)  => x.int128;
supertype int256 (x: int128) => x.int256;

These are all the implicit conversions. Note that the RHS strings define the C++ emitted for the operation. Subtyping is transitive, so the compiler can convert int8 to int64 implicitly.

Explicit conversions can do anything, so the programmer can shoot themselves in the head, provided they post a warning notice. Here is the module defining Int64:

open class Int64
{
  ctor int64: string = "static_cast<#0>(::std::atoi($1.c_str()))" requires Cxx_headers::cstdlib;
  ctor[T in reals] int64: T = "static_cast<#0>($1)/*int.flx: ctor*/";
  ctor int64: int64= "$1 /*int64 ident*/";
}

The Felix compiler doesn't know what integers are. So there's no need to specify the semantics .. they're specified by actual code in the library. Type classes (interfaces in Chapel) are used to define the basic operations, for example addition.

Once you have powerful machinery you can dump a lot of stuff out of the language core into the library, which means the user can define their own stuff, or, more likely, the developers can change the rules in a few seconds without touching the compiler.

mppf commented 2 years ago

I am unsure that wrong is the correct adjective.

I agree with this & I'm having trouble understanding what is "wrong" about it. Certainly there are different ways to think about floating point numbers but this issue has not yet convinced me that it is "wrong" or even that it would be more productive for our users to remove real(32) -> real(64) implicit conversions.

Replying to @damianmoz :

The implicit conversion, within a 32-bit floating point expression, of a 32-bit integer (that goes nowhere near 2^31 - 1) to a real(64) totally screws up my algebra.

It might be possible to stop doing int(32) -> real(64) implicit conversions, but we have int(64) -> real(64) conversions (the spec says this is an exception for convenience) and we have int(32) -> int(64) conversions. It seems likely (but I am not certain) that we need the implicit conversions to include transitive relationships, so if we have both of these, we probably need int(32) -> real(64).

In my opinion though we could remove the int(64) -> real(64) conversions as long as we can keep param conversions for it (where we allow a literal like 1, which has type int, to implicitly convert into a real). We are trying to stabilize the language though (and stop doing breaking changes like this).
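A rough sketch of the two conversions being discussed (the function name f is mine); both calls are accepted today:

proc f(x: real(64)) { writeln(x); }

var i: int = 4;
f(i);    // int(64) -> real(64): the "exception for convenience" mentioned above
f(1);    // param conversion: the literal 1 (type int) implicitly converts to real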

damianmoz commented 2 years ago

Any literal constant must have the largest size of that type. That is the only way to avoid the nonsense that C/C++ has to go through to specify long double (and long long) constants. Without that, once we have to get Chapel working on hardware that natively supports 128-bit integers and 128-bit basic floating point types (which now exist), the language will be in a mess. See #18599 for more on that.

So, except when a literal numeric constant is itself the entire expression, its appearance should not affect the precision at which the expression is evaluated; or at least give me the choice to achieve that with a compiler flag (which I will then mandate be used by all our programmers).

I hope I was not saying that removing implicit real(32) -> real(64) would be more productive. But a compiler option to disallow it is needed, because it can be both a massive source of errors and a performance killer when the appearance of a real(64) identifier in the expression was a programmer mistake and the programmer actually wanted the arithmetic done with 32-bit floating point vectors.

Here, i.e. in our office, we mandate that no code that we write uses (or relies on) implicit conversion. For example, the following

var x = 10.66666666666666666:real(32);
var y = 20.66666666666666666; // assume the underlying hardware does not support real(128)
var z = x * y;

will score the programmer the dork-of-the-day award in our place. Sadly, sometimes me. The questions asked by the above (and answers) are:

Do I want `x * y` evaluated in 32-bit or 64-bit arithmetic? The answer is probably 32-bit.

Why is 'y' even declared in 64-bit arithmetic? The answer is probably programmer error.

Do I really want the overhead of that conversion to occur? The answer is probably no.
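One way to get the preferred answers to the questions above (a sketch using the same literals) is to declare y at the intended precision, so the product stays in 32-bit arithmetic and no conversion is generated:

var x = 10.66666666666666666:real(32);
var y = 20.66666666666666666:real(32);   // declare y at the intended precision
var z = x * y;                            // z is real(32); no widening occurs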

I try and program so that my code is accurate at 32-bits (and I can use 32-bit floating point vector arithmetic).

If I am not doing that, I try and program generically.

I am not proposing a breaking change to the default usage. I am sure the current implicit conversions were done for convenience and I do not want to rock the boat. But please don't mandate that I must choose the default usage. It is almost as silly as mandating that people do not use IEEE 754 floating point (which Chapel luckily does not do because it has --ieee-float).

But convenience normally causes errors which stab you in the back, and that has happened to me many times, not just in Chapel. One of our standard code review rules looks at the assembler to ensure no conversion between real(w) and real(w') happens unexpectedly (noting that w != w').

So there needs to be a compiler option that allows enforcement of stricter (or no) forms of implicit conversion, subject of course to the first rule: literal constants evaluate to the maximum precision supported by the underlying hardware (or to whatever precision is specified on the command line) and do not cause any implicit conversion of the expression in which they appear, i.e. they evaluate to the maximum precision of the identifiers in that expression.

As an aside, how many programmers recompile their floating point programs with different rounding modes and then check that their results are still consistent between those rounding modes?

skaller commented 2 years ago

The problem is, in my opinion, that Chapel has a long history. So it has a LOT of archaic ideas, and a jumble of features, mixed up with some very modern high level ideas. This is typical of a language with a history: languages are political entities which soon turn into religious ones.

skaller commented 2 years ago

Damian said: "Here, i.e. in our office, we mandate that no code that we write uses (or relies on) implicit conversion."

This is overkill. Implicit coercions are useful. It's just that they should at least be justified by algebra, including a model and proof, and in the case of numerical analysis not just a proof of correctness but also a proof of performance.

An obvious example where we want implicit conversion is pointers or references to classes.

An obvious case where we in general do NOT want them is any kind of arithmetic, unless we have a very specific model. For example you do NOT want it for unsigned integers, because they are NOT integers. They're typically integers "modulo 2^Nbits", and such values do NOT have any embeddings: 8-bit math does not embed into 16-bit math. In fact, if you do some group theory you could argue it is the other way around (for addition). In the end, these implicit coercions are just not justified.
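A hedged Chapel sketch of the modular-arithmetic point (the values are mine, and it assumes Chapel's unsigned arithmetic wraps modulo 2^Nbits as described above): widening the operands first changes the answer, so 8-bit addition does not embed into 16-bit addition.

var a: uint(8) = 200;
var b: uint(8) = 100;
writeln(a + b);                     // wraps: (200 + 100) mod 256 = 44
writeln(a:uint(16) + b:uint(16));   // no wrap at this width: 300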

skaller commented 2 years ago

Just BTW: the idea that a literal should be "the biggest integer type" would work in C. But it causes very serious problems in any language with overloading. For example, if var x:int32 then x + 1 has type int256 because of the usual promotion rules. Ok, so you want the literal to be a type "int_literal" which somehow adapts to context. Well, that's REALLY HARD to figure out how to do. I think Swift does it. But you have really nasty problems with parametric polymorphism if your argument has an indeterminate type without a context, because a polymorphic function doesn't provide that context.

damianmoz commented 2 years ago

It was mentioned

Just BTW: the idea a literal should be "the biggest integer type" would work in C. But it causes very serious problems in any language with overloading. For example if var x:int32 then x + 1 has type int256 because of the usual promotion rules.

Your issue started out talking about real. I was talking about real. Let's stay with real.

There is no such thing as a real32 or real256 in Chapel. These would be real(32) and real(256) respectively.

I should have qualified my earlier post that I was only talking about arithmetic expressions.

I thought I mentioned, in the context of #18599, that literals should not affect the type of the expression unless they are explicitly typed. So if x.type == real(32), then, assuming that the largest floating point type supported by the hardware (and which Chapel exposes and is not downgraded by a compiler option) is 128 bits wide,

x + 1.0 // has type real(32) because 1.0 would be assigned the type of the rest of the expression

but

x + 1.0:real(128) // would be an expression of type real(128) if implicit conversion is allowed

while it would fail if implicit conversion is disallowed.

If I really wanted to write it as a real(128) expression, and I (have a compiler option to) disable implicit conversion, I could write

x:real(128) + 1:real(128)

But I would not.

If I really was working in 128-bit floating point precision but knew that I had to use a variable x which had a different type, I would actually write

param one = 1:real(128);
...
x:real(128) + one;

The static analysis should inform me when I (mistakenly) write an expression with elements of different types. Most of the time, though, having some elements of an expression be of a different type from the others will be a mistake on my part. If I have a choice between convenience and correctness, I will always choose the latter.

But at the moment, a lot of that discussion is moot because #18599 is still on the backburner.

skaller commented 2 years ago

The issue is not Chapel specific. Overload resolution is complex and there are several classes of algorithm used. Generally, you use unification and S-attributed types. Now you said:

The static analysis should inform me when I (mistakenly) write an expression with elements of different type.

The problem is, this is desirable and makes sense in some context, but in the general context of overload resolution it is nonsense. A function can have 6 arguments of different types, and +, -, * etc are just functions.

So the question is, how would you achieve the result you want? If you have functions for + - * which only accept the same types for both arguments, and you have no implicit conversions at all, then you will get the result you want.

If you have what C++ has (multiple overloads, constructor conversions, AND on top of that conversion operators), you have an extreme mess. This is why, in general, implicit coercions, if you have to have them, should be based on some algebraic rules and not simply chosen ad hoc based on some notion of convenience or compatibility with historical stupidity.

In Felix, for example, implicit conversions occur as a result of subtyping. For primitive types, the subtyping relations are user defined, and they're strictly required to be transitive and should be embeddings. Unlike C++, Felix considers any chain of subtyping coercions as a single coercion (by transitivity) and uses a heuristic to pick the shortest chain; however, that is never used to choose an overload. Either a coercion is required or it is not.

I find that situation quite hard to manage. For example, with several sizes of floats and also complex numbers, you want floating point subtyping rules to "agree" with complex ones. And also have real to complex coercions. And on top of that you have overloads. I couldn't make it work! Felix has a simple, algebraically sound rule for subtyping, and in the end I removed all implicit coercions except for signed integers.

mppf commented 2 years ago

@damianmoz -

I thought I mentioned, in the context of #18599, that literals should not affect the type of the expression unless they are explicitly typed. So if x.type == real(32), then, assuming that the largest floating point type supported by the hardware (and which Chapel exposes and is not downgraded by a compiler option) is 128 bits wide,

x + 1.0 // has type real(32) because 1.0 would be assigned the type of the rest of the expression

but

x + 1.0:real(128) // would be an expression of type real(128) if implicit conversion is allowed

while it would fail if implicit conversion is disallowed.

This idea did not come across to me in #18599 and might deserve its own issue. I think it's intriguing and it would be possible to implement it, as far as I know, with only changes to the standard modules. But, it would be a language change to apply it universally. If you wanted to just apply it within your code, most likely we would need that to be some way you decorate your modules, rather than a compiler flag. (Because a compiler flag would also apply to modules that you didn't write that you are using).

damianmoz commented 2 years ago

The whole idea is that it actually is a compiler flag and that it applies to modules being included (or used) but which I did not write.

bradcray commented 2 years ago

But it causes very serious problems in any language with overloading. For example if var x:int32 then x + 1 has type int256 because of the usual promotion rules.

In Chapel x + 1 would have type int(32) because 1 is a param int(64) and we support downcasting of param values when the value fits into a smaller type, similar to C# from which we took the inspiration.

skaller commented 2 years ago

Right. I get that you're trying to make the integer literal adapt to context. I'm just not sure it works well when you have both overloading and type conversions. I read a paper once which showed a system that had both, and proved that with some restrictions the resulting type system was sound. That was a very complicated paper! The notion that params are adaptive is a good one, if you can pull it off. I mean, given you can put them on the command line, you can't expect the user to put an exact type in, so you actually do need params to be adaptive.

Other languages that do this, and increasingly even C++, are making a special type for literals, so they are always converted, since no variable or non-literal expression ever has that type. That's another approach.

Implicit representation conversions are always a pain. There is an argument for whatever sort of works reasonably. I think this is not really the same as implicit type conversions. For example, if you have two sizes of floats .. well, they're really representations of the same type, namely real.

Anyhow there's a good argument for leaving the float to double conversion as it is. I think there's an even better argument for removing all these conversions from the compiler, and putting them into the library, possibly requiring a language extension, because that makes it easier to change, and possibly for the user to do the change, and do different changes in different contexts.

I think I will close this. It's not a fundamental issue.

mppf commented 2 years ago

I wrote

This idea did not come across to me in #18599 and might deserve its own issue. I think it's intriguing and it would be possible to implement it, as far as I know, with only changes to the standard modules.

@bradcray wrote -

In Chapel x + 1 would have type int(32) because 1 is a param int(64) and we support downcasting of param values when the value fits into a smaller type, similar to C# from which we took the inspiration.

Ah, that is right, I hadn't remembered exactly what we were doing here. E.g.

var x: int(32) = 1;
var y = x + 1;
writeln(y, ":", y.type:string);

prints out 2:int(32).

mppf commented 2 years ago

Replying to @damianmoz -

The whole idea is that it actually is a compiler flag and that it applies to modules being included (or used) but which I did not write.

In that case, wouldn't you either: a) expect the standard library and any 3rd-party modules you use to adhere to this standard, or b) be unable to use the standard library or 3rd-party modules?

If it's (a) then it seems to me that you are actually arguing for a language change. If it's (b) then I don't see how the idea is workable because I wouldn't expect you would want to take on implementing your own standard library.

skaller commented 2 years ago

Replying to @damianmoz -

The whole idea is that it actually is a compiler flag and that it applies to modules being included (or used) but which I did not write.

In that case, wouldn't you either: a) expect the standard library and any 3rd-party modules you use to adhere to this standard, or b) be unable to use the standard library or 3rd-party modules?

If it's (a) then it seems to me that you are actually arguing for a language change. If it's (b) then I don't see how the idea is workable because I wouldn't expect you would want to take on implementing your own standard library.

This is why you need to remove all the implicit conversions from the compiler and put them in a module in the library. Now, the standard modules will use whatever implicit conversions they want in their implementations, but we do not give two hoots about that: we only care about calling functions from the standard library and functions in our own code. So we can use a different set of implicit conversions. We just define them, or lift chosen ones out of the standard library. The library could even have every possible conversion defined, with selected ones actually enabled.

Moving stuff out of the compiler into the library has to be good: it simplifies the compiler by removing a whole set of ad hoc type dependent calculations from it, and then allows the calculations to be done using a more general, user programmable method. General methods are always easier to implement than special ones (because they always have a solid computational algebra behind them). And of course it removes user complaints about the particular choice of implicit conversions.

mppf commented 2 years ago

I have created #19194 about the specific proposal of removing int -> real implicit conversions.

bradcray commented 2 years ago

I should've said earlier, but FWIW:

It seems likely (but I am not certain) that we need the implicit conversions to include transitive relationships

I don't necessarily think that an implicit conversion from t1->t2 and one from t2->t3 implies that there should be one from t1->t3 (neither in the sense of "the type author should've definitely provided it and was mistaken not to" nor "the compiler should implement the transitive closure of implicit conversions and do this for you automatically").