Open sharkdp opened 1 year ago
Hey @sharkdp, how hard would it be to implement this in Numbat? I saw in the Insect discussion that it would imply many changes. I don't know any Rust TBH, but I find complex arithmetic to be so useful (I see myself wasting time doing it in IPython or in the Julia REPL) that I would like to try and implement it.
@Panadestein Thank you for the feedback. I am definitely in favor of implementing this, but I probably won't work on this myself. Something I definitely want to have is multiple numeric types in Numbat. For now, we just have a single "Number" type that represents floating point numbers. But I think this should be expanded to integer types, rational types, and complex types.
Before we start implementing complex numbers, we should probably figure out how to deal with multiple numeric types in general. Maybe by trying to add integers to the language. It's currently not clear to me how we would even write down the types. `Length` is currently a floating point of dimension length, without specifying that "floating point" part. Maybe we could have `Float<Length>`, `Integer<Length>`, etc., where `Length` would be a shorthand for `Float<Length>`?
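As a rough analogy only (this is Rust, not Numbat syntax, and every name here is hypothetical), the "shorthand" idea resembles a generic type with a default numeric parameter. The sketch assumes a `Quantity<D, N = f64>` wrapper, so that omitting `N` means "floating point":

```rust
use std::marker::PhantomData;

/// Hypothetical marker type for the dimension "length".
#[derive(Debug, Clone, Copy, PartialEq)]
struct LengthDim;

/// Hypothetical quantity: a numeric value of type `N` tagged with dimension `D`.
/// The numeric type defaults to `f64`, mirroring "Length means Float<Length>".
#[derive(Debug, Clone, Copy, PartialEq)]
struct Quantity<D, N = f64> {
    value: N,
    dim: PhantomData<D>,
}

/// Plays the role of `Length` / `Float<Length>` from the comment above.
type Length = Quantity<LengthDim>;
/// Plays the role of `Integer<Length>`.
type IntegerLength = Quantity<LengthDim, i64>;

fn meters(value: f64) -> Length {
    Quantity { value, dim: PhantomData }
}

fn whole_meters(value: i64) -> IntegerLength {
    Quantity { value, dim: PhantomData }
}
```

Here `meters(1.5)` and `whole_meters(2)` have different types even though they share a dimension, which is the distinction the proposed syntax would need to express.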
And this is just the syntax. I'm sure more questions will come up once we start implementing it.
Is there any update on the multiple numeric types?
Otherwise, I believe there is a point to be made that complex numbers are not "another numeric type" but should be used as the base unit of a scientific calculator. There is not much harm in having all numbers complex at all times: the imaginary part can always be omitted for display if it is zero. This would, for example, also mean that square roots of negative numbers are possible instead of returning NaN. Complex numbers even make sense with units in certain real-life applications (like circuit theory), and even where they don't make sense, it is not the job of a calculator to check the user's semantics (e.g. -1 kg works just fine). If you would be fine with such a change, I would definitely be willing to take a look at the code and help out!
> Is there any update on the multiple numeric types?
Not yet, no.
> Otherwise, I believe there is a point to be made that complex numbers are not "another numeric type" but should be used as the base unit of a scientific calculator. There is not much harm in having all numbers complex at all times: the imaginary part can always be omitted for display if it is zero.
Yes, I think this is a valid argument. The basic question is: should the distinction between real and complex numbers be tracked at the type level or not? Since real numbers are a subset, you are basically arguing that we might as well turn the standard numeric type into a type that allows for complex numbers instead of just real numbers.
I could imagine that there are situations where it would be helpful to be more restrictive, i.e. to have a separate datatype for numbers that can only be real numbers, not complex ones. For example, think of a function that solves quadratic equations. If we only have complex numbers, it would be reasonable to return exactly two complex numbers (since every quadratic equation has two solutions in the complex plane). But maybe in a particular application, I'm only interested in real-valued solutions to the equations. So it might make sense to have two variants of this function: one that returns exactly two `Complex<…>` numbers, and one that returns zero, one or two `Real<…>` numbers.
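To make the two variants concrete, here is a minimal Rust sketch. It is not Numbat code: the function names are made up, a complex number is represented as a plain `(re, im)` tuple, and `a != 0` is assumed.

```rust
/// Hypothetical stand-in for a complex number: (real part, imaginary part).
type Complex = (f64, f64);

/// Complex variant: always returns exactly two roots of a*x^2 + b*x + c = 0.
/// Assumes a != 0. A double root is returned twice.
fn solve_quadratic_complex(a: f64, b: f64, c: f64) -> [Complex; 2] {
    let disc = b * b - 4.0 * a * c;
    let re = -b / (2.0 * a);
    if disc >= 0.0 {
        // Both roots lie on the real axis.
        let s = disc.sqrt() / (2.0 * a);
        [(re - s, 0.0), (re + s, 0.0)]
    } else {
        // Complex-conjugate pair.
        let s = (-disc).sqrt() / (2.0 * a);
        [(re, -s), (re, s)]
    }
}

/// Real variant: returns zero, one or two real roots. Assumes a != 0.
fn solve_quadratic_real(a: f64, b: f64, c: f64) -> Vec<f64> {
    let disc = b * b - 4.0 * a * c;
    let re = -b / (2.0 * a);
    if disc < 0.0 {
        vec![]
    } else if disc == 0.0 {
        vec![re]
    } else {
        let s = disc.sqrt() / (2.0 * a);
        vec![re - s, re + s]
    }
}
```

For x² + 1 = 0, the real variant returns an empty list while the complex variant returns the pair ±i, so the return type itself tells the caller which kind of answer to expect.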
An easier example would be the `sqrt` function. The version that works on `Complex<…>` numbers would return a result unconditionally. And the version that works on `Real<…>` might throw an error if the input is negative, or return an `Optional<Real<…>>`.
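A minimal Rust sketch of that distinction, assuming a hand-rolled `Complex` struct (deliberately not the `num` crate's type) and using Rust's `Option` in place of the hypothetical `Optional<Real<…>>`:

```rust
/// Minimal hypothetical complex type, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Complex {
    re: f64,
    im: f64,
}

/// Real variant: there is no real square root for negative input, so return `None`.
fn sqrt_real(x: f64) -> Option<f64> {
    if x < 0.0 {
        None
    } else {
        Some(x.sqrt())
    }
}

/// Complex variant: the principal square root exists for every input.
fn sqrt_complex(z: Complex) -> Complex {
    let r = z.re.hypot(z.im); // modulus |z|
    let re = ((r + z.re) / 2.0).sqrt();
    let im = ((r - z.re) / 2.0).sqrt();
    // The imaginary part of the result takes the sign of the input's imaginary part.
    Complex { re, im: if z.im < 0.0 { -im } else { im } }
}
```

With these definitions, `sqrt_real(-1.0)` yields `None`, whereas `sqrt_complex` applied to -1 yields the imaginary unit.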
But to come back to your point, I think that your approach is very reasonable. If it turns out that we want this distinction at the type level, we could still add that later, I think.
> it is not the job of a calculator to check the user's semantics
This is a statement that I very much disagree with. The whole purpose of Numbat's unit system is to do precisely that: to check the semantics of the user's computation. `1 m + 1 kg` is a computation that is semantically wrong, and we return an error. True, there are some things that we cannot check ("-1 kg" is probably almost never useful). But we definitely strive to add even more ways to make sure that users never run calculations that they did not intend semantically.
> If you would be fine with such a change, I would definitely be willing to take a look at the code and help out!
Sounds fantastic. I could imagine that there are still a lot of places where we have an (implicit) assumption that the numeric type is `f64`. So it might be useful to do a little bit of refactoring (first). We also plan to switch the numeric type from `f64` to something that supports a higher precision (see #4). So maybe that's something to keep in mind while working on the numeric type anyway.
Thank you for the long answer!
> This is a statement that I very much disagree with. The whole purpose of Numbat's unit system is to do precisely that: to check the semantics of the user's computation. `1 m + 1 kg` is a computation that is semantically wrong, and we return an error. True, there are some things that we cannot check ("-1 kg" is probably almost never useful). But we definitely strive to add even more ways to make sure that users never run calculations that they did not intend semantically.
I agree with what you are saying; maybe I am nitpicking too much about the difference between syntax and semantics. The addition of two different units is syntactically invalid (which we should very much catch, yes) but also semantically wrong. But there are cases that are semantically wrong (often context-dependent) but syntactically valid, and those are what I was trying to talk about.
> Sounds fantastic. I could imagine that there are still a lot of places where we have an (implicit) assumption that the numeric type is `f64`. So it might be useful to do a little bit of refactoring (first). We also plan to switch the numeric type from `f64` to something that supports a higher precision (see #4). So maybe that's something to keep in mind while working on the numeric type anyway.
Thanks for the pointers! I will try to dig through the code-base and see how much work would be required, keeping the goal to add more abstraction in mind.
> I agree with what you are saying; maybe I am nitpicking too much about the difference between syntax and semantics. The addition of two different units is syntactically invalid (which we should very much catch, yes) but also semantically wrong.
I have no problem with nitpicking, I often find myself guilty of the same :smile:. It doesn't change your arguments, but what you are saying is not quite correct. The expression `1 m + 1 kg` is syntactically correct. It's perfectly fine according to the grammar of the Numbat language. The parser will generate a valid AST for that expression.
It's only one stage later in the compiler (in the so-called semantic analysis stage) that we actually detect that this expression is problematic. The type checker will see that we are attempting to add an expression of type `Length` to an expression of type `Mass`, and that leads to a type check error.
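The split between parsing and type checking can be illustrated with a toy Rust sketch. This is not Numbat's actual implementation: the `Dimension`, `Expr` and `TypeError` types below are hypothetical and heavily simplified. The point is only that the AST for `1 m + 1 kg` can be built without complaint, and the mismatch surfaces in a separate pass.

```rust
/// Hypothetical, heavily simplified set of dimensions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Dimension {
    Length,
    Mass,
}

/// Tiny AST: a quantity literal (value and dimension) or a sum of two expressions.
#[derive(Debug)]
enum Expr {
    Quantity(f64, Dimension),
    Add(Box<Expr>, Box<Expr>),
}

/// Error reported when the two sides of an addition have different dimensions.
#[derive(Debug, PartialEq)]
struct TypeError {
    lhs: Dimension,
    rhs: Dimension,
}

/// Semantic analysis pass: walks an already-parsed AST and checks dimensions.
fn type_check(expr: &Expr) -> Result<Dimension, TypeError> {
    match expr {
        Expr::Quantity(_, dim) => Ok(*dim),
        Expr::Add(lhs, rhs) => {
            let l = type_check(lhs)?;
            let r = type_check(rhs)?;
            if l == r {
                Ok(l)
            } else {
                Err(TypeError { lhs: l, rhs: r })
            }
        }
    }
}
```

Constructing `Expr::Add` from a `Length` quantity and a `Mass` quantity succeeds, which corresponds to the parser accepting the input; only `type_check` rejects it. Note that the checker never looks at the numeric value, which is why a rule such as "masses must be non-negative" cannot be enforced at this stage.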
> But there are cases that are semantically wrong (often context dependent) but syntactically valid that I was trying to talk about.
An expression like `-1 kg` is both syntactically correct and also correct according to Numbat's current semantic analysis stage. It's also not possible to encode something like "negative masses should be forbidden" in the type system, since that would depend on the value of certain expressions. What would we do with an expression like `complicated_function_that_returns_scalar() × kg` or `get_command_line_option("mass") × kg`? Static code analysis cannot say anything about the value of those masses.
I have a need for complex numbers when doing refractive index calculations where you have a real index of refraction as well as an imaginary "coefficient of extinction". Having good support for basic complex numbers operations such as addition, multiplication, |z| (norm) would be really helpful.
I just spent a while playing with it, and it seems that more or less the full number type abstraction refactor would be necessary to make it work (at least nicely, without randomly lost imaginary parts). The good news is that if we use the `num::Complex` datatype as the general base, this would do a lot of the prep work and make it much easier to swap out the underlying number type later, as we would call all operations on `num::Complex` (which would then take care of passing them on to whatever type is inside of it).
I am sure that there will be cases where general complex values are invalid, so we would have to add more `Result`s with the possibility to throw a "number is not real" error (as long as there is not a more generic numerical type system in the Numbat language, as you outlined above).
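A minimal sketch of what such a check might look like in Rust. The `Complex` struct, the `NumberError` enum and the `expect_real` helper are all hypothetical (this is neither Numbat's code nor the `num` crate's API), and comparing the imaginary part to exactly zero is an assumption; a real implementation might want a tolerance instead.

```rust
/// Minimal hypothetical complex type, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Complex {
    re: f64,
    im: f64,
}

/// Hypothetical error type carrying the offending imaginary part.
#[derive(Debug, PartialEq)]
enum NumberError {
    NotReal { im: f64 },
}

/// Returns the real part if the number is real, otherwise a "number is not real" error.
/// Assumption: "real" means the imaginary part is exactly zero.
fn expect_real(z: Complex) -> Result<f64, NumberError> {
    if z.im == 0.0 {
        Ok(z.re)
    } else {
        Err(NumberError::NotReal { im: z.im })
    }
}
```

Operations that only make sense for real inputs could call a helper like this at their boundary and propagate the error with `?`.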
One more noteworthy detail: moving `sqrt()` from a Numbat function to a builtin function might make sense if we move to complex numbers, because `num::Complex` has some special handling in that case.
+1 for this feature.