Closed tannergooding closed 4 years ago
/cc @dcwuser
I'm also happy to implement this end to end, as I did with the single-precision math functions.
I'm going to have a few negative things to say about the specifics of this proposal, but let me start off by saying that I am all for the .NET framework getting this (and more!) math functionality.
One significant problem with this proposal is that the set of functions listed is too big and heterogeneous to be handled as a single decision. I would prefer to have separate discussions for the following groups of functions:
Precision-loss-avoidance functions: hypot, log1p, expm1. I have proposed these in https://github.com/dotnet/corefx/issues/15921. They are used primarily by numerical computing experts in the implementation of algorithms for other functions. They are easy to implement with ~5-ulp guarantees, possible to implement with 1-ulp guarantees.
Simple trig completeness functions: acosh, asinh, atanh. These are used by non-experts. They are trivially related to already implemented functions, but providing them does allow us to deal with some overflow issues that occur if users implement them naively.
Binary power helper functions: exp2, log2. The would probably be used by non-experts. They are also trivially related to already implemented functions, and are basically analogs to exp10 and log10.
Floating point decomposition functions: frexp, ldexp, nextafter, fpclassify, isfinite, isnormal. These are used by numerical computing experts in the analysis, but usually not implementation of algorithms. It might be somewhat subtle to provide implementations that will work independent of ended-ness and other chip-level implementation details.
True advanced functions: erf, erfc, tgamma, lgamma. These are used by statisticians and scientists. There are some good implementations, but I am not aware of any that can make ulp-level accuracy guarantees.
Here are what I have to say about these groups:
(2) and (3) belong in System.Math, the others, in my opinion, do not. They have no meaning to non-experts and just increase the noise in an already very cluttered static class. I'd put all the others in classes in System.Numerics; in particular, (1) and (5) should go in a static class named MoreMath or AdvancedMath.
(4) is a very un-.NET API. It even contains an out parameter! It would be better for us to make it more object-oriented, for example to create a class:
public sealed class DoubleInfo {
public double Value { get; }
public int Exponent { get; }
public long Mantissa { get; }
public int Sign { get; }
public bool IsFinite { get; }
public bool IsNormal { get; }
public DoubleInfo Next();
public DoubleInfo Previous();
}
(5) tgamma and lgamma are terrible names. t means nothing and the right prefix for log is log, so they should be Gamma and LogGamma. Implementing these opens the door implementing a very long list of advanced functions; there are 29 chapters in Abramowitz and Stegun and 36 chapters at http://dlmf.nist.gov/! That's another reason I think we should establish a new place for these functions to go besides System.Math. We won't be able to provide the level of performance and accuracy guarantees for advanced functions that we can to the more traditional System.Math functions; we should have some discussion about what level of performance and accuracy we require before releasing an advanced function as part of the framework.
One way to view this proposal is as saying that we should implement all the C89 and C99 (https://en.wikipedia.org/wiki/C_mathematical_functions) math functions with exactly (or almost exactly) the same organization, naming, and syntax as cstdlib. Under that view of this proposal, I vote no. The .NET Framework doesn't surface other parts of cstdlib or the Windows API in this way. Instead, it offers a more thoughtful, object-oriented, and user-friendly API surface. We should take the same approach with math functions.
@dcwuser, this is why I split it out into two distinct sets of APIs.
The first set is the actual proposal, with types being added to the existing System.Math
and System.MathF
classes (these are Acosh
, Asinh
, Atanh
, Exp2
, ExpM1
, Log1P
, Log2
, Cbrt
, Hypotenuse
). These methods have all been given .NET "friendly" names (matching the naming conventions already used) and finish rounding out the APIs that are already available.
The second set were explicitly listed as being worth consideration. These additional APIs are defined by the C/C++ Standard Library (as are the APIs which are actually being proposed in the first set). They are still using the C/C++ naming convention and are not proposed to be part of any type. Additionally, some of them, such as fma and remquo, may be better suited to implicit intrinsic support rather than being available via an explicit call.
There is also a distinction between the first and second set in that the first set has to be implemented as an FCALL (as the existing math functions are) to provide any reasonable performance throughput. However, most of the functions listed in the second set (modulo erf
, erfc
, lgamma
, and tgamma
) could be implemented in C# using pointer arithmetic fairly trivially.
Finally, the only functions proposed here are functions defined in the C/C++ Standard Libraries (and only those that are missing from the set we currently expose). There are thousands of other constants and various functions that could be proposed or added, but they are usually part of a fairly specific library and are not exposed by the C/C++ Standard Library itself.
It should likely also be pointed out that all of the APIs that are part of the first set match the definition of Precision-loss-avoidance functions
(not just hypot
, expm1
, and log1p
).
Okay, let's set aside any discussion the second API set and concentrate on just the issues around the first.
Regarding naming, we should either decide that the weight of tradition is important enough that we keep the C names and at most adjust the casing (ExpM1, Log1P, Hypot), or that we go full-on .NET friendly (ExpMinusOne, LogOnePlus, Hypotenuse).
I do agree that there is a sense in which acosh is a precision-loss-avoidance (and overflow-avoidance) implementation of log(z + sqrt(z^2-1)). But still think that acosh, asinh, and atanh are different from log1p, expm1, and hypot, in that the former is going to be familiar and used by anyone with high school math, while the latter are going to be meaningless and confusing to those without a numerical computing background. At an API design level, this means that while the former manifestly belong in System.Math, while the latter arguably belong in a place for more advanced math functions, if there is ever going to be one. (One could instead take the view that we should try to encourage more widespread use of APIs like log1p, which would imply that it should go in System.Math, and probably also that we choose the LogOnePlus naming convention.)
Finally (and this is more nit-picky and more about implementation than API surface), I wouldn't be quite so certain that the only reasonable implementations of these APIs is to extern them to the local cstdlib implementations. If the local cstdlib implementations were reliably good and trustworthy, this would be true. But as evidenced by the atrocious (and, I claim, simply wrong) large-x behavior of sin and cos, this isn't the case. Also, the perf loss to do stuff like hypot or log1p correctly in managed would be very low. For example, if the Windows cstdlib correctly does the 1-ulp version of log1p (which required a lookup table), then by all means let's extern to it. But if it doesn't, let's do it right (and portably!),
Regarding naming
Its not just a question over tradition vs friendliness. The existing convention has been to use the name as specified by IEEE 754, with the casing adjusted, the same convention should be continued. The IEEE names themselves follow a standard convention, as exists on calculators and in standard mathematical notation. This is why we use Sqrt
instead of SquareRoot
.
the former is going to be familiar and used by anyone with high school math, while the latter are going to be meaningless and confusing
The functions proposed can be found on the TI-30 and TI-80 calculators (both standard for various High School and College math courses), they are fairly core functions for a large range of mathematics and are not by any means advanced.
The C/C++ implementations, on the whole, follow the IEEE standard as well. They do so for the domain, exceptions, and limitations. The way the functions behave for large and small inputs is due to the limitations of the binary representation format. If you are using math functions when writing a computer program, you should be aware of the limitations of the binary floating-point representation format and therefore the limitations of the input domain for the functions you are calling.
For example, a raw number is only representable for 6 digits (binary32) or 15 digits (binary64). Multiple operations compound on this limitation and result in imprecision for complex calculations. The more complex the calculation, the less precise the result will generally be. You can provide more precise computations by using a larger representation and downcasting or using a more precise algorithm, but this generally leads to a significant enough perf decrease to not make it worthwhile.
I wouldn't be quite so certain that the only reasonable implementations of these APIs is to extern them to the local cstdlib implementations
The Linux libm implementation that we are consuming right now, for many of these core functions, is in C/C++ rather than in raw assembly (Mac and Windows are in raw assembly). The performance decrease from doing so can be seen in this issue here: https://github.com/dotnet/coreclr/issues/9373. If we were to implement these in C#, the performance decrease would be even greater (as we don't have the necessary SSE/SSE2 intrinsic support to do this properly). The remaining choice is to either use the CRT implementation that is going to be already tried and true or to roll our own (for each target architecture).
Regarding naming: IEEE 754 (https://www.csee.umbc.edu/~tsimo1/CMSC455/IEEE-754-2008.pdf) says "the names of the operations in Table 9.1 do not necessarily correspond to the names that any particular programming language would use", so the standard doesn't require these names. It's true that we have followed these naming conventions mostly. But your own proposal deviates for hypot. Why are you willing to deviate from the C convention for hypot but not for log1p and expm1? (Also, while the standard lists all these functions as optional and doesn't specify their names, there are a good many non-optional parts of that standard that we don't expose; for example we have no copysign function.)
Regarding levels of familiarity and placement: we're probably going to have to just agree to disagree on whether expm1 is well-known as acosh, because I don't see how to settle that objectively. Looking at the scientific calculators that students are typically assigned to buy is an interesting idea, but (i) I just pulled out my own scientific calculator, a CASIO fx-115 ES, and it doesn't have these functions, at least not in an obvious way, and (ii) such calculators typically have a ton of functions (binomial coefficients, random numbers, statistical functions, complex functions) that I assume you agree are don't belong in System.Math. Would you agree that there is some level of advanced-ness that should move a function to another place? For example, would you agree that erf and erfc should go into an AdvancedMath class in System.Numerics?
(I assume your digression about floating point precision was in response to my comment about the bad large-argument behavior of the .NET trig functions. It's not really relevant to this discussion, but there are plenty of platforms that do better a little to no additional cost. I don't know of any other platform that returns the argument as the value; most effectively return a random number between -1 and 1 and a few do fully accurate range reduction. At very least we should be throwing an ArgumentOutOfRangeException instead of returning an impossible value. I found this so annoying that, for my own work, I coded versions of sin and cos that usually do just one (occasionally 2) floating point comparison in order to decide whether to re-route to a slow routine that does fully accurate range reduction with a ~1100-bit value of pi. It guarantees results to 1 ulp and added only ~1% overhead in my benchmarks for typical inputs.)
Regarding implementation, here is (almost) the 5-ulp version of log1p:
public static double LogOnePlus (double x) {
double xp1 = 1.0 + x;
if (xp1 == x) {
return(x);
} else {
return(x * Math.Log(xp1) / (xp1 - 1.0);
}
}
That's one comparison and two flops more than Math.Log(1.0 + x) - 1.0, (Handling non-positives and near-overflow values does add another comparison or two.) I don't deny that a native implementation would be faster, but that's hardly so great an overhead as to make this implementation useless. I use a version like this in my own work all the time, usually called inside functions that are consuming many 1000s of flops to compute an advanced function, and that overhead is totally negligible.
By the way, while I do enjoy the discussion of implementation details and other math function behavior, the key issues for me are naming and placement for expm1, log1p, and hypot, as indicated by the bolded questions. Feel free to ignore the rest.
Moving this out of the main proposal.
The following APIs are also defined by the C/C++ Standard Library and may be worth considering:
{
double frexp(double, int*); // Get significand and exponent
double ldexp(double, int); // Generate value from significand and exponent
double modf(double, double*); // Break into fractional and integral parts
double erf(double); // Compute error function
double erfc(double); // Compute complementary error function
double tgamma(double); // Compute gamma function
double lgamma(double); // Compute log-gamma function
double fmod(double, double); // Compute remainder of division
double remquo(double, double, int*); // Compute remainder and quotient
double nextafter(double, double); // Get next representable value
double fdim(double, double); // Compute positive difference
double fma(double, double, double); // Multiply-Add
}
{
float frexp(float, int*); // Get significand and exponent
float ldexp(float, int); // Generate value from significand and exponent
float modf(float, float*); // Break into fractional and integral parts
float erf(float); // Compute error function
float erfc(float); // Compute complementary error function
float tgamma(float); // Compute gamma function
float lgamma(float); // Compute log-gamma function
float fmod(float, float); // Compute remainder of division
float remquo(float, float, int*); // Compute remainder and quotient
float nextafter(float, float); // Get next representable value
float fdim(float, float); // Compute positive difference
float fma(float, float, float); // Multiply-Add
}
Updated: Remove IsNormal
, IsFinite
, and Classify
, as they were already reviewed elsewhere (with IsNormal
and IsFinite
having been added).
Several of the C++ functions I mentioned above are already being handled in other proposals.
@mellinoe, any chance we could move this to 'API ready for review'?
I think the only issue that's come up is the naming (which I'm not particular on).
@mellinoe, any chance we could move this to 'API ready for review'?
@tannergooding Does the first post contain the additions you'd like to move forward with? Those look ok to me.
Yes. The first post was cleaned up to contains just the APIs I believe we should move forward with.
@tannergooding Okay -- looks reasonable. I've marked it ready for review.
I can't wait to use these!
API review:
We removed few methods after discussion. We suggest to move them into separate library and experiment.
Approved API:
public static partial class Math
{
// Hyperbolic Functions
public static double Acosh(double); // Compute area hyperbolic cosine
public static double Asinh(double); // Compute area hyperbolic sine
public static double Atanh(double); // Compute area hyperbolic tangent
// Power Functions
public static double Cbrt(double); // Compute cubic root
}
public static partial class MathF
{
// Hyperbolic Functions
public static float Acosh(float); // Compute area hyperbolic cosine
public static float Asinh(float); // Compute area hyperbolic sine
public static float Atanh(float); // Compute area hyperbolic tangent
// Power Functions
public static float Cbrt(float); // Compute cubic root
}
I'm going to start working on this after https://github.com/dotnet/coreclr/pull/14119 goes in.
FYI: The API review discussion was recorded - see https://youtu.be/W_r6oG7nnK4?t=1723 (40 min duration)
Just as an FYI, since it was brought up in the API review (especially around naming and where they go), the names are not only just used in the C Runtime and elsewhere, but are also defined with that name in the IEEE spec (not that it makes too much difference).
The IEEE defined operations are:
Updated the top post to list the approved APIs and the APIs which were not approved at this time.
This needs exposing in the contracts.
Contract and tests are up in dotnet/corefx#26035
Rationale
The .NET framework does not currently provide support for several of the mathematical functions that are available to the C/C++ Standard Library.
Support should be provided for these mathematical functions in order to better interop with high-performance, scientific, and multimedia-based applications where these functions may be in demand.
Several of these functions, such as
ExpM1
, may be more accurate or may return values for inputs outside the range of the corresponding code written manually (EXP(x) - 1
).Approved API
Unapproved APIs
The following APIs were also reviewed but rejected at this time. The discussion ended up that these would likely be better suited in a
MathExtensions
or similar class provided through a separate library (and likely implemented using the hardware intrinsics functionality rather than as FCALLs in the runtime).Additional Details
This will require several changes in the CoreCLR as well to support the new APIs via FCALLs and Intrinsics.