elves / elvish

Powerful scripting language & versatile interactive shell
https://elv.sh/
BSD 2-Clause "Simplified" License
5.7k stars 300 forks source link

Pour go's math functions into a `math:` module? #727

Closed kwshi closed 4 years ago

kwshi commented 6 years ago

In the spirit of #576, should we also consider doing the same thing with some math functions?

This would benefit some people like me, who would like to open up terminals to do quick calculations without having to also type in an ipython each time I want to do so.

kwshi commented 6 years ago

It is not at all essential that this be part of the main elvish repository. Regardless of whether we end up wanting to have math be part of the standard elvish library, it is always possible to write an external math module. But it's worth considering, anyway.

xiaq commented 6 years ago

Before Go supports dynamic linking on macOS, it is much easier to implement this inside Elvish's codebase than outside. The main concern is how big this will inflate the binary.

kwshi commented 6 years ago

Aha. Sounds like something I'd be interested in doing...

kwshi commented 6 years ago

There is one downside to implementing math functions directly in Elvish, which is that performance would be, I imagine, several orders of magnitude poorer than a native Go implementation, since several computations (e.g. numerical approximation algorithms) involve long chains of arithmetic operations, which, given Elvish's string-only design, entail quite a few back-and-forth string-number conversions. There is also more room for numerical inaccuracies, since the decimal-string encoding of numbers is an inexact representation, and multiple conversions can cause these rounding errors to accumulate.

Of course, an alternative solution is to write executables in a different language (e.g. Go, or Python), and write corresponding bindings in Elvish that call those executables through Elvish's own functional interface. But that involves more steps (with Go or any other compiled language, there will need to be an extra compilation step that epm:install currently cannot handle), and in some sense a less-clean solution (for example, non-elvish libraries imported with use will have to be entirely reloaded every time a function is called, or each time such an external library is loaded a new process would have to be kept open).

kwshi commented 6 years ago

The main motivation behind me suggesting that we include all these libraries is that, in a way, I've kind of come to anticipate Elvish becoming some sort of a Python replacement--it's incredibly concise/terse while still being expressive, and in some ways, it's even out-flexing Python (in particular, closure and functional programming support, maybe?).

Here's a number of things I once would do in Python that I would now much more gladly do in Elvish (that I wouldn't as gladly have done in Bash, because of Bash's language eccentricities):

Here are some things I'd like to do in Elvish instead of Python, but can't quite yet because Elvish is still young:

Anyway, if Elvish is to become almost a sort of Python replacement, then I would also envision it assembling a "standard library" whose scale is comparable to that of Python's. But, of course, that may not be everyone's vision of Elvish. But those are my thoughts, anyway.

krader1961 commented 4 years ago

Note that a math: namespace was introduced by commit 2d8a045cf two months ago. I have created PR #925 and PR #930 to expose most of the Go math functions likely to be used by an Elvish program. More changes of that nature, such as adding the hyperbolic trigonometric functions, are trivial to implement and can be done if there is sufficient demand.

~/projects/3rd-party/elvish> put $math:pi
▶ (float64 3.141592653589793)
~/projects/3rd-party/elvish> math:sin $math:pi
▶ (float64 1.2246467991473515e-16)
~/projects/3rd-party/elvish> math:cos $math:pi
▶ (float64 -1)
~/projects/3rd-party/elvish> math:tan $math:pi
▶ (float64 -0.00000000000000012246467991473515)

There are, obviously, some quirks due to the use of IEEE 754 representation and the conversion of float64 values to strings which round to sixteen significant digits. Whether those aspects of this issue should be, or must be, improved is an open question. I think the answer is "no" since Elvish isn't intended to be used for high performance, high precision, numerical computations.

krader1961 commented 4 years ago

There are quite a few potentially useful math functions that are still not accessible in elvish. For example, the hyperbolic trigonometric functions such as tanh. But my PR #937 implemented the functions most likely to be used in an Elvish program. And @xiaq's earlier changes to introduce a float64 data type should minimize the cost of doing complex chains of math operations. Adding other math functions, as well as better support for rational numbers and other arbitrary precision numeric data types, should be done under separate issues.

My feeling is that this issue has been resolved and it should be closed. I'll go ahead and create pull-requests for the other trivial to implement Go math functions since we're already paying the cost to use that Go runtime code. But I don't see any reason for predicating closing this issue on those PRs.

krader1961 commented 4 years ago

Reviewing the Go math package documentation there are only two groups of functions that are still missing from Elvish that should probably be implemented. The first includes the inverse trigonometric functions (e.g., arccosine). Those are trivial to add in the same vein as the hyperbolic trigonometric functions so I'll do that tonight. The second group contains math.Max() and math.Min(). Here I think a more Elvish implementation that works on an arbitrary sequence of numbers is preferable to exposing the Go math functions directly. I'll tackle that as well.

All of the other math functions are extremely unlikely to ever be used in an Elvish program; e.g., the Bessel functions. I can't see any justification for predicating closing this issue on implementing support for those functions.

hanche commented 4 years ago

For max and min, I suggest you make max with zero arguments return -Inf, while min with zero arguments returns +Inf. For reference, here is max implemented in elvish:

fn max [@xs]{
  local:max = -Inf
  for x $xs { if (> $x $max) { max = $x } }
  put $max
}
krader1961 commented 4 years ago

@hanche, I agree min and max should return something rather than throwing an exception when handed an empty sequence or no args. The question is whether Inf or NaN makes the most sense. Which is more useful? Which is less likely to result in unexpected results in downstream calculations? Which is more consistent with analogous situations such as computing the logarithm of a negative number (which returns NaN)? I'm inclined to go with +/-Inf but perhaps this needs some discussion.

hanche commented 4 years ago

@krader1961: Here is one reason for preferring ±Inf: Composability. You really want these two commands to produce the same result:

max [(max $as) (max $bs)]
max [$@as $@bs]

If either of those two lists is empty, resulting in a NaN, then the first one will produce a NaN while the other might not. Unless, of course, you decide to ignore any NaN in the input list, but that violates the basic principle that NaNs are infectious.

I really can't think of any situation where returning ±Inf would create a problem while a NaN would not. In particular, code that naively assume you always get proper numbers will most likely choke on both.

You mention logarithms. I think log 0 should return -Inf, while the logarithm of a negative number should return NaN. At least, until we decide to support complex numbers, at which point we get to have some fun deciding about branch cuts and the like. (If we do, then an old paper by Kahan goes into great detail about how to do so. I mention it here only for possible future reference. Last I looked, which was years ago, python got branch cuts shockingly wrong.) In the same vein, exp -Inf should return 0, while log +Inf and exp +Inf should both return +Inf.

Edit: Checking out the newish math module, I see that the log function behaves just as I said above it should. So I didn't really need to argue for that.

krader1961 commented 4 years ago

@hanche Yes, my comment about log functions was more about what is the appropriate output for invalid input. In this case the invalid input is a missing or empty sequence. What does it mean to compute the minimum of an empty sequence? I did some googling and while there is disagreement the general consensus appears to be your position for +/-Inf. For example, https://www.econjobrumors.com/topic/what-is-the-maximum-of-the-empty-set. So that's what I've implemented. Plus it was simpler to implement than the NaN solution :smile:

hanche commented 4 years ago

@krader1961 Yeah, I did not want to go down the rabbit hole of sup-vs-max / inf-vs-min. The math fundamentalists (of which I admit to being one) are right that using infinities is right for sup and inf, but not, strictly speaking, for max and min, which in the world of mathematics requires the max or min value to be actually achieved within the set. But the world of programming is slightly different, and so different rationales apply. Still, one should strive for a level of mathematical consistency, even if the conventions are a bit different.

xiaq commented 4 years ago

I prefer math:min and math:max to throw an error when given no arguments. Elvish may get a Scheme-like numerical tower in future, and in such a system max and min should require at least one argument (Racket's max for example), which is is a consequence of two things:

Since max and min work for both inexact and exact numbers, it should not return -Inf or +Inf when given no arguments.

If Elvish ends up not getting a Scheme-like numerical tower, it's a backward-compatible change to start allowing empty sequence and return -Inf or +Inf.

xiaq commented 4 years ago

Since most useful math functions have been poured into the math: module, I arbitrarily declare that this issue should be closed when the discussion of min and max is resolved.

hanche commented 4 years ago

Those who so wish, can easily override the builtin functions to produce infinities on empty inputs, so I am not going to insist. At a later point we could, of course, provide “exact” infinities, but that is likely not worth the bother. (Especially since the prime (pun not intended, but appreciated) use of bignums is in cryptography, which has no use for infinities that I am aware of.)