Expand math.gcd() and math.lcm() to accept multiple arguments

df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

BPO	39648
Nosy	@tim-one, @rhettinger, @mdickinson, @stevendaprano, @serhiy-storchaka, @sweeneyde, @ananthan-123, @bernardosulzbach
PRs	python/cpython#18590 python/cpython#18604

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields: ```python assignee = None closed_at = created_at = labels = ['library', '3.9'] title = 'Expand math.gcd() and math.lcm() to accept multiple arguments' updated_at = user = 'https://github.com/ananthan-123' ``` bugs.python.org fields: ```python activity = actor = 'mark.dickinson' assignee = 'none' closed = True closed_date = closer = 'mark.dickinson' components = ['Library (Lib)'] creation = creator = 'Ananthakrishnan' dependencies = [] files = [] hgrepos = [] issue_num = 39648 keywords = ['patch'] message_count = 27.0 messages = ['362069', '362071', '362073', '362074', '362077', '362079', '362080', '362082', '362083', '362092', '362098', '362099', '362100', '362139', '362159', '362223', '362343', '362383', '362386', '362392', '362399', '362457', '362459', '362461', '362462', '362502', '362503'] nosy_count = 9.0 nosy_names = ['tim.peters', 'rhettinger', 'mark.dickinson', 'steven.daprano', 'SilentGhost', 'serhiy.storchaka', 'Dennis Sweeney', 'Ananthakrishnan', 'BernardoSulzbach'] pr_nums = ['18590', '18604'] priority = 'normal' resolution = None stage = 'resolved' status = 'closed' superseder = None type = None url = 'https://bugs.python.org/issue39648' versions = ['Python 3.9'] ```

df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

If we have to find the gcd of three or more numbers, now we should use gcd(a, gcd(b, gcd(c, gcd(d, e))))) which will create lot of problems.

math.gcd should take "n" number of arguments,like:

gcd(a,b,c,....)

gcd(4,6,8) //returns 2 gcd(2,5,8,6) //returns 1 gcd(6,30,40,60,20,40) //returns 2

serhiy-storchaka commented 4 years ago

What problems it will create?

df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

It will take more time to write the statement itself and there are chances of getting mistakes as the statement is complicated.

8977102d-e623-4805-b309-cd0ad35b0d72 commented 4 years ago

You could (should?) use reduce, then:

>>> functools.reduce(math.gcd, (6, 30, 40, 60, 20, 40))
2

rhettinger commented 4 years ago

+1 This isn't hard for us to do and may be occasionally useful.

serhiy-storchaka commented 4 years ago

Should it take variable number of positional arguments or a single iterable argument as max() and min()?

What should it return for 0 arguments?

serhiy-storchaka commented 4 years ago

Python as programming language provides builtin blocks. This is a good example of using functools.reduce().

But if you can provide an algorithm for calculating the GCD and LCM more efficient than sequential applying gcd() and lcm() for two arguments, it would be an argument for implementing it in the stdlib.

mdickinson commented 4 years ago

Should it take variable number of positional arguments or a single iterable argument as max() and min()?

Good question: the obvious extension to the current function would be to allow it to take multiple scalar arguments. But I don't much like the mismatch that introduces with sum and math.prod, which accept a single iterable argument.

I definitely don't want to give gcd a combination API like the one min and max have.

What should it return for 0 arguments?

That one's easy. It should return 0. 0 is an identity for the binary gcd operation, and mathematically, the gcd is defined for any set of integers (finite or infinite), including the empty set. The gcd of the empty set is zero.

mdickinson commented 4 years ago

What's slightly more interesting is the return value for a single argument a, where we should be careful to return abs(a) rather than simply a.

tim-one commented 4 years ago

This is almost all down to pragmatics for me.

For sum() and prod(), if there are only two operands then there are trivial other ways to spell that (+ and *). So it makes most sense for them to accept iterables instead. Essentially, e.g., nobody would ever _want_ to write sum(2, 3) instead of 2+3.

max() and min() differ in that there is no other simple way to write the 2-argument forms. max(x, y) _is_ commonly wanted - but so is max(iterable). It's a weird function signature, but nearly always does what's intended.

gcd() (and lcm()!) differ from both in that they're nearly always wanted with 2 arguments. "0, 1, or >= 3 arguments" are possibilities, but rarely wanted. It would be a PITA to need to write, e.g., gcd((x, y)) instead of gcd(x, y) for the overwhelmingly most common 2-argument case. So they should be *args functions - provided we want to cater directly to "0, 1, or >= 3 arguments" at all. Fine by me if we don't. I'm only +0 on generalizing beyond "exactly 2 arguments".

For the rest, I agree with Mark. In particular, gcd() must be 0.

Serhiy, a "bulk" gcd can be more efficient than nested function calls by exploiting a common case: it can get out early when the gcd-so-far becomes 1 (since gcd(1, anything) == 1, it doesn't matter what the remaining arguments are). For "random" integers, it's already the case that gcd(i, j) == 1 over 60% of the time.

sweeneyde commented 4 years ago

I think the behavior of gcd() == 0 is correct, but it should be documented, because it isn't completely obvious.

Arguments for gcd() == 0:

Preserves the invariant gcd(itertools.chain(iterables)) == gcd(itertools.starmap(gcd, iterables)) in the case that some iterables are empty.
Return a "neutral element" (not technically true for negative integers because gcd(0, a) == abs(a) != a, but it does get "ignored")

Other considerations:

The "common divisors" of the empty set are the integers d such that for all x in the empty set, d divides x, which is to say, all of the integers. Then there is no greatest of these common divisors. The same is true if we try gcd(0, 0, ... 0).
The above is only fixed if we interpret "greatest" in the sense of divisibility, where 0 > everything because everything divides 0. In particular, we can say that g is a greatest common divisor of a_1, ..., a_n if g | a_i for all i, and for any d with d | a_i for all i, we also have d | g. Using this definition also requires that we specify that if there are two gcds, we return the positive one. I believe this is the correct interpretation, but it may not be obvious, since it assumes that proof that such a number even exists.
Another common definition of the greatest common divisor of {a_1, ..., a_n} is the smallest positive integer expressible as an integer linear combination x_1 a_1 + ... + x_1 a_2, where x_i are integers. But this definition breaks on gcd(), gcd(0), gcd(0, 0), etc, since no positive integers are expressible.

In short, while I agree with the behavior, I think there should be a clarifying sentence in the docs saying "returns 0 if all of the arguments are 0".

sweeneyde commented 4 years ago

Correction: gcd(itertools.chain(iterables)) == gcd(*map(gcd, iterables))

sweeneyde commented 4 years ago

Correction correction: returning zero preserves the invariant below.

from math import gcd as GCD
from functools import reduce
from itertools import starmap, chain

def gcd(*args):
    return reduce(GCD, args, 0)

iterables = [[10, 20, 30],
             [],
             [0, 0, 0],
             [5],
             [-15, -10000000]]

a = gcd(*chain.from_iterable(iterables))
b = gcd(*starmap(gcd, iterables))
assert a == b == 5

df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

It should take variable number of positional arguments. it should return:

>math.gcd() 0 >math.gcd(8) 8 >math.gcd(-8) 8

6096c46d-67b5-4d7a-ac8a-7a7ad15f1d92 commented 4 years ago

I think this might make sense because gcd() could have some optimizations for n > 2, such as sorting the numbers and starting by the smallest elements.

However, I don't think gcd() and lcm() with more than two arguments are frequent use cases.

df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

Can I put together a PR for this issue?

mdickinson commented 4 years ago

@Ananthakrishnan: Sure, go ahead.

Let's make the process a bit more streamlined than last time around: see if you can put together a PR that passes all the CI checks before asking for review. If you get stuck, make a work-in-progress PR and ask for help in a comment on that PR rather than via email.

A couple of pointers:

Take a look at the existing math.hypot implementation to see how to deal with multiple arguments. (Note that argument clinic doesn't currently work for *args functions.)
For the first working version, don't bother making special cases for the zero-argument or one-argument gcd (or even the case of two arguments). You just want the equivalent of the following Python code, which is perfectly general:

    def gcd(*args):
        g = 0
        for arg in args:
            g = gcd_2arg(g, operator.index(arg))
        return g

where gcd_2arg is the original 2-argument gcd (in C, _PyLong_GCD). We can always optimise later.

df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

This is my code for math.gcd:

static PyObject *
math_gcd(PyObject *module, PyObject *args, Py_ssize_t n)
{
    PyObject *g = 0, *item, *in;

    Py_ssize_t i;
    for (i = 0; i < n; i++) {
        item = args[i];
        in = PyNumber_Index(item);
        if (in == NULL) {
            return NULL;
        }
        g = _PyLong_GCD(g, in);
    }
    Py_DECREF(in);
    return g;
}

This is showing an error:

error: incompatible types when assigning to type ‘PyObject {aka struct _object \}’ from type ‘PyObject {aka struct _object}’ item = args[i]; ^^^

mdickinson commented 4 years ago

@Ananthakrishnan: Do you know how arrays and pointers work in C?

df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

Yes I know.

df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

Thanks for the hint.Made changes.

serhiy-storchaka commented 4 years ago

Sorry, Ananthakrishnan, but I think this problem is too difficult to you. Adding math.lcm() taken 2 weeks and produced 200 messages. It is simpler to implement this feature myself to me.

df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

>Sorry, Ananthakrishnan, but I think this problem is too difficult to you. Adding math.lcm() taken 2 weeks and produced 200 messages. It is simpler to implement this feature myself to me.

I'm a beginner.Not everyone is perfect at begenning. I am starting this.So I took some problems myself as i saw the problems here are difficult.You should have given me a chance.

I think you should have at the same situations once,like the situations that I'm facing now.

serhiy-storchaka commented 4 years ago

I'm always ready to help a beginner, but this is the first time that it takes so much effort. It would be easier if you started with the simple problems you could handle. Even in simple cases you would be able to get tips that you can understand and gradually improve your skills.

serhiy-storchaka commented 4 years ago

If expand math.gcd() to accept multiple arguments, it is worth to do the same with math.lcm().

mdickinson commented 4 years ago

New changeset 559e7f165ad03731e6bc2211c0e6d8d9c02fb549 by Serhiy Storchaka in branch 'master': bpo-39648: Expand math.gcd() and math.lcm() to handle multiple arguments. (GH-18604) https://github.com/python/cpython/commit/559e7f165ad03731e6bc2211c0e6d8d9c02fb549

mdickinson commented 4 years ago

Merged; thank you to all involved.

python / cpython

Expand math.gcd() and math.lcm() to accept multiple arguments #83829