python / cpython

The Python programming language
https://www.python.org
Other
63.08k stars 30.21k forks source link

Expand math.gcd() and math.lcm() to accept multiple arguments #83829

Closed df904468-1c52-4185-9a2c-8cb85010a5c1 closed 4 years ago

df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago
BPO 39648
Nosy @tim-one, @rhettinger, @mdickinson, @stevendaprano, @serhiy-storchaka, @sweeneyde, @ananthan-123, @bernardosulzbach
PRs
  • python/cpython#18590
  • python/cpython#18604
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = created_at = labels = ['library', '3.9'] title = 'Expand math.gcd() and math.lcm() to accept multiple arguments' updated_at = user = 'https://github.com/ananthan-123' ``` bugs.python.org fields: ```python activity = actor = 'mark.dickinson' assignee = 'none' closed = True closed_date = closer = 'mark.dickinson' components = ['Library (Lib)'] creation = creator = 'Ananthakrishnan' dependencies = [] files = [] hgrepos = [] issue_num = 39648 keywords = ['patch'] message_count = 27.0 messages = ['362069', '362071', '362073', '362074', '362077', '362079', '362080', '362082', '362083', '362092', '362098', '362099', '362100', '362139', '362159', '362223', '362343', '362383', '362386', '362392', '362399', '362457', '362459', '362461', '362462', '362502', '362503'] nosy_count = 9.0 nosy_names = ['tim.peters', 'rhettinger', 'mark.dickinson', 'steven.daprano', 'SilentGhost', 'serhiy.storchaka', 'Dennis Sweeney', 'Ananthakrishnan', 'BernardoSulzbach'] pr_nums = ['18590', '18604'] priority = 'normal' resolution = None stage = 'resolved' status = 'closed' superseder = None type = None url = 'https://bugs.python.org/issue39648' versions = ['Python 3.9'] ```

    df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

    If we have to find the gcd of three or more numbers, now we should use gcd(a, gcd(b, gcd(c, gcd(d, e))))) which will create lot of problems.

    math.gcd should take "n" number of arguments,like:

    gcd(a,b,c,....)

    gcd(4,6,8) //returns 2 gcd(2,5,8,6) //returns 1 gcd(6,30,40,60,20,40) //returns 2

    serhiy-storchaka commented 4 years ago

    What problems it will create?

    df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

    It will take more time to write the statement itself and there are chances of getting mistakes as the statement is complicated.

    8977102d-e623-4805-b309-cd0ad35b0d72 commented 4 years ago

    You could (should?) use reduce, then:

    >>> functools.reduce(math.gcd, (6, 30, 40, 60, 20, 40))
    2
    rhettinger commented 4 years ago

    +1 This isn't hard for us to do and may be occasionally useful.

    serhiy-storchaka commented 4 years ago

    Should it take variable number of positional arguments or a single iterable argument as max() and min()?

    What should it return for 0 arguments?

    serhiy-storchaka commented 4 years ago

    Python as programming language provides builtin blocks. This is a good example of using functools.reduce().

    But if you can provide an algorithm for calculating the GCD and LCM more efficient than sequential applying gcd() and lcm() for two arguments, it would be an argument for implementing it in the stdlib.

    mdickinson commented 4 years ago

    Should it take variable number of positional arguments or a single iterable argument as max() and min()?

    Good question: the obvious extension to the current function would be to allow it to take multiple scalar arguments. But I don't much like the mismatch that introduces with sum and math.prod, which accept a single iterable argument.

    I definitely don't want to give gcd a combination API like the one min and max have.

    What should it return for 0 arguments?

    That one's easy. It should return 0. 0 is an identity for the binary gcd operation, and mathematically, the gcd is defined for any set of integers (finite or infinite), including the empty set. The gcd of the empty set is zero.

    mdickinson commented 4 years ago

    What's slightly more interesting is the return value for a single argument a, where we should be careful to return abs(a) rather than simply a.

    tim-one commented 4 years ago

    This is almost all down to pragmatics for me.

    For sum() and prod(), if there are only two operands then there are trivial other ways to spell that (+ and *). So it makes most sense for them to accept iterables instead. Essentially, e.g., nobody would ever _want_ to write sum(2, 3) instead of 2+3.

    max() and min() differ in that there is no other simple way to write the 2-argument forms. max(x, y) _is_ commonly wanted - but so is max(iterable). It's a weird function signature, but nearly always does what's intended.

    gcd() (and lcm()!) differ from both in that they're nearly always wanted with 2 arguments. "0, 1, or >= 3 arguments" are possibilities, but rarely wanted. It would be a PITA to need to write, e.g., gcd((x, y)) instead of gcd(x, y) for the overwhelmingly most common 2-argument case. So they should be *args functions - provided we want to cater directly to "0, 1, or >= 3 arguments" at all. Fine by me if we don't. I'm only +0 on generalizing beyond "exactly 2 arguments".

    For the rest, I agree with Mark. In particular, gcd() must be 0.

    Serhiy, a "bulk" gcd can be more efficient than nested function calls by exploiting a common case: it can get out early when the gcd-so-far becomes 1 (since gcd(1, anything) == 1, it doesn't matter what the remaining arguments are). For "random" integers, it's already the case that gcd(i, j) == 1 over 60% of the time.

    sweeneyde commented 4 years ago

    I think the behavior of gcd() == 0 is correct, but it should be documented, because it isn't completely obvious.

    Arguments for gcd() == 0:

    Other considerations:

    In short, while I agree with the behavior, I think there should be a clarifying sentence in the docs saying "returns 0 if all of the arguments are 0".

    sweeneyde commented 4 years ago

    Correction: gcd(itertools.chain(iterables)) == gcd(*map(gcd, iterables))

    sweeneyde commented 4 years ago

    Correction correction: returning zero preserves the invariant below.

    from math import gcd as GCD
    from functools import reduce
    from itertools import starmap, chain
    
    def gcd(*args):
        return reduce(GCD, args, 0)
    
    iterables = [[10, 20, 30],
                 [],
                 [0, 0, 0],
                 [5],
                 [-15, -10000000]]
    
    a = gcd(*chain.from_iterable(iterables))
    b = gcd(*starmap(gcd, iterables))
    assert a == b == 5
    df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

    It should take variable number of positional arguments. it should return:

    >math.gcd() 0 >math.gcd(8) 8 >math.gcd(-8) 8

    6096c46d-67b5-4d7a-ac8a-7a7ad15f1d92 commented 4 years ago

    I think this might make sense because gcd() could have some optimizations for n > 2, such as sorting the numbers and starting by the smallest elements.

    However, I don't think gcd() and lcm() with more than two arguments are frequent use cases.

    df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

    Can I put together a PR for this issue?

    mdickinson commented 4 years ago

    @Ananthakrishnan: Sure, go ahead.

    Let's make the process a bit more streamlined than last time around: see if you can put together a PR that passes all the CI checks before asking for review. If you get stuck, make a work-in-progress PR and ask for help in a comment on that PR rather than via email.

    A couple of pointers:

        def gcd(*args):
            g = 0
            for arg in args:
                g = gcd_2arg(g, operator.index(arg))
            return g

    where gcd_2arg is the original 2-argument gcd (in C, _PyLong_GCD). We can always optimise later.

    df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

    This is my code for math.gcd:

    static PyObject *
    math_gcd(PyObject *module, PyObject *args, Py_ssize_t n)
    {
        PyObject *g = 0, *item, *in;
    
        Py_ssize_t i;
        for (i = 0; i < n; i++) {
            item = args[i];
            in = PyNumber_Index(item);
            if (in == NULL) {
                return NULL;
            }
            g = _PyLong_GCD(g, in);
        }
        Py_DECREF(in);
        return g;
    }

    This is showing an error:

    error: incompatible types when assigning to type ‘PyObject {aka struct _object \}’ from type ‘PyObject {aka struct _object}’ item = args[i]; ^^^

    mdickinson commented 4 years ago

    @Ananthakrishnan: Do you know how arrays and pointers work in C?

    df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

    Yes I know.

    df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

    Thanks for the hint.Made changes.

    serhiy-storchaka commented 4 years ago

    Sorry, Ananthakrishnan, but I think this problem is too difficult to you. Adding math.lcm() taken 2 weeks and produced 200 messages. It is simpler to implement this feature myself to me.

    df904468-1c52-4185-9a2c-8cb85010a5c1 commented 4 years ago

    >Sorry, Ananthakrishnan, but I think this problem is too difficult to you. Adding math.lcm() taken 2 weeks and produced 200 messages. It is simpler to implement this feature myself to me.

    I'm a beginner.Not everyone is perfect at begenning. I am starting this.So I took some problems myself as i saw the problems here are difficult.You should have given me a chance.

    I think you should have at the same situations once,like the situations that I'm facing now.

    serhiy-storchaka commented 4 years ago

    I'm always ready to help a beginner, but this is the first time that it takes so much effort. It would be easier if you started with the simple problems you could handle. Even in simple cases you would be able to get tips that you can understand and gradually improve your skills.

    serhiy-storchaka commented 4 years ago

    If expand math.gcd() to accept multiple arguments, it is worth to do the same with math.lcm().

    mdickinson commented 4 years ago

    New changeset 559e7f165ad03731e6bc2211c0e6d8d9c02fb549 by Serhiy Storchaka in branch 'master': bpo-39648: Expand math.gcd() and math.lcm() to handle multiple arguments. (GH-18604) https://github.com/python/cpython/commit/559e7f165ad03731e6bc2211c0e6d8d9c02fb549

    mdickinson commented 4 years ago

    Merged; thank you to all involved.