python / cpython

The Python programming language
https://www.python.org
Other
63.15k stars 30.23k forks source link

Use divide-and-conquer for faster factorials #52938

Closed cfc9f3e0-e33f-4ecd-9ddd-4123842d6c1e closed 14 years ago

cfc9f3e0-e33f-4ecd-9ddd-4123842d6c1e commented 14 years ago
BPO 8692
Nosy @rhettinger, @mdickinson, @abalkin
Files
  • factorial4.py
  • factorial-no-recursion.patch
  • factorial-test.patch: Improve unit tests for math.factorial
  • factorial-precompute-partials.patch
  • factorial-speed.sh: Script to test factorial speed for several n
  • factorial.py
  • factorial.patch: Divide-and-conquer factorial algorithm
  • factorial2.patch
  • factorial3.patch
  • factorial4.patch
  • unnamed
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = 'https://github.com/mdickinson' closed_at = created_at = labels = ['extension-modules', 'performance'] title = 'Use divide-and-conquer for faster factorials' updated_at = user = 'https://bugs.python.org/stutzbach' ``` bugs.python.org fields: ```python activity = actor = 'mark.dickinson' assignee = 'mark.dickinson' closed = True closed_date = closer = 'mark.dickinson' components = ['Extension Modules'] creation = creator = 'stutzbach' dependencies = [] files = ['17320', '17321', '17326', '17328', '17329', '17343', '17345', '17348', '17351', '17352', '17354'] hgrepos = [] issue_num = 8692 keywords = ['patch', 'needs review'] message_count = 72.0 messages = ['105537', '105548', '105551', '105552', '105553', '105557', '105559', '105561', '105562', '105563', '105564', '105565', '105568', '105579', '105590', '105592', '105593', '105594', '105596', '105597', '105600', '105602', '105603', '105604', '105607', '105610', '105653', '105655', '105657', '105660', '105661', '105663', '105664', '105665', '105666', '105668', '105678', '105679', '105680', '105681', '105682', '105683', '105684', '105689', '105691', '105715', '105729', '105731', '105754', '105756', '105757', '105759', '105760', '105770', '105777', '105782', '105783', '105784', '105797', '105801', '105806', '105807', '105808', '105811', '105812', '105813', '105814', '105815', '105816', '105817', '105818', '105851'] nosy_count = 6.0 nosy_names = ['rhettinger', 'mark.dickinson', 'belopolsky', 'draghuram', 'stutzbach', 'Alexander.Belopolsky'] pr_nums = [] priority = 'normal' resolution = 'accepted' stage = 'patch review' status = 'closed' superseder = None type = 'performance' url = 'https://bugs.python.org/issue8692' versions = ['Python 3.2'] ```

    mdickinson commented 14 years ago

    Daniel, sorry: I spent so long writing that last message that I didn't read your update until now.

    The new patch looks fine; same caveat about the overflow check as before. Let me know when you want me to apply this.

    mdickinson commented 14 years ago

    One other note: I find the bit numbering in find_last_set_bit peculiar: isn't the least significant bit usually bit 0? (Well, okay some people number the msb 0, but that's just weird. :) I know the ffs and fls functions also start their bit numbering at one, but this seems very unconventional to me.

    Perhaps rename find_last_set_bit to bit_length? i.e., it's the *size* of the small bitfield that can contain the given value, rather than the index of the highest bit in that bitfield.

    mdickinson commented 14 years ago

    it's the *size* of the small bitfield

    *smallest

    cfc9f3e0-e33f-4ecd-9ddd-4123842d6c1e commented 14 years ago

    On Fri, May 14, 2010 at 3:25 PM, Mark Dickinson \report@bugs.python.org\ wrote:

    The patch looks good.  Just one issue: I think the overflow check for num_operands * last_bit is bogus.  It's possible for the product to overflow and still end up being less than num_operands.  How about:

    You're right; I'm not sure what I was thinking. (actually, I'm pretty sure I was thinking about addition ;) )

    if (num_operands \<= BITS_IN_LONG && num_operands * last_m_bit \<= BITS_IN_LONG) ...

    instead?  If num_operands \<= BITS_IN_LONG then there's no danger of overflow in the multiplication, and if num_operands > BITS_IN_LONG then the second test will certainly fail.

    Yes, that looks good.

    Marking this as accepted:  Daniel, if you like I can apply the overflow check change suggested above and the minor style fixes and apply this;  or if you want to go another round and produce a third patch, that's fine too.  Let me know.

    That would be great. :)

    With regards to find_last_set_bit, I understand where you're coming from, but I'm following the convention of ffs/fls and also bits_in_digit() from longobject.c. I'd like to keep them interchangeable, in case they're consolidated at some point in the future. If you want to rename the function, that's fine with me.

    FWIW, I think the MSB=0th people are all electrical engineers. If you're serializing down to bits and the MSB is the first (0th) bit off the wire, it makes sense in that context.

    (N.B.  If you do produce another patch, please name it something different and don't remove the earlier version---removing the earlier patch versions makes life harder for anyone trying to follow this issue.)

    Good to know. I'll keep that in mind going forward.

    Thanks to both of you for the rigorous review. This was a fun patch to work on. :-)

    mdickinson commented 14 years ago

    Okay, here's what I'm planning to commit (tomorrow), in case anyone wants to give it one last check. I've added some comments half-explaining the algorithm used (just in case the referenced URL goes out of existence). I made a couple of other minor changes to the algorithm:

    (1) in factorial_partial_product, use 'k = (n + num_operands) | 1;' rather than 'k = (n + num_operands - 1) | 1;'. This saves an operation, and means that when a range including an odd number of terms is split then the bottom half gets the extra term, which makes the partial products a teensy bit more balanced, since the bottom half consists of smaller numbers.

    (2) in factorial_loop, I set the initial value of 'upper' to 3 rather than 1. This avoids factorial_partial_product ever being called with a start of 1.

    Apart from that, no significant changes.

    cfc9f3e0-e33f-4ecd-9ddd-4123842d6c1e commented 14 years ago

    Looks good.

    abalkin commented 14 years ago

    Mark, It is always a pleasure to read your algorithm descriptions!

    Just a few comments:

    I'll review math_factorial() after reading Daniel's message that just arrived.

    abalkin commented 14 years ago

    Some more ...

    mdickinson commented 14 years ago

    New patch (factorial3.patch) addressing all of Alexander's points except the one about including Python source somewhere.

    I also expanded the lookup table to 20 entries on LP64 systems.

    mdickinson commented 14 years ago

    And the same patch, but with a (deliberately simple) pure Python version of the algorithm in test_math.py.

    abalkin commented 14 years ago

    On Sat, May 15, 2010 at 6:32 AM, Mark Dickinson \report@bugs.python.org\ wrote:

    New patch (factorial3.patch) addressing all of Alexander's points except the one about including Python source somewhere.

    Thanks for making the changes. I think we converged to a really neat implementation.

    I also expanded the lookup table to 20 entries on LP64 systems.

    It's a matter of taste, but I was taught that C allows trailing commas in initializers specifically for the cases like this to avoid using a leading comma.

    mdickinson commented 14 years ago

    It's a matter of taste, but I was taught that C allows trailing commas in initializers specifically for the cases like this to avoid using a leading comma.

    Unfortunately, I think C89 doesn't allow for this, and there are tracker issues about Python compiles failing on some platforms (AIX?) thanks to extra trailing commas in the source. I prefer the extra comma too, but it's not legal C89.

    mdickinson commented 14 years ago

    Ah; Alexander's right: I was misremembering. An extra comma in an *enum* list isn't allowed (cf. bpo-5889); an extra comma in an array initializer is. I'll rewrite that bit.

    5579dc13-9f48-42d1-bb17-9c003ef6fa70 commented 14 years ago

    There is one place in the notes still referring to
    factorial_part_product.

    cfc9f3e0-e33f-4ecd-9ddd-4123842d6c1e commented 14 years ago

    The comment for bit_length is missing a space or two: "Objects/longobject.c.Someday"

    mdickinson commented 14 years ago

    There is one place in the notes still referring to factorial_part_product.

    Hmm. I can't find it. Can you be more specific?

    I'll fix the spaces before 'Someday'.

    5579dc13-9f48-42d1-bb17-9c003ef6fa70 commented 14 years ago

    Sorry for terseness. Sending it from my phone.

    The line was in factorial4.patch:

    + * The factorial_partial_product function computes the product of all
    odd j in

    >

    Hmm. I can't find it. Can you be more specific?

    mdickinson commented 14 years ago

    Okay, thanks. I'm still not seeing what's wrong with this, though (sorry for being slow :( )

    5579dc13-9f48-42d1-bb17-9c003ef6fa70 commented 14 years ago

    s/partial_product/odd_part/

    It looks like you made this change in some places but not all.

    On May 15, 2010, at 11:16 AM, Mark Dickinson \report@bugs.python.org\
    wrote:

    Mark Dickinson \dickinsm@gmail.com\ added the comment:

    Okay, thanks. I'm still not seeing what's wrong with this, though
    (sorry for being slow :( )

    ----------


    Python tracker \report@bugs.python.org\ \http://bugs.python.org/issue8692\


    mdickinson commented 14 years ago

    But factorial_partial_product and factorial_odd_part both exist: the former is just computing the product of all odd integers in the given interval, while the latter computes the odd part of factorial(n).

    I've double checked the comments and they still look okay to me. That particular reference really is to factorial_partial_product.

    5579dc13-9f48-42d1-bb17-9c003ef6fa70 commented 14 years ago

    Sorry I didn't realize that ...

    mdickinson commented 14 years ago

    Committed in r81196. Thanks, everyone!