duckduckgo / zeroclickinfo-goodies

DuckDuckGo Instant Answers based on Perl & JavaScript
https://duckduckhack.com/
Other
979 stars 1.76k forks source link

PrimeFactors: Use a faster backend for Math::Prime::Util #382

Closed jagtalon closed 7 years ago

jagtalon commented 10 years ago

Math::Prime::Util::GMP is said to be a lot faster in computing factors than the pure Perl implementation in Math::Prime::Util. See how feasible this is, and if installing GMP on people's systems is difficult (since people with DuckPAN will be using it on Linux and OS X), move it to another repository like https://github.com/duckduckgo/zeroclickinfo-goodie-qrcode.

Make sure to add install instructions there, too.

Update: We're now accepting external libraries (such as https://github.com/duckduckgo/zeroclickinfo-goodies/blob/master/lib/DDG/Goodie/ChineseZodiac.pm#L5 which depends on https://duckduckgo.com/?q=mpfr). So using a XS version of Math::Prime::Util is totally fine.


IA Page: https://duck.co/ia/view/prime_factors

danaj commented 10 years ago

Math::Pari is another alternative. It has its own set of issues that are different than MPU. The factoring performance is pretty good -- not too dissimilar from MPU::GMP.

I like the alarm idea -- factoring is a hard problem of course, so given a large enough semiprime there isn't going to be any fast solution.

The MPU::GMP module is certainly much faster than the native Perl code. Using an alternate Math::BigInt backend helps a lot (e.g. Math::BigInt::GMP or Math::BigInt::Pari) but it's still not great. It lets one factor ~40 digit numbers acceptably without having real math libraries installed. There is no QS code, and even the ECM is really slow using Math::BigInt.

I've thought about using Devel::CheckLib in Math::Prime::Util to test for GMP and add MPU::GMP to the prereqs if it is found. This can be done similar to the test in MPU::GMP's Makefile.PL but using check_lib instead of check_lib_or_exit.

E.g. 10^40+20: 2.8s MPU with Calc backend 0.4s MPU with Pari backend 0.3s MPU with GMP backend 0.04s MPU with Math::Prime::Util::GMP 0.02s Math::Pari factorint

10^50+10^20+3: over 4 minutes MPU using Math::BigInt and any back end 0.58s Math::Pari 0.15s MPU with Math::Prime::Util::GMP

93046990852932397871586367727253564961: ~70min (!!) MPU with Calc backend ~4min MPU with Pari backend ~2min MPU with GMP backend 0.15s MPU with Math::Prime::Util::GMP 0.04s Math::Pari

One solution would be to make a self-contained libtommath solution, for people who absolutely cannot use GMP. MPU already has 3 implementations (C, Perl, C+GMP) of most functions -- often with multiple branches, so adding a fourth major implementation isn't good for testing.

jagtalon commented 10 years ago

@danaj Thanks for this! Devel::CheckLib sounds like a great idea, although we might run into problems because it looks like some functions such as factor_exp exist in Math::Prime::Util, but not in Math::Prime::Util::GMP. But I think people can re-create the functionality of factor_exp once we get all the prime factors anyway.

I saw your blog that compares all the different math libraries--definitely a good read. For those who are interested, it's http://blogs.perl.org/users/dana_jacobsen/2013/04/factoring-integers-in-perl.html

danaj commented 10 years ago

Math::Prime::Util::GMP is meant as a back-end to Math::Prime::Util, rather than an alternative. MPU checks for its existance and calls it (sometimes directly from the XS code) if needed. So there should be no need to add additional code. Just install Math::Prime::Util::GMP if possible.

I've decided that the BigInt issue is enough of a bother here and elsewhere that I will start working on a Math::LTM module using LibTomMath. I can't promise anything, and there are the obvious dangers of solving a problem by making a n+1'th solution. However, this should give a standalone (no FTP, no system libraries) CPAN module that does basic bigint math at reasonably fast speeds. I would expect it to be much faster than Math::BigInt for most operations, with functionality similar to Math::GMP as well as the critical bmodpow function. It won't match GMP performance for big operations, but that's ok. The PP parts of MPU would then get converted to use that rather than Math::BigInt.

That should fix some of the speed issues without exchanging them for portability (Math::GMPz or Math::Pari). One downside is that it would end the ability to run truly in pure Perl.

jagtalon commented 10 years ago

Thanks, @danaj! We updated DuckPAN to not fail on missing dependencies, so it's fine to have external libs now (because devs don't have to install it if they don't want to). Updated my statement above. :)

zekiel commented 8 years ago

I'm not sure if this is still a suggestion or (in the many PRs since) irrelephant now? :)

GuiltyDolphin commented 8 years ago

@zekiel Irrelephant? Oh boy :grinning: It looks like we're still using the same package, so I think it is still relephant.

pjhampton commented 7 years ago

Hey all

DuckDuckHack is now in Maintenance Mode and from now on, we are only accepting issues and PRs for essential bugs and bug fixes.

Unfortunately, with the above in mind, this isn't something we can action and will be closed.

We appreciate you taking the time to contribute and apologize for not being able to triage this issue.