vermaseren / form

The FORM project for symbolic manipulation of very big expressions
GNU General Public License v3.0
1.15k stars 136 forks source link

Performance comparison: polynomial factorization (an univariate case) #189

Open tueda opened 7 years ago

tueda commented 7 years ago

Maybe this is just due to differences on algorithms, but it may be interesting to think about the reason why there is a big performance difference on polynomial factorization among programs in the market. Here is a benchmark test to factorize x^385-1.

FORM

S x;
#$f = x^385 - 1;
#factdollar $f
LF F =
  #do i=1,`$f[0]'
    #if `i' >= 2
      *
    #endif
    (`$f[`i']')
  #enddo
;
Format nospaces;
P;
.end
FORM 4.1 (May 24 2017, v4.1-20131025-346-gcf71752) 64-bits  Run: Wed May 31 23:45:22 2017

   F=
       (-1+x)
      *(1-x+x^5-x^6+x^7-x^8+x^10-x^11+x^12-x^13+x^14-x^16+x^17-x^18+
         x^19-x^23+x^24)
      *(1-x+x^5-x^6+x^10-x^12+x^15-x^17+x^20-x^23+x^25-x^28+x^30-x^34+
         x^35-x^39+x^40)
      *(1-x+x^7-x^8+x^11-x^12+x^14-x^15+x^18-x^19+x^21-x^23+x^25-x^26+
         x^28-x^30+x^32-x^34+x^35-x^37+x^39-x^41+x^42-x^45+x^46-x^48+
         x^49-x^52+x^53-x^59+x^60)
      *(1+x+x^2+x^3+x^4)
      *(1+x+x^2+x^3+x^4+x^5+x^6)
      *(1+x+x^2+x^3+x^4+x^5+x^6+x^7+x^8+x^9+x^10)
      *(1+x+x^2+x^3+x^4-x^7-x^8-x^9-x^10-2*x^11-x^12-x^13-x^14-x^15+
         x^18+x^19+x^20+x^21+x^22+x^35+x^36+x^37+x^38+x^39-x^42-x^43-
         x^44-x^45-2*x^46-x^47-x^48-x^49-x^50+x^53+x^54+2*x^55+2*x^56+2
         *x^57+x^58+x^59-x^62-x^63-x^64-x^65-2*x^66-x^67-x^68-x^69+x^71
         +x^72+2*x^73+2*x^74+x^75+x^76+x^77-x^81-x^82-x^83-2*x^84-2*
         x^85-x^86-x^87-x^88+x^90+x^91+x^92+x^93+x^94+x^95+x^96-x^100-2
         *x^101-x^102-x^103-x^104+x^106+x^107+2*x^108+2*x^109+2*x^110+2
         *x^111+2*x^112+x^113+x^114-x^116-2*x^117-2*x^118-3*x^119-3*
         x^120-3*x^121-2*x^122-2*x^123-x^124+x^126+x^127+2*x^128+2*
         x^129+2*x^130+2*x^131+2*x^132+x^133+x^134-x^136-x^137-x^138-2*
         x^139-x^140+x^144+x^145+x^146+x^147+x^148+x^149+x^150-x^152-
         x^153-x^154-2*x^155-2*x^156-x^157-x^158-x^159+x^163+x^164+
         x^165+2*x^166+2*x^167+x^168+x^169-x^171-x^172-x^173-2*x^174-
         x^175-x^176-x^177-x^178+x^181+x^182+2*x^183+2*x^184+2*x^185+
         x^186+x^187-x^190-x^191-x^192-x^193-2*x^194-x^195-x^196-x^197-
         x^198+x^201+x^202+x^203+x^204+x^205+x^218+x^219+x^220+x^221+
         x^222-x^225-x^226-x^227-x^228-2*x^229-x^230-x^231-x^232-x^233+
         x^236+x^237+x^238+x^239+x^240);

real    0m20.503s
user    0m20.495s
sys     0m0.017s

GiNaC

factor(x^385-1);
exit;
ginsh - GiNaC Interactive Shell (GiNaC V1.7.2)
  __,  _______  Copyright (C) 1999-2017 Johannes Gutenberg University Mainz,
 (__) *       | Germany.  This is free software with ABSOLUTELY NO WARRANTY.
  ._) i N a C | You are welcome to redistribute it under certain conditions.
<-------------' For details type `warranty;'.

Type ?? for a list of help topics.
(1+x^2+x+x^3+x^4+x^5+x^6)*(1+x^9+x^2+x+x^3+x^8+x^10+x^4+x^5+x^6+x^7)*(-1+2*x^194-2*x^56-2*x^111-x^220+x^153+x^124+x^9-x^35+x^69+x^136-x^205+x^159-2*x^185-x^114+x^226-x^2-x^127-x-x^72-x^237+2*x^11+x^43+2*x^85-x^168+x^15-x^59+2*x^117+x^232-x^38-x^75-x^148+x^191+x^88+2*x^174-x^133+x^13+2*x^101+x^197-x^20+x^154+2*x^139-x^3+2*x^46-x^91-x^165+x^104+2*x^229+x^81-x^94-x^186-x^54-x^107-x^145+3*x^120-x^238+x^65-2*x^128+x^171+x^49+x^192-2*x^110-x^218+x^177-x^203+x^62+2*x^123+x^68-x^134+x^157+x^100+x^198-2*x^183-2*x^57-x^113-x^36-x^71+x^140-x^126+2*x^84-2*x^166+x^116+x^230+x^8-x^163-x^19-2*x^74-x^146-2*x^131+x^44+x^87+x^172+x^195+x^64-x^221+x^10-x^39-x^77+x^152-x^90+x^178+x^137+x^103-x^204+x^158+x^12+x^47-x^93-2*x^184+x^227-x^106+3*x^119-x^236-x^169+x^233-x^149-x^96+x^190+x^175+x^14-2*x^55-2*x^109+x^67-2*x^132-x^201+2*x^122+x^50+x^196+2*x^155-2*x^112-x^222+x^63-x^18+x^138-x^181+x^42+x^83-x^164-x^58+x^228-x^4-x^187-x^37-2*x^73-x^144-x^239-2*x^129-x^22+x^86+x^193+2*x^118-x^219-x^76-x^150+x^45+x^176+x^102-x^202+2*x^156-x^92-x^182+x^225-2*x^167-x^53+x^231-x^21+x^82-x^147+x^48-x^95+x^7-2*x^108+2*x^66-2*x^130+x^173+3*x^121-x^240)*(1+x^2+x+x^3+x^4)*(-1+x)*(1+x^25+x^35-x+x^11-x^15-x^59+x^46-x^41+x^49+x^28-x^8-x^19+x^39-x^23-x^52-x^12+x^60+x^14-x^34+x^18+x^42-x^37-x^30+x^32-x^45-x^26+x^53+x^21-x^48+x^7)*(1+x^25+x^35-x+x^15+x^20-x^28+x^10-x^39-x^23-x^12-x^34+x^30+x^5+x^40-x^6-x^17)*(-1+x+x^11+x^16+x^13-x^24+x^8-x^19-x^10+x^23-x^12-x^14+x^18-x^5+x^6-x^7-x^17)

real    0m28.743s
user    0m28.751s
sys     0m0.005s

Mathematica

$Version // Print;
Factor[x^385-1] // Print;
11.0.1 for Linux x86 (64-bit) (September 21, 2016)
(-1 + x)*(1 + x + x^2 + x^3 + x^4)*(1 + x + x^2 + x^3 + x^4 + x^5 + x^6)*(1 + x + x^2 + x^3 + x^4 + x^5 + x^6 + x^7 + x^8 + x^9 + x^10)*(1 - x + x^5 - x^6 + x^7 - x^8 + x^10 - x^11 + x^12 - x^13 + x^14 - x^16 + x^17 - x^18 + x^19 - x^23 + x^24)*(1 - x + x^5 - x^6 + x^10 - x^12 + x^15 - x^17 + x^20 - x^23 + x^25 - x^28 + x^30 - x^34 + x^35 - x^39 + x^40)*(1 - x + x^7 - x^8 + x^11 - x^12 + x^14 - x^15 + x^18 - x^19 + x^21 - x^23 + x^25 - x^26 + x^28 - x^30 + x^32 - x^34 + x^35 - x^37 + x^39 - x^41 + x^42 - x^45 + x^46 - x^48 + x^49 - x^52 + x^53 - x^59 + x^60)*(1 + x + x^2 + x^3 + x^4 - x^7 - x^8 - x^9 - x^10 - 2*x^11 - x^12 - x^13 - x^14 - x^15 + x^18 + x^19 + x^20 + x^21 + x^22 + x^35 + x^36 + x^37 + x^38 + x^39 - x^42 - x^43 - x^44 - x^45 - 2*x^46 - x^47 - x^48 - x^49 - x^50 + x^53 + x^54 + 2*x^55 + 2*x^56 + 2*x^57 + x^58 + x^59 - x^62 - x^63 - x^64 - x^65 - 2*x^66 - x^67 - x^68 - x^69 + x^71 + x^72 + 2*x^73 + 2*x^74 + x^75 + x^76 + x^77 - x^81 - x^82 - x^83 - 2*x^84 - 2*x^85 - x^86 - x^87 - x^88 + x^90 + x^91 + x^92 + x^93 + x^94 + x^95 + x^96 - x^100 - 2*x^101 - x^102 - x^103 - x^104 + x^106 + x^107 + 2*x^108 + 2*x^109 + 2*x^110 + 2*x^111 + 2*x^112 + x^113 + x^114 - x^116 - 2*x^117 - 2*x^118 - 3*x^119 - 3*x^120 - 3*x^121 - 2*x^122 - 2*x^123 - x^124 + x^126 + x^127 + 2*x^128 + 2*x^129 + 2*x^130 + 2*x^131 + 2*x^132 + x^133 + x^134 - x^136 - x^137 - x^138 - 2*x^139 - x^140 + x^144 + x^145 + x^146 + x^147 + x^148 + x^149 + x^150 - x^152 - x^153 - x^154 - 2*x^155 - 2*x^156 - x^157 - x^158 - x^159 + x^163 + x^164 + x^165 + 2*x^166 + 2*x^167 + x^168 + x^169 - x^171 - x^172 - x^173 - 2*x^174 - x^175 - x^176 - x^177 - x^178 + x^181 + x^182 + 2*x^183 + 2*x^184 + 2*x^185 + x^186 + x^187 - x^190 - x^191 - x^192 - x^193 - 2*x^194 - x^195 - x^196 - x^197 - x^198 + x^201 + x^202 + x^203 + x^204 + x^205 + x^218 + x^219 + x^220 + x^221 + x^222 - x^225 - x^226 - x^227 - x^228 - 2*x^229 - x^230 - x^231 - x^232 - x^233 + x^236 + x^237 + x^238 + x^239 + x^240)

real    0m0.663s
user    0m0.436s
sys     0m0.087s

Note that Mathematica takes ~ 0.6s for the startup.

Summary

FORM GiNaC Mathematica
Elapsed time 20.5s 28.7s 0.7s
Ratio to Mathematica 29.3 41.0 1.0
apik commented 7 years ago

One more example:

Mathematica: echo "Factor[x^385-1] // Print;" | math11.0

real    0m0.648s
user    0m0.379s
sys 0m0.137s

Pari/GP: echo "factor(x^385-1)" |gp -q -f

real    0m0.058s
user    0m0.050s
sys 0m0.008s
vermaseren commented 7 years ago

This is and example of the fact that for FORM we have only limited manpower for these kind of things. If you have a mathematician in the team who is specialized in these things, that part of the program will be better. If you have enough people you can built in more heuristics. I take it that x^n-1 for large values of n can be categorized as such. In view of the lack of much manpower, the design in FORM was to have it work as good as possible for the things we run into most of the time and not so much extended (semi)artificial benchmarks. Jan Kuipers told me once that the code in mathematica for Groebner basis was longer than the whole of FORM. We were at that time trying to see whether we could program Groebner bases and ran into similar things that on special examples mathematica was much faster. It just meant that they were not just following the regular algorithms, but had programmed lots of ‘shortcuts’.

Of course, if we see a way to improve algorithms, we will. And it would be really nice if somebody has a nice package that could be fit in in the form of a library. (like GMP and zlib). Till then, the factorization is just there to avoid that you have to call external programs with all the mess involved. It works, but unfortunately it is not the best.

I hope this explains the situation.

Jos

On 1 jun. 2017, at 12:02, Andrey Pikelner notifications@github.com wrote:

One more example:

Mathematica: echo "Factor[x^385-1] // Print;" | math11.0

real 0m0.648s user 0m0.379s sys 0m0.137s Pari/GP: echo "factor(x^385-1)" |gp -q -f

real 0m0.058s user 0m0.050s sys 0m0.008s — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/vermaseren/form/issues/189#issuecomment-305448543, or mute the thread https://github.com/notifications/unsubscribe-auth/AFLxEmk87sgK9rhH5enI-TVg_JGkWKhjks5r_ow2gaJpZM4NsSHY.

tueda commented 7 years ago

I guess now the problem here is not in heuristics but sounds just Berlekamp vs. Cantor-Zassenhaus, but agree that's the lack of manpower.

tueda commented 7 years ago

I found how to (roughly) factorize x^n - 1. It's a heuristic.

https://www.math.iupui.edu/~ccowen/OldCoursePages/Math300F07/300F07Quiz4.pdf