flintlib / flint

FLINT (Fast Library for Number Theory)
http://www.flintlib.org
GNU Lesser General Public License v3.0
444 stars 245 forks source link

fmpz_poly_factor slowness #907

Open fredrik-johansson opened 3 years ago

fredrik-johansson commented 3 years ago

This polynomial which came up in a calculation with roots of unity takes very long to factor with fmpz_poly_factor.

I expect a polynomial of this size to could take several seconds to factor, but in this case the running time is over 5 minutes on my machine, which seems excessive.

Indeed, Pari does it in a fraction of a second.

It looks like the power hack should be effective, but even without that, perhaps something is wrong.

#include "flint/fmpz_poly_factor.h"

const char * s = "1025  1 0 0 0 0 0 0 0 -1316593840 0 0 0 0 0 0 0 1254239907737033936 0 0 0 0 0 0 0 -1020500154868410158128259880 0 0 0 0 0 0 0 742657645113910721858495922505609288 0 0 0 0 0 0 0 -488851428152742328213334784814350395516520960 0 0 0 0 0 0 0 287814046590098536056537417927970868680931874624839972 0 0 0 0 0 0 0 -144731432788491934782720809053881299753255348063430296359463560 0 0 0 0 0 0 0 52667418446964455183555119787670793060682682657537813854472928617806356 0 0 0 0 0 0 0 -6627372146457374840476684171598633576102247456530952583168212566763674899080 0 0 0 0 0 0 0 229808850679653576215220762470947999111032180953859275098084786754118572707929408 0 0 0 0 0 0 0 47244447534423196727705721422913104255453865886015290840067262460660337554236681038680 0 0 0 0 0 0 0 -8598781279258932570943204125327149003097070415673151850246519910810714143570100274286748702 0 0 0 0 0 0 0 540701870204849447390226082954705836726722861074416261309121908402549511726244938504093747464120 0 0 0 0 0 0 0 30722682940581202352438291542486365134326454366699360707706667539312670299047066359657762413342965644 0 0 0 0 0 0 0 -10091249799704011701023139981079139597965800532808795363420077405279553345534029151131003234276688814706320 0 0 0 0 0 0 0 918854935715631662119487597903572632508236291734752388205157260039596655041407319588793267016427474581423827906 0 0 0 0 0 0 0 184683888599830925569838236218486017226349050180930626801417764157419922118806842745116335861703061673468951582720 0 0 0 0 0 0 0 13842363407336159924692947011620784296637244031515736813921285210411537345747604922653148099415836121720351511925108 0 0 0 0 0 0 0 -1058739682767542904517937750831982923025094134407798499065928299726409117391203782300287325011261537008384833677659600 0 0 0 0 0 0 0 -407542738934829659626614158410104545029716335746556910645965352195238540893456402145893421251610543862639471887617605448 0 0 0 0 0 0 0 -44889924527494986885728274555606181914473709099244648210778261610312708286920150136460334229335393800934091608922081355880 0 0 0 0 0 0 0 165417349118320753388592638510911804774504682990824901290000522105912383403100338190546796897583521863755885957200555663012 0 0 0 0 0 0 0 814412405263708371325588177235203431851941789505034023749589208830541984698894407942481613187277620340133907001400009204491520 0 0 0 0 0 0 0 123992527471432355160564250589830197691956102938592732030049632797499125856155573232226921154701080890746377691681553451697576671 0 0 0 0 0 0 0 4995378323406758931783045841002345498286261966641299704198046527783410717273853620526178991326503752120571659351121834410612924720 0 0 0 0 0 0 0 140404954012452700166032754112751928352146404977929750660562280585349441106457120658706160548084409690211168777015792605041999566604 0 0 0 0 0 0 0 3291958748356854814839201303157050704640505995802279716587412350138091954559656554050867468505907346150930414615677766933007107089120 0 0 0 0 0 0 0 68358068735718516094251777113158906503228424130736744644043641684736893009605404041996051022355485846242655399789381771096948472299642 0 0 0 0 0 0 0 1282306386020184808737647643430622466149755000723025386040637669197236598496794036041936715504886451665448809642377539617721547714704680 0 0 0 0 0 0 0 21670065179274072025590493783683657522136220939796197760878531190746040699892379265300410217321591478228854128907906097025008510978671260 0 0 0 0 0 0 0 320320892673865449140520806497828968812336027240346697743148253565570509773737848486512447709152756461760493393141110229512743345385272520 0 0 0 0 0 0 0 3770681492372589535772443418801792439601892744613610596885631884422304624590160838337719248912539399511791537828356826069607489092106765475 0 0 0 0 0 0 0 25578146281021022549714196244283852592377664968529282406009008504855974900214806612705262695934419688811125549909721430622090490564099109520 0 0 0 0 0 0 0 136532451550610274794572531886957325613115017164695108559892878606862140665461546639945604456179070121664107458980516889433942932107657304360 0 0 0 0 0 0 0 634354500600985650208961597367566323354754616527871965664244652489852835419595006992861388926689773896908331356062622301528448302400386714880 0 0 0 0 0 0 0 2652350342316948247509721732335736766450204880030919559449731909079217487320624432693341853097593824307800900993067602066467872178227316861100 0 0 0 0 0 0 0 10016805754795508549555492679815715688592055427939697906411175119280739379488315088149111507610542314998832232103935126619818232623476295908400 0 0 0 0 0 0 0 33492490360852407086505215597449499396287490285314596680997803553567655056225097476860319502453027166244974620341886638179910509121189771417440 0 0 0 0 0 0 0 93270889329796625252696195707129081208400203135825963629048516265147424086808048973491743106503752774649557065474591823729856744685751572005520 0 0 0 0 0 0 0 179360783435407659471746353648252923543348732113461558072268449720427607112453607603311589849530306742653286910020598786765831248123472613612686 0 0 0 0 0 0 0 67519322208583245541123300989202759025170069426989907956026708408762857944837588581306512315557581416065388801707677140612580715423352242897840 0 0 0 0 0 0 0 -151504820634729506015764983850359316249349869553893492519711651557225516840726173073213980888651573883349578610581533815473605142033370939104384 0 0 0 0 0 0 0 -313076423795317093101478492505267023306706219679931941626810097764385044334229566057999948506420142296074930260468553514711913245333739003794400 0 0 0 0 0 0 0 -222849643995828550305670629616667966457242510069508988784951065966041864100356120536296121570571742758977181766980901314230412670681131174063552 0 0 0 0 0 0 0 140609175462680958727584698888441722937221041186958820642533190885923307058675395030263152070266403613261570753056578070210019205618560862417120 0 0 0 0 0 0 0 533118189907130535370987175598659272183982500490260517432129126090825064207689223214625332298640672348701388460213338945234878735401033527579912 0 0 0 0 0 0 0 595539555697449750524871578123050094331388103507278348244949773102359574983204969741318800275350896499737864172963430803654793792978612447795280 0 0 0 0 0 0 0 217269783864194126882256725113353625739827856178389389423003099118245095965828957995566248431759773108082728584364251675689384937763489939790641 0 0 0 0 0 0 0 -172656619651759380208260995517028829992248330397670635260364417716030064953551312848787994206183577215328176656024348784643716222191875133659840 0 0 0 0 0 0 0 47102899306335665527423393903434389208790454002656409210278268009012546481658758375640120112452523685502661697080158096252742077298462329528120 0 0 0 0 0 0 0 -6575384075660829739879455978508946422399851727060710613351317690226313955301422709143620254199535254030882819339982320251256911043486661537480 0 0 0 0 0 0 0 94646289983763022194451385289262778180135982468932803496667105046572909398689843122236944677234731674848118751453338734008956568612724562620 0 0 0 0 0 0 0 172279933357640825734899067347870509257563015147752245808702877228370887567566027079106049242771073027124960113185611855478392986238493120560 0 0 0 0 0 0 0 -37763601566790690683081469289590517118000569861247651322246177354508656950685819778105729873061842174488506745029372130516397759321061838780 0 0 0 0 0 0 0 2235616776623738130109787007380039531997578891766526321954706384647043446790486712836734373441976884160929786200719306320140291792511663640 0 0 0 0 0 0 0 793664203179892643499956494355129152448512395887047323794449765989078049641049155054645512157205132285188600653749825608778929834785795190 0 0 0 0 0 0 0 -215802123061430663992460938671656374568218184290991538449388465585980840213305288232389356476824969855114403176864381843338886209225396120 0 0 0 0 0 0 0 33596191901940459180743646973641307473268813184661518588627722614392831515143157486393815915723527412543632124443361666209350815761157600 0 0 0 0 0 0 0 -4116784903346242780865114574283031704029652043622481798996191584343287407244851588245692966145994757298389873840069842060952845481519400 0 0 0 0 0 0 0 435456705125138504284062365126984912970621208321694876027105097874340168633100977381962195714908893583648146179359809529105819851556906 0 0 0 0 0 0 0 -41211058988045978728428621468362715847611410346997709361006693859199724416708626891010095071039539836757031540209296595649384830936760 0 0 0 0 0 0 0 3519657711316422320088106359899521053405946474421542253704770602905803469837293795940188931356393525916874252705485170453123653526396 0 0 0 0 0 0 0 -267682140338354204701729274907978539771795610080794464286539641745007766619745748037689776365742499996057162276726285915765208969840 0 0 0 0 0 0 0 17486767463327764961045258701964777848725763340661679067187673857103693210235814820616091643070855195700177041079584122216243728033 0 0 0 0 0 0 0 -936226024552376077332732801974495152912828735135458204793983877583636491893839275172317948176767935332870690905786208772117266720 0 0 0 0 0 0 0 43980625538306036531916366125566374095750905418435136086308829961380862248442592097129213136954040613537506421361887908302432612 0 0 0 0 0 0 0 -1847888686484404015504460460903708698385674581106131975310422507187427611006955521935708154059993006934790819636689320951305720 0 0 0 0 0 0 0 69311351930157471828783280351855283144584625031222533883732880618107015500753796928626282023849674069543909945678153651235976 0 0 0 0 0 0 0 -2238962309599340594612039124287772049435949896937490259596021300682934389779464703388983103688054964155172262512387385338360 0 0 0 0 0 0 0 55859930050486805831423010863502804739451723099262127194996955914684367190124785926287186367288567413459034163303711729616 0 0 0 0 0 0 0 -655855774962382890607107927269584141103256010635684812792331524729613596308402059553046721986568490518750808534252529640 0 0 0 0 0 0 0 -25053450870766921769751541306230130275319318316382888001579133134278556161745265417905576517700531601415107919053079919 0 0 0 0 0 0 0 1550256920763329664652962415253082082952640952421308071388853542127441807321351356020108019038330031155968382585812760 0 0 0 0 0 0 0 -17179563116769713974306461170386105618477254074245597996454513601919945016610714241641890478771539109259477941801132 0 0 0 0 0 0 0 -503980570874305053962281395108370840508521018857040418386394136001670429968785206008573056364222133487845503869480 0 0 0 0 0 0 0 36270002682418879306332121435240535657302844640603910146763474593398176883631295537466848408827915248314886354652 0 0 0 0 0 0 0 -1005066768795370063122191685094790077178711508213445206615659736148095121514604818035239517323783677323602661200 0 0 0 0 0 0 0 14228874597153260054998962366633053425371991921715966023767267725696303046549865617811016878613412368498277336 0 0 0 0 0 0 0 -5862509566587541337167738216442724121225083462131438439905621575600658466285725973197899891444782618533640 0 0 0 0 0 0 0 -2931532303346894960609087241175052939264265787576733651692477497331495631967608546529160184194481268864342 0 0 0 0 0 0 0 -7468246143699566034295187371803225132340168033207643526273526159877183127763711431907101554753517953040 0 0 0 0 0 0 0 1209711582800893039825283175898228042966056146144319276960495326666881829985858434130008103960619840548 0 0 0 0 0 0 0 32019147617280543932196717157984390913752948822244954909266314873528910627006140559115175267170232120 0 0 0 0 0 0 0 447521737534317492444288088258381885475797786076457763040208483311063281731737021274403177398478644 0 0 0 0 0 0 0 3619933202240649653033346008369295165422562102182651547937259578668252789900922988788021687137240 0 0 0 0 0 0 0 8050064488018729151462339856111631046517780827312137836271346836930611307404484829702478692976 0 0 0 0 0 0 0 -308530499238464308428693550682603776882399246778071636865835094591256132711388278469610197000 0 0 0 0 0 0 0 -2640091714889015672034836755870686530031446397197408663415583374251654189131253216403734277 0 0 0 0 0 0 0 -2831819528157978596920038315177980095186714412079621478868752275793484688975868478063400 0 0 0 0 0 0 0 314230569868678700635974730317815999844275368325820942451233927801611638014432808871852 0 0 0 0 0 0 0 6294254752656533725277244671347978337047682974969578829368107730542619991547227136680 0 0 0 0 0 0 0 85695066323593813323703205501996317534563203818376637717785651588791512963743361592 0 0 0 0 0 0 0 974239405315581540687308811447321641811068025585255561095556332776436693667752640 0 0 0 0 0 0 0 9675871904559028604265143300516813216001262793658542967856231029595010364120416 0 0 0 0 0 0 0 83701863898665774320081877611824807427174277563855175612570936305723229339720 0 0 0 0 0 0 0 617537379809090078104381576087417066453917551770876232906907331076117850994 0 0 0 0 0 0 0 3574235591053361200017317766465265449092582815103807557135838412864037120 0 0 0 0 0 0 0 18742283361460351971507129663307001972952523266753007979587467990897212 0 0 0 0 0 0 0 79926801233558336101632363032409541162633886233717940961935444754840 0 0 0 0 0 0 0 278317044424014125290972505330443426361661443725207489681051756836 0 0 0 0 0 0 0 595556753766118721166827707508804100930378696237185378102164440 0 0 0 0 0 0 0 -1123929866449700005822134634786062280516045736903185986641664 0 0 0 0 0 0 0 -15543988167802923071382996954183921355820597625467509809000 0 0 0 0 0 0 0 18734698177337190630021423706284300262258726233007820163 0 0 0 0 0 0 0 1641926785967755443096319015903094464473958105208672040 0 0 0 0 0 0 0 18974289853233275490741934421425615066686880662863172 0 0 0 0 0 0 0 93589393564548981309025997454464543434373784295560 0 0 0 0 0 0 0 438587492515171875814674508645656652173531183036 0 0 0 0 0 0 0 969728826997938263721058988822080116694944960 0 0 0 0 0 0 0 568202163107714805611747513052169229966376 0 0 0 0 0 0 0 -2900386702931095260622675125256164425160 0 0 0 0 0 0 0 -498134498450281204667589814085773329 0 0 0 0 0 0 0 31407379805272625008005333378660000 0 0 0 0 0 0 0 -165630541583192857898167021769652 0 0 0 0 0 0 0 1256384343756041675324239811760 0 0 0 0 0 0 0 65311855794655369986497822712 0 0 0 0 0 0 0 76959810858457422980573640 0 0 0 0 0 0 0 1347337116472878007460196 0 0 0 0 0 0 0 1330221808950339863760 0 0 0 0 0 0 0 -1262875376901921329 0 0 0 0 0 0 0 236113639631600 0 0 0 0 0 0 0 139700339276 0 0 0 0 0 0 0 -7175816880 0 0 0 0 0 0 0 206926258 0 0 0 0 0 0 0 -2070360 0 0 0 0 0 0 0 52292 0 0 0 0 0 0 0 -40 0 0 0 0 0 0 0 1";

int main()
{
    fmpz_poly_factor_t fac;
    fmpz_poly_t poly;

    fmpz_poly_init(poly);
    fmpz_poly_set_str(poly, s);

    fmpz_poly_factor(fac, poly);
}

Output:

1
33  1 -24 296 -1392 2804 864 -2056 -2976 22770 -45744 51052 -52176 40024 -1440 -8496 14464 -12680 -1640 828 -768 706 592 136 -96 27 -8 -20 -16 8 0 0 0 1 ^ 1
33  1 24 280 1216 3188 1624 5832 11008 1010 -18296 -22156 -13792 31192 20744 7088 800 -15752 -3952 -92 1088 1986 -96 56 -320 27 -40 52 0 8 -8 0 0 1 ^ 1
33  1 24 296 1392 2804 -864 -2056 2976 22770 45744 51052 52176 40024 1440 -8496 -14464 -12680 1640 828 768 706 -592 136 96 27 8 -20 16 8 0 0 0 1 ^ 1
33  1 -24 280 -1216 3188 -1624 5832 -11008 1010 18296 -22156 13792 31192 -20744 7088 -800 -15752 3952 -92 -1088 1986 96 56 320 27 40 52 0 8 8 0 0 1 ^ 1
33  1 -16 312 -3440 27084 -163184 779576 -3031184 9795698 -26752736 62558992 -126532704 223090504 -344878016 469474464 -564415232 600344920 -565376616 471301004 -347331096 225782214 -129022456 64511964 -28048776 10518299 -3365856 906192 -201376 35960 -4960 496 -32 1 ^ 1
33  1 16 312 3440 27084 163184 779576 3031184 9795698 26752736 62558992 126532704 223090504 344878016 469474464 564415232 600344920 565376616 471301004 347331096 225782214 129022456 64511964 28048776 10518299 3365856 906192 201376 35960 4960 496 32 1 ^ 1
33  1 -16 -56 944 4044 -944 -14904 816 27250 -24 -13456 4152 9032 3040 9568 -1248 12888 -2776 12052 -2184 8326 -1232 4420 -432 1819 -88 560 -8 120 0 16 0 1 ^ 1
33  1 16 -56 -944 4044 944 -14904 -816 27250 24 -13456 -4152 9032 -3040 9568 1248 12888 2776 12052 2184 8326 1232 4420 432 1819 88 560 8 120 0 16 0 1 ^ 1
33  1 -16 196 -448 -1606 -864 8824 31616 45805 48824 50512 49376 37974 13840 18704 14016 7600 1600 -1432 368 2006 928 136 -64 2 8 80 16 -2 0 0 0 1 ^ 1
33  1 16 196 448 -1606 864 8824 -31616 45805 -48824 50512 -49376 37974 -13840 18704 -14016 7600 -1600 -1432 -368 2006 -928 136 64 2 -8 80 -16 -2 0 0 0 1 ^ 1
33  1 24 220 736 3898 7144 4872 10368 21485 984 26224 54208 -5418 14784 62448 -3200 3248 38408 -552 -352 14166 -16 -264 3200 2 -40 432 0 -2 32 0 0 1 ^ 1
33  1 -24 220 -736 3898 -7144 4872 -10368 21485 -984 26224 -54208 -5418 -14784 62448 3200 3248 -38408 -552 352 14166 16 -264 -3200 2 40 432 0 -2 -32 0 0 1 ^ 1
33  1 -16 264 -1816 6894 -12304 11296 -5344 125 5456 -4856 10712 4322 15760 20368 1792 -8192 -3616 32 1176 786 48 20 -352 34 72 -20 -8 10 0 -4 0 1 ^ 1
33  1 16 264 1816 6894 12304 11296 5344 125 -5456 -4856 -10712 4322 -15760 20368 -1792 -8192 3616 32 -1176 786 -48 20 352 34 -72 -20 8 10 0 -4 0 1 ^ 1
33  1 -16 -8 520 4334 10416 17696 28096 32893 -5296 -81288 -129744 -102046 -15736 72944 111408 98880 65624 39264 27104 21714 17504 13004 8784 5474 3144 1652 784 330 120 36 8 1 ^ 1
33  1 16 -8 -520 4334 -10416 17696 -28096 32893 5296 -81288 129744 -102046 15736 72944 -111408 98880 -65624 39264 -27104 21714 -17504 13004 -8784 5474 -3144 1652 -784 330 -120 36 -8 1 ^ 1
33  1 24 264 1104 3964 4016 8636 7296 -2035 17056 9144 -19888 15742 -4000 4388 -7008 -9572 1424 3652 416 736 -752 20 448 34 -128 -20 -8 10 0 -4 0 1 ^ 1
33  1 -24 264 -1104 3964 -4016 8636 -7296 -2035 -17056 9144 19888 15742 4000 4388 7008 -9572 -1424 3652 -416 736 752 20 -448 34 128 -20 8 10 0 -4 0 1 ^ 1
33  1 24 312 1640 3004 -3024 -14364 -23424 -26227 -10936 49992 135696 182014 149184 72764 9168 -10980 2344 19804 27104 24864 18944 13324 8824 5474 3144 1652 784 330 120 36 8 1 ^ 1
33  1 -24 312 -1640 3004 3024 -14364 23424 -26227 10936 49992 -135696 182014 -149184 72764 -9168 -10980 -2344 19804 -27104 24864 -18944 13324 -8824 5474 -3144 1652 -784 330 -120 36 -8 1 ^ 1
33  1 16 296 2848 18244 81664 258944 587904 961325 1128936 938692 536144 196394 31680 -18976 -27136 -19300 -8000 -1632 192 656 672 336 64 2 -8 -20 -16 -2 0 0 0 1 ^ 1
33  1 -16 296 -2848 18244 -81664 258944 -587904 961325 -1128936 938692 -536144 196394 -31680 -18976 27136 -19300 8000 -1632 -192 656 -672 336 -64 2 8 -20 16 -2 0 0 0 1 ^ 1
33  1 -16 -40 736 3908 7224 14592 17088 35885 18064 43484 -10272 20522 -21496 9248 -6240 7068 -3912 288 -992 1936 104 -464 -320 2 40 52 0 -2 -8 0 0 1 ^ 1
33  1 16 -40 -736 3908 -7224 14592 -17088 35885 -18064 43484 10272 20522 21496 9248 6240 7068 3912 288 992 1936 -104 -464 320 2 -40 52 0 -2 8 0 0 1 ^ 1
33  1 -16 280 -2304 11578 -34856 59992 -55712 27730 -18456 23844 -12352 -868 -4416 7528 1440 -2997 -3512 2148 4288 1936 -96 -464 -320 2 40 52 0 -2 -8 0 0 1 ^ 1
33  1 16 280 2304 11578 34856 59992 55712 27730 18456 23844 12352 -868 4416 7528 -1440 -2997 3512 2148 -4288 1936 96 -464 320 2 -40 52 0 -2 8 0 0 1 ^ 1
33  1 16 -24 -592 4154 -10016 24744 -31456 43090 -45744 48252 -48416 34204 -4480 2584 -14016 10315 -4720 1148 -288 16 592 336 -96 2 32 -20 -16 -2 0 0 0 1 ^ 1
33  1 -16 -24 592 4154 10016 24744 31456 43090 45744 48252 48416 34204 4480 2584 14016 10315 4720 1148 288 16 -592 336 96 2 -32 -20 16 -2 0 0 0 1 ^ 1
33  1 24 244 944 4014 3616 2256 -16544 -5950 -9584 24824 -1448 8852 -960 -6852 13392 -2117 -576 6632 -1384 -864 1248 20 -352 34 72 -20 -8 10 0 -4 0 1 ^ 1
33  1 -24 244 -944 4014 -3616 2256 16544 -5950 9584 24824 1448 8852 960 -6852 -13392 -2117 576 6632 1384 -864 -1248 20 352 34 -72 -20 8 10 0 -4 0 1 ^ 1
33  1 -16 172 -80 -2386 -3024 12656 45496 74818 79544 49992 -23904 -98476 -116536 -66076 9168 56955 65624 51864 35504 24864 18104 13004 8784 5474 3144 1652 784 330 120 36 8 1 ^ 1
33  1 16 172 80 -2386 3024 12656 -45496 74818 -79544 49992 23904 -98476 116536 -66076 -9168 56955 -65624 51864 -35504 24864 -18104 13004 -8784 5474 -3144 1652 -784 330 -120 36 -8 1 ^ 1

real    5m4,157s
user    5m3,986s
sys 0m0,124s
fredrik-johansson commented 3 years ago

Both NTL and Pari do it fast, so it is not just Pari that is lucky here:

sage: %timeit pol._factor_ntl()
1 loop, best of 3: 427 ms per loop
sage: %timeit pol._factor_pari()
1 loop, best of 3: 284 ms per loop
fredrik-johansson commented 3 years ago

For reference, the original polynomial has 256 local factors.

What if we deflate it?

void
fmpz_poly_factor_with_deflation(fmpz_poly_factor_t fac, const fmpz_poly_t poly)
{
    slong i, j, d;

    d = fmpz_poly_deflation(poly);

    if (d > 1)
    {
        fmpz_poly_t g;
        fmpz_poly_factor_t gfac;

        fmpz_poly_init(g);
        fmpz_poly_factor_init(gfac);

        fmpz_poly_deflate(g, poly, d);
        fmpz_poly_factor(gfac, g);

        for (i = 0; i < gfac->num; i++)
        {
            fmpz_poly_factor_t hfac;
            fmpz_poly_factor_init(hfac);
            fmpz_poly_inflate(gfac->p + i, gfac->p + i, d);
            fmpz_poly_factor(hfac, gfac->p + i);

            for (j = 0; j < hfac->num; j++)
                fmpz_poly_factor_insert(fac, hfac->p + j, hfac->exp[j]);

            fmpz_poly_factor_clear(hfac);
        }

        fmpz_poly_factor_clear(gfac);
        fmpz_poly_clear(g);
    }
    else
    {
        fmpz_poly_factor(fac, poly);
    }
}

int main()
{
    fmpz_poly_factor_t fac;
    fmpz_poly_t poly;

    fmpz_poly_init(poly);
    fmpz_poly_factor_init(fac);
    fmpz_poly_set_str(poly, s);

    fmpz_poly_factor_with_deflation(fac, poly);
    fmpz_poly_factor_print(fac);
}

Output:

1
33  1 -16 264 -1816 6894 -12304 11296 -5344 125 5456 -4856 10712 4322 15760 20368 1792 -8192 -3616 32 1176 786 48 20 -352 34 72 -20 -8 10 0 -4 0 1 ^ 1
33  1 16 264 1816 6894 12304 11296 5344 125 -5456 -4856 -10712 4322 -15760 20368 -1792 -8192 3616 32 -1176 786 -48 20 352 34 -72 -20 8 10 0 -4 0 1 ^ 1
33  1 -16 -8 520 4334 10416 17696 28096 32893 -5296 -81288 -129744 -102046 -15736 72944 111408 98880 65624 39264 27104 21714 17504 13004 8784 5474 3144 1652 784 330 120 36 8 1 ^ 1
33  1 16 -8 -520 4334 -10416 17696 -28096 32893 5296 -81288 129744 -102046 15736 72944 -111408 98880 -65624 39264 -27104 21714 -17504 13004 -8784 5474 -3144 1652 -784 330 -120 36 -8 1 ^ 1
33  1 -16 196 -448 -1606 -864 8824 31616 45805 48824 50512 49376 37974 13840 18704 14016 7600 1600 -1432 368 2006 928 136 -64 2 8 80 16 -2 0 0 0 1 ^ 1
33  1 16 196 448 -1606 864 8824 -31616 45805 -48824 50512 -49376 37974 -13840 18704 -14016 7600 -1600 -1432 -368 2006 -928 136 64 2 -8 80 -16 -2 0 0 0 1 ^ 1
33  1 -24 220 -736 3898 -7144 4872 -10368 21485 -984 26224 -54208 -5418 -14784 62448 3200 3248 -38408 -552 352 14166 16 -264 -3200 2 40 432 0 -2 -32 0 0 1 ^ 1
33  1 24 220 736 3898 7144 4872 10368 21485 984 26224 54208 -5418 14784 62448 -3200 3248 38408 -552 -352 14166 -16 -264 3200 2 -40 432 0 -2 32 0 0 1 ^ 1
33  1 -16 280 -2304 11578 -34856 59992 -55712 27730 -18456 23844 -12352 -868 -4416 7528 1440 -2997 -3512 2148 4288 1936 -96 -464 -320 2 40 52 0 -2 -8 0 0 1 ^ 1
33  1 -16 -24 592 4154 10016 24744 31456 43090 45744 48252 48416 34204 4480 2584 14016 10315 4720 1148 288 16 -592 336 96 2 -32 -20 16 -2 0 0 0 1 ^ 1
33  1 16 280 2304 11578 34856 59992 55712 27730 18456 23844 12352 -868 4416 7528 -1440 -2997 3512 2148 -4288 1936 96 -464 320 2 -40 52 0 -2 8 0 0 1 ^ 1
33  1 16 -24 -592 4154 -10016 24744 -31456 43090 -45744 48252 -48416 34204 -4480 2584 -14016 10315 -4720 1148 -288 16 592 336 -96 2 32 -20 -16 -2 0 0 0 1 ^ 1
33  1 -24 244 -944 4014 -3616 2256 16544 -5950 9584 24824 1448 8852 960 -6852 -13392 -2117 576 6632 1384 -864 -1248 20 352 34 -72 -20 8 10 0 -4 0 1 ^ 1
33  1 -16 172 -80 -2386 -3024 12656 45496 74818 79544 49992 -23904 -98476 -116536 -66076 9168 56955 65624 51864 35504 24864 18104 13004 8784 5474 3144 1652 784 330 120 36 8 1 ^ 1
33  1 24 244 944 4014 3616 2256 -16544 -5950 -9584 24824 -1448 8852 -960 -6852 13392 -2117 -576 6632 -1384 -864 1248 20 -352 34 72 -20 -8 10 0 -4 0 1 ^ 1
33  1 16 172 80 -2386 3024 12656 -45496 74818 -79544 49992 23904 -98476 116536 -66076 -9168 56955 -65624 51864 -35504 24864 -18104 13004 -8784 5474 -3144 1652 -784 330 -120 36 -8 1 ^ 1
33  1 16 312 3440 27084 163184 779576 3031184 9795698 26752736 62558992 126532704 223090504 344878016 469474464 564415232 600344920 565376616 471301004 347331096 225782214 129022456 64511964 28048776 10518299 3365856 906192 201376 35960 4960 496 32 1 ^ 1
33  1 -16 312 -3440 27084 -163184 779576 -3031184 9795698 -26752736 62558992 -126532704 223090504 -344878016 469474464 -564415232 600344920 -565376616 471301004 -347331096 225782214 -129022456 64511964 -28048776 10518299 -3365856 906192 -201376 35960 -4960 496 -32 1 ^ 1
33  1 16 -56 -944 4044 944 -14904 -816 27250 24 -13456 -4152 9032 -3040 9568 1248 12888 2776 12052 2184 8326 1232 4420 432 1819 88 560 8 120 0 16 0 1 ^ 1
33  1 -16 -56 944 4044 -944 -14904 816 27250 -24 -13456 4152 9032 3040 9568 -1248 12888 -2776 12052 -2184 8326 -1232 4420 -432 1819 -88 560 -8 120 0 16 0 1 ^ 1
33  1 -24 280 -1216 3188 -1624 5832 -11008 1010 18296 -22156 13792 31192 -20744 7088 -800 -15752 3952 -92 -1088 1986 96 56 320 27 40 52 0 8 8 0 0 1 ^ 1
33  1 -24 296 -1392 2804 864 -2056 -2976 22770 -45744 51052 -52176 40024 -1440 -8496 14464 -12680 -1640 828 -768 706 592 136 -96 27 -8 -20 -16 8 0 0 0 1 ^ 1
33  1 24 296 1392 2804 -864 -2056 2976 22770 45744 51052 52176 40024 1440 -8496 -14464 -12680 1640 828 768 706 -592 136 96 27 8 -20 16 8 0 0 0 1 ^ 1
33  1 24 280 1216 3188 1624 5832 11008 1010 -18296 -22156 -13792 31192 20744 7088 800 -15752 -3952 -92 1088 1986 -96 56 -320 27 -40 52 0 8 -8 0 0 1 ^ 1
33  1 -16 -40 736 3908 7224 14592 17088 35885 18064 43484 -10272 20522 -21496 9248 -6240 7068 -3912 288 -992 1936 104 -464 -320 2 40 52 0 -2 -8 0 0 1 ^ 1
33  1 -16 296 -2848 18244 -81664 258944 -587904 961325 -1128936 938692 -536144 196394 -31680 -18976 27136 -19300 8000 -1632 -192 656 -672 336 -64 2 8 -20 16 -2 0 0 0 1 ^ 1
33  1 16 -40 -736 3908 -7224 14592 -17088 35885 -18064 43484 10272 20522 21496 9248 6240 7068 3912 288 992 1936 -104 -464 320 2 -40 52 0 -2 8 0 0 1 ^ 1
33  1 16 296 2848 18244 81664 258944 587904 961325 1128936 938692 536144 196394 31680 -18976 -27136 -19300 -8000 -1632 192 656 672 336 64 2 -8 -20 -16 -2 0 0 0 1 ^ 1
33  1 24 264 1104 3964 4016 8636 7296 -2035 17056 9144 -19888 15742 -4000 4388 -7008 -9572 1424 3652 416 736 -752 20 448 34 -128 -20 -8 10 0 -4 0 1 ^ 1
33  1 24 312 1640 3004 -3024 -14364 -23424 -26227 -10936 49992 135696 182014 149184 72764 9168 -10980 2344 19804 27104 24864 18944 13324 8824 5474 3144 1652 784 330 120 36 8 1 ^ 1
33  1 -24 264 -1104 3964 -4016 8636 -7296 -2035 -17056 9144 19888 15742 4000 4388 7008 -9572 -1424 3652 -416 736 752 20 -448 34 128 -20 8 10 0 -4 0 1 ^ 1
33  1 -24 312 -1640 3004 3024 -14364 23424 -26227 10936 49992 -135696 182014 -149184 72764 -9168 -10980 -2344 19804 -27104 24864 -18944 13324 -8824 5474 -3144 1652 -784 330 -120 36 -8 1 ^ 1

real    0m1,176s
user    0m1,171s
sys 0m0,004s

Much better, but still 3x slower than NTL and 5x slower than Pari. Are we missing something?

I assume that if we want put automatic deflation into fmpz_poly_factor, it will have to be implemented more intelligently. But I don't know the strategies here.

wbhart commented 3 years ago

I don't know any strategies. There are examples where either approach would be slower. But if Pari/GP uses it automatically, I guess we better too.

wbhart commented 3 years ago

How long does this one spend in the root finding?

fredrik-johansson commented 3 years ago

The CLD bound root finding? The original, I don't know. With deflation, it spends most of its time in LLL.

fredrik-johansson commented 3 years ago

Callgrind output (with deflation):

8,016,598,633  ???:0x0000000000001090 [/lib/x86_64-linux-gnu/ld-2.27.so]
8,016,029,577  ???:_start [/home/fredrik/src/test/a.out]
8,016,029,566  /build/glibc-S7xCS9/glibc-2.27/csu/../csu/libc-start.c:(below main) [/lib/x86_64-linux-gnu/libc-2.27.so]
8,016,026,090  /home/fredrik/src/test/slowpoly.c:main
8,016,026,090  slowpoly.c:main [/home/fredrik/src/test/a.out]
8,011,687,013  slowpoly.c:fmpz_poly_factor_with_deflation [/home/fredrik/src/test/a.out]
8,011,340,081  /home/fredrik/src/flint2/fmpz_poly_factor/factor.c:fmpz_poly_factor [/home/fredrik/src/flint2/libflint.so.15.0.0]
7,693,162,566  /home/fredrik/src/flint2/fmpz_poly_factor/factor_zassenhaus.c:_fmpz_poly_factor_zassenhaus [/home/fredrik/src/flint2/libflint.so.15.0.0]
7,524,390,018  /home/fredrik/src/flint2/fmpz_poly_factor/factor_van_hoeij.c:fmpz_poly_factor_van_hoeij [/home/fredrik/src/flint2/libflint.so.15.0.0]
7,076,835,235  /home/fredrik/src/flint2/fmpz_lll/wrapper_with_removal_knapsack.c:fmpz_lll_wrapper_with_removal_knapsack [/home/fredrik/src/flint2/libflint.so.15.0.0]
5,599,816,394  /home/fredrik/src/flint2/fmpz_lll/is_reduced_with_removal.c:fmpz_lll_is_reduced_with_removal [/home/fredrik/src/flint2/libflint.so.15.0.0]
3,268,942,802  /home/fredrik/src/flint2/fmpz_lll/is_reduced_mpfr_with_removal.c:fmpz_lll_is_reduced_mpfr_with_removal [/home/fredrik/src/flint2/libflint.so.15.0.0]
2,872,394,384  /home/fredrik/src/flint2/mpfr_mat/mul_classical.c:mpfr_mat_mul_classical [/home/fredrik/src/flint2/libflint.so.15.0.0]
1,477,016,351  /home/fredrik/src/flint2/fmpz_lll/d_lll.c:fmpz_lll_d_with_removal_knapsack [/home/fredrik/src/flint2/libflint.so.15.0.0]
1,416,154,153  ???:mpfr_add [/usr/local/lib/libmpfr.so.6.0.1]
1,356,663,993  ???:mpfr_mul [/usr/local/lib/libmpfr.so.6.0.1]
1,229,747,419  /home/fredrik/src/flint2/fmpz_lll/babai.c:fmpz_lll_check_babai [/home/fredrik/src/flint2/libflint.so.15.0.0]
1,131,483,136  /home/fredrik/src/flint2/fmpz_mat/is_reduced_with_removal.c:fmpz_mat_is_reduced_with_removal [/home/fredrik/src/flint2/libflint.so.15.0.0]
1,101,834,458  ???:__gmpn_gcd_22'2 [/usr/local/lib/libgmp.so.10.4.1]
  953,068,994  /build/glibc-S7xCS9/glibc-2.27/elf/../sysdeps/x86_64/dl-trampoline.h:_dl_runtime_resolve_xsave [/lib/x86_64-linux-gnu/ld-2.27.so]
  743,846,766  /home/fredrik/src/flint2/fmpq/submul.c:fmpq_submul [/home/fredrik/src/flint2/libflint.so.15.0.0]
  737,541,774  /home/fredrik/src/flint2/fmpq/submul.c:_fmpq_submul [/home/fredrik/src/flint2/libflint.so.15.0.0]
  662,097,727  /home/fredrik/src/flint2/mpfr_mat/mul_classical.c:mpfr_mat_mul_classical'2 [/home/fredrik/src/flint2/libflint.so.15.0.0]
  627,296,348  /home/fredrik/src/flint2/fmpz_vec/scalar_submul_si.c:_fmpz_vec_scalar_submul_si [/home/fredrik/src/flint2/libflint.so.15.0.0]
  564,919,576  /home/fredrik/src/flint2/fmpz_lll/is_reduced_d_with_removal.c:fmpz_lll_is_reduced_d_with_removal [/home/fredrik/src/flint2/libflint.so.15.0.0]
  533,742,000  /home/fredrik/src/flint2/fmpz_poly/hensel_lift_tree_recursive.c:fmpz_poly_hensel_lift_tree_recursive'2 [/home/fredrik/src/flint2/libflint.so.15.0.0]
  501,442,976  /build/glibc-S7xCS9/glibc-2.27/elf/../sysdeps/x86_64/dl-trampoline.h:_dl_runtime_resolve_xsave'2 [/lib/x86_64-linux-gnu/ld-2.27.so]
  479,819,014  /home/fredrik/src/flint2/d_mat/mul_classical.c:d_mat_mul_classical [/home/fredrik/src/flint2/libflint.so.15.0.0]
  445,452,619  /home/fredrik/src/flint2/fmpq/mul.c:_fmpq_mul [/home/fredrik/src/flint2/libflint.so.15.0.0]
  438,399,440  /home/fredrik/src/flint2/fmpz_poly/hensel_lift_tree.c:fmpz_poly_hensel_lift_tree [/home/fredrik/src/flint2/libflint.so.15.0.0]
  438,386,030  /home/fredrik/src/flint2/fmpz_poly/hensel_lift_tree_recursive.c:fmpz_poly_hensel_lift_tree_recursive [/home/fredrik/src/flint2/libflint.so.15.0.0]
  413,224,744  /home/fredrik/src/flint2/fmpq_vec/dot.c:_fmpq_vec_dot [/home/fredrik/src/flint2/libflint.so.15.0.0]
  407,771,352  /home/fredrik/src/flint2/fmpq/addmul.c:fmpq_addmul [/home/fredrik/src/flint2/libflint.so.15.0.0]
  400,950,224  /home/fredrik/src/flint2/fmpq/addmul.c:_fmpq_addmul [/home/fredrik/src/flint2/libflint.so.15.0.0]
  392,426,537  /home/fredrik/src/flint2/fmpz_poly/hensel_start_lift.c:_fmpz_poly_hensel_start_lift [/home/fredrik/src/flint2/libflint.so.15.0.0]
  379,691,871  /home/fredrik/src/flint2/fmpq/sub.c:_fmpq_sub [/home/fredrik/src/flint2/libflint.so.15.0.0]
  320,092,560  /home/fredrik/src/flint2/fmpz/submul_ui.c:fmpz_submul_ui [/home/fredrik/src/flint2/libflint.so.15.0.0]
  306,989,315  /home/fredrik/src/flint2/fmpz_mod_poly/divrem_divconquer.c:_fmpz_mod_poly_divrem_divconquer [/home/fredrik/src/flint2/libflint.so.15.0.0]
  298,317,615  /home/fredrik/src/flint2/mpfr_mat/init.c:mpfr_mat_init [/home/fredrik/src/flint2/libflint.so.15.0.0]
  297,778,765  /home/fredrik/src/flint2/fmpz_poly/hensel_lift.c:fmpz_poly_hensel_lift [/home/fredrik/src/flint2/libflint.so.15.0.0]
  297,449,193  /home/fredrik/src/flint2/fmpz_poly/hensel_lift.c:_fmpz_poly_hensel_lift [/home/fredrik/src/flint2/libflint.so.15.0.0]
  296,850,278  /home/fredrik/src/flint2/fmpz/gcd.c:fmpz_gcd [/home/fredrik/src/flint2/libflint.so.15.0.0]
  293,910,135  /home/fredrik/src/flint2/fmpz_mod_poly/divrem_divconquer.c:__fmpz_mod_poly_divrem_divconquer [/home/fredrik/src/flint2/libflint.so.15.0.0]
  284,247,326  /build/glibc-S7xCS9/glibc-2.27/elf/../sysdeps/x86_64/tls_get_addr.S:__tls_get_addr [/lib/x86_64-linux-gnu/ld-2.27.so]
  283,446,494  ???:mpfr_add1sp [/usr/local/lib/libmpfr.so.6.0.1]
  274,538,130  ???:mpfr_init2 [/usr/local/lib/libmpfr.so.6.0.1]
  274,290,560  /home/fredrik/src/flint2/fmpz_poly/hensel_lift_without_inverse.c:_fmpz_poly_hensel_lift_without_inverse [/home/fredrik/src/flint2/libflint.so.15.0.0]
  264,825,458  /home/fredrik/src/flint2/fmpz/addmul_ui.c:fmpz_addmul_ui [/home/fredrik/src/flint2/libflint.so.15.0.0]
  253,388,510  ???:mpfr_set4 [/usr/local/lib/libmpfr.so.6.0.1]
  245,682,040  ???:mpfr_allocate_func [/usr/local/lib/libmpfr.so.6.0.1]
  243,694,888  /home/fredrik/src/flint2/fmpz_vec/get_d_vec_2exp.c:_fmpz_vec_get_d_vec_2exp [/home/fredrik/src/flint2/libflint.so.15.0.0]
  237,870,824  /home/fredrik/src/flint2/nmod_poly_factor/factor_equal_deg.c:nmod_poly_factor_equal_deg'2 [/home/fredrik/src/flint2/libflint.so.15.0.0]
  236,620,514  /home/fredrik/src/flint2/fmpz_mod_poly/divrem_divconquer_recursive.c:_fmpz_mod_poly_divrem_divconquer_recursive [/home/fredrik/src/flint2/libflint.so.15.0.0]
  234,407,496  /build/glibc-S7xCS9/glibc-2.27/math/./s_ldexp_template.c:ldexp [/lib/x86_64-linux-gnu/libc-2.27.so]
  226,621,414  /build/glibc-S7xCS9/glibc-2.27/malloc/malloc.c:malloc [/lib/x86_64-linux-gnu/libc-2.27.so]
  214,431,334  /home/fredrik/src/flint2/fmpq/add.c:_fmpq_add [/home/fredrik/src/flint2/libflint.so.15.0.0]
  211,645,660  ???:__gmp_default_allocate [/usr/local/lib/libgmp.so.10.4.1]
  206,183,191  /home/fredrik/src/flint2/fmpz_mod_poly.h:_fmpz_poly_hensel_lift_without_inverse
  191,480,072  /home/fredrik/src/flint2/mpfr_mat/clear.c:mpfr_mat_clear [/home/fredrik/src/flint2/libflint.so.15.0.0]
  186,485,501  /home/fredrik/src/flint2/fmpz_poly_factor/CLD_mat.c:_fmpz_poly_factor_CLD_mat [/home/fredrik/src/flint2/libflint.so.15.0.0]
  185,279,196  /build/glibc-S7xCS9/glibc-2.27/malloc/malloc.c:_int_malloc [/lib/x86_64-linux-gnu/libc-2.27.so]
  184,207,300  /home/fredrik/src/flint2/nmod_poly_factor/factor.c:nmod_poly_factor [/home/fredrik/src/flint2/libflint.so.15.0.0]
  184,207,062  /home/fredrik/src/flint2/nmod_poly_factor/factor.c:__nmod_poly_factor_deflation [/home/fredrik/src/flint2/libflint.so.15.0.0]
  184,110,008  /home/fredrik/src/flint2/nmod_poly_factor/factor.c:__nmod_poly_factor [/home/fredrik/src/flint2/libflint.so.15.0.0]
  180,200,252  /home/fredrik/src/flint2/nmod_poly_factor/factor_kaltofen_shoup.c:nmod_poly_factor_kaltofen_shoup [/home/fredrik/src/flint2/libflint.so.15.0.0]
  178,382,206  ???:__gmpz_gcd [/usr/local/lib/libgmp.so.10.4.1]
  172,210,052  /home/fredrik/src/flint2/fmpz/divexact.c:fmpz_divexact [/home/fredrik/src/flint2/libflint.so.15.0.0]
  168,372,995  /home/fredrik/src/flint2/fmpz/get_d_2exp.c:fmpz_get_d_2exp [/home/fredrik/src/flint2/libflint.so.15.0.0]
  167,773,571  /home/fredrik/src/flint2/fmpz_poly/hensel_lift_only_inverse.c:_fmpz_poly_hensel_lift_only_inverse [/home/fredrik/src/flint2/libflint.so.15.0.0]
  163,466,932  /home/fredrik/src/flint2/fmpz_lll/babai.c:fmpz_lll_advance_check_babai [/home/fredrik/src/flint2/libflint.so.15.0.0]
  145,262,392  ???:mpfr_clear [/usr/local/lib/libmpfr.so.6.0.1]
  141,838,998  /build/glibc-S7xCS9/glibc-2.27/malloc/malloc.c:free [/lib/x86_64-linux-gnu/libc-2.27.so]
  140,844,678  ???:__gmpn_gcd [/usr/local/lib/libgmp.so.10.4.1]
  137,518,677  /home/fredrik/src/flint2/d_vec/dot.c:_d_vec_dot [/home/fredrik/src/flint2/libflint.so.15.0.0]
  137,058,317  /home/fredrik/src/flint2/mpfr_mat.h:mpfr_mat_mul_classical
  136,906,073  /home/fredrik/src/flint2/fmpz_mod_poly/divrem_basecase.c:_fmpz_mod_poly_divrem_basecase [/home/fredrik/src/flint2/libflint.so.15.0.0]
  135,562,259  /home/fredrik/src/flint2/fmpz/fmpz.c:_fmpz_clear_mpz [/home/fredrik/src/flint2/libflint.so.15.0.0]
  133,926,098  ???:mpfr_free_func [/usr/local/lib/libmpfr.so.6.0.1]
  133,863,909  /home/fredrik/src/flint2/fmpz_poly/hensel_lift_without_inverse.c:fmpz_poly_hensel_lift_without_inverse [/home/fredrik/src/flint2/libflint.so.15.0.0]
  133,798,264  /home/fredrik/src/flint2/fmpz_poly/CLD_bound.c:fmpz_poly_CLD_bound [/home/fredrik/src/flint2/libflint.so.15.0.0]
  132,029,237  /home/fredrik/src/flint2/fmpz_mod_poly/divrem_divconquer_recursive.c:_fmpz_mod_poly_divrem_divconquer_recursive'2 [/home/fredrik/src/flint2/libflint.so.15.0.0]
  131,297,683  /home/fredrik/src/flint2/fmpz/fmpz.c:_fmpz_promote [/home/fredrik/src/flint2/libflint.so.15.0.0]
  131,182,846  /home/fredrik/src/flint2/nmod_poly_factor/factor_equal_deg_prob.c:nmod_poly_factor_equal_deg_prob [/home/fredrik/src/flint2/libflint.so.15.0.0]
  127,846,547  /home/fredrik/src/flint2/nmod_poly_factor/factor_equal_deg.c:nmod_poly_factor_equal_deg [/home/fredrik/src/flint2/libflint.so.15.0.0]
  125,168,951  /home/fredrik/src/flint2/fmpz/mul.c:fmpz_mul [/home/fredrik/src/flint2/libflint.so.15.0.0]
  118,883,207  /home/fredrik/src/flint2/fmpz_mod_poly.h:_fmpz_poly_hensel_lift_only_inverse
  118,275,641  ???:mpfr_sub1sp [/usr/local/lib/libmpfr.so.6.0.1]
  117,420,559  /home/fredrik/src/flint2/fmpz_mod_poly/mul.c:_fmpz_mod_poly_mul [/home/fredrik/src/flint2/libflint.so.15.0.0]
  117,089,217  /home/fredrik/src/flint2/fmpz_poly/mul.c:_fmpz_poly_mul [/home/fredrik/src/flint2/libflint.so.15.0.0]
  115,851,742  /home/fredrik/src/flint2/fmpz/sub.c:fmpz_sub [/home/fredrik/src/flint2/libflint.so.15.0.0]
  115,649,266  /home/fredrik/src/flint2/fmpq/mul_small.c:_fmpq_mul_small [/home/fredrik/src/flint2/libflint.so.15.0.0]
  114,441,468  /home/fredrik/src/flint2/fmpz/mod.c:fmpz_mod [/home/fredrik/src/flint2/libflint.so.15.0.0]
  114,350,562  /home/fredrik/src/flint2/fmpz_vec/scalar_submul_fmpz.c:_fmpz_vec_scalar_submul_fmpz [/home/fredrik/src/flint2/libflint.so.15.0.0]
  110,140,003  /home/fredrik/src/flint2/d_mat/mul_classical.c:d_mat_mul_classical'2 [/home/fredrik/src/flint2/libflint.so.15.0.0]
  106,472,544  /home/fredrik/src/flint2/nmod_poly/powmod_mpz_binexp_preinv.c:nmod_poly_powmod_mpz_binexp_preinv [/home/fredrik/src/flint2/libflint.so.15.0.0]
  106,147,182  /home/fredrik/src/flint2/nmod_poly/powmod_mpz_binexp_preinv.c:_nmod_poly_powmod_mpz_binexp_preinv [/home/fredrik/src/flint2/libflint.so.15.0.0]
  100,619,736  /build/glibc-S7xCS9/glibc-2.27/math/../sysdeps/ieee754/dbl-64/wordsize-64/s_scalbn.c:__scalbn [/lib/x86_64-linux-gnu/libc-2.27.so]
   98,620,407  /home/fredrik/src/flint2/fmpz/add.c:fmpz_add [/home/fredrik/src/flint2/libflint.so.15.0.0]
   96,565,416  ???:__gmpn_get_d [/usr/local/lib/libgmp.so.10.4.1]
   95,466,192  /home/fredrik/src/flint2/fmpz/fmpz.c:_fmpz_new_mpz [/home/fredrik/src/flint2/libflint.so.15.0.0]
   94,095,096  ???:__gmp_default_free [/usr/local/lib/libgmp.so.10.4.1]
   93,868,491  /home/fredrik/src/flint2/nmod_poly/divrem_newton_n_preinv.c:_nmod_poly_divrem_newton_n_preinv [/home/fredrik/src/flint2/libflint.so.15.0.0]
   91,004,272  ???:mpfr_sub [/usr/local/lib/libmpfr.so.6.0.1]
   84,916,250  /home/fredrik/src/flint2/fmpz/fmpz.c:_fmpz_demote_val [/home/fredrik/src/flint2/libflint.so.15.0.0]
   82,009,181  /home/fredrik/src/flint2/fmpz_vec/sub.c:_fmpz_vec_sub [/home/fredrik/src/flint2/libflint.so.15.0.0]
   80,759,707  /home/fredrik/src/flint2/fmpz_vec/add.c:_fmpz_vec_add [/home/fredrik/src/flint2/libflint.so.15.0.0]
   80,095,901  /home/fredrik/src/flint2/ulong_extras/gcd.c:n_gcd [/home/fredrik/src/flint2/libflint.so.15.0.0]

I get the impression that checking whether matrices are reduced in LLL is being done highly inefficiently.

fredrik-johansson commented 3 years ago

So fmpz_lll_is_reduced_with_removal tries these things in order:

The time drops to 0.68 seconds if I turn off the MPFR version. It looks like the MPFR version gets called with prec = 53, so it will presumably not be more accurate than the double version and its only purpose to avoid overflow, correct? A double+exponent version should be faster, and surely we should skip MPFR/double+exponent verification entirely if the matrix is small enough that is_reduced_d does not overflow?

That leaves the slow fallback is_reduced using fmpq arithmetic. Can we do something better here?

wbhart commented 3 years ago

As far as I know there is nothing better we can do. You might get some satisfaction playing with the LLL parameters. However, the parameters we use are the ones required in Stehle/Nguyen's paper. It'll work with other parameters, but I don't have a paper to back it up.

This is something I think you could profitably talk to Bill and Karim about, since they may have an idea. Our LLL should be competitive with Pari's, so there must be something else.

You are right that there seems to be no point calling the mpfr version when prec = 53.

Unfortunately I do not have a clue about any further improvements. You've reached the end of my knowledge on it. I implemented everything I know (other than van Hoeij/Novocin, which isn't really relevant here, as it is probably not going to be faster).

fredrik-johansson commented 3 years ago

Do we have some benchmark suite to check for improvements/regressions, other than the TEST_HARD polynomials in fmpz_poly_factor/test?

fredrik-johansson commented 3 years ago

In any case, there are two orthogonal issues here: using the deflation hack, and improving the LLL. The former can evidently speed up factorisations by 300x or more in some cases, the other evidently by 2x or more in some cases.

Where to do the deflation? At the top level or for each squarefree factor? Does it need to be combined with some other heuristics?

wbhart commented 3 years ago

No, TEST_HARD is the benchmark for factorisation regressions.

I would do the deflation at the top level. But of course one must still factor each of the factors you find after inflating them again. Something can be irreducible when deflated, but reducible when reinflated.

wbhart commented 3 years ago

You can get some idea what Pari/GP does by turning on debug level 5 I believe. Simply type \g5 at the console.

wbhart commented 3 years ago

This is mostly solved by #937 however we still haven't caught all the way up, so the ticket can remain open. See #932 for a list of ideas that might help.