Closed xanthe-cat closed 5 months ago
This documentation is close to being ready, but I suspect more of the content will need to be refined as ongoing testing is still turning up a number of oddities.
As a result of this testing I can foresee a slew of issues to raise, and I am wondering whether you would like them to be raised as separate issues:
fermat.cfg
having a poor timing statistic for 4K relative to 7K. One gross workaround would be to simply change the data in fermat.cfg
so that it is monotonically increasing, as described in the Mlucas README. Another issue, e.g. of the tiny 4K radix producing the CY routines with radix < 16 do not support shifted residues!
error, is similarly worked around by editing the fermat.cfg
file to use the preferred radix set (16 8 16).There’s also a variety of errors that crop up during the running of config-fermat.sh
which may be actual code errors versus numeric incompatibility of some FFT sizes with the number being tested.
In general, if you are able to, please do create a separate issue for each bug you find. That would be a huge help. Hopefully we will get volunteers to help debug and fix them all.
However, I believe I was able to address issues 2-3 and 5 (?) today on the forum. Issue 4 should be a trivial fix of just updating the message in those assert statements (assuming your findings are correct). That just leaves issue 1 as the remaining bug to be fixed. More testing would be needed to see if it affects Mersenne numbers. It may help to review the code and comments in get_fft_radices.c to determine how exactly the radices are currently selected.
I think this documentation is finally at the point where it can be added to the repository as docs/Fermat-testing.md
, since subsequent updates should be very few in number.
Having dealt with most of the glaring bug-fixes, a few still remain and others possibly lurk undiscovered. Of the known issues:
1K and 2K FFTs generate a peculiar error in br.c
about bit reversal and the size of the radix set being incorrect (when it doesn’t seem to be wrong, by the standards of other radix sets, at least to me).
ERROR: at line 257 of file ../src/br.c
Assertion failed: Exiting.
ERROR: product of radices [2] != vector length [1] in BIT_REVERSE_INT
Radix 8 causes problems with residue-shifted tests (and so occasionally do 7, 14, and 15), however Fermat testing shouldn’t be using non-zero residue shifts, as this effectively disables Gerbicz error checking.
CY routines with radix < 16 do not support shifted residues!
Amending the spuriously misleading ‘7,8,9,15 and their multiples!’ each of the 18 times it occurs in src
is low priority.
ERROR: at line 256 of file ../src/radix12_ditN_cy_dif1.c
Assertion failed: radix12_ditN_cy_dif1: Fermat-mod only available for radices 7,8,9,15 and their multiples!
Other warnings are more likely to be actual problems of the computation being unworkable at the selected FFT.
fermat_mod_square: Warning: 26072 of 61440 out-of-range output digits detected.
100 iterations of F20 with FFT length 61440 = 60 K, final residue shift count = 465037
For instance despite all best intentions sometimes a computation will exceed the safe round-off error (ROE).
F24 Roundoff warning on iteration 6, maxerr = 0.499902486801
The “sanity check” of mod-squaring modulo prime factors, prior to running the cofactor test, only outputs to stdout, when it perhaps ought to be written to f_m.stat
, since inequality is a definitive check that the Pépin residue is incorrect.
Computing 16383-squaring residue R (mod known prime q = 116928085873074369829035993834596371340386703423373313)
A: R == 52412083880604209639887539420692164580669760318107863 (mod q)
B: R == 52412083880604209639887539420692164580669760318107863 (mod q)
Having dealt with most of the glaring bug-fixes, a few still remain and others possibly lurk undiscovered.
Thanks again for testing everything and documenting all these remaining bugs. Free free to open issues for any of them you feel should be prioritized. It likely would require someone knowledgeable about Ernst's FFT and assembly code to fix them.
only outputs to stdout, when it perhaps ought to be written to
f_m.stat
, since inequality is a definitive check that the Pépin residue is incorrect.
There are several cases where it might be helpful to also write information to the *.stat
file. As a workaround, one could use the tee
command to save all the output to a file: ./Mlucas ... 2>&1 | tee -a Mlucas.out
. Alternatively, if one does not want to display the output, they could use the nohup
command as described in Ernst's README: nohup ./Mlucas ... >Mlucas.out &
.
Let me know if you would like to create a PR with this documentation or if you want me to just commit it directly to the repository. We could also add a link to the README for it. Alternatively, you could add it to the Wiki, which would allow you to directly edit it at any time, but that may not be as discoverable for users.
BTW, in case you did not already notice, I added your requested feature to my GIMPS status script a couple of days ago.
Hi Teal, as you may have seen there are a few more issues in the thread over at the forum (thanks for raising #13!) and I will add some of these shortly. I have a few further tweaks to make the .md above, and then once these are done I think you should be fine committing it to the repository. The last edit was two or three weeks ago so I will not touch it until I’m satisfied with all of the changes.
Yes, thanks in advance for doing that. As time permits, I have been trying to create issues for some of the critical show stopping bugs users report, but I know there are a lot more important bugs (as you have noted) that still need issues and further debugging.
OK, sounds like a plan. Please feel free to take as much time as you need on your documentation.
Updated draft is now in the branch looking at the modular division bug #23, to be added to the main branch whenever that is resolved.
Introduction to Fermat number testing with Mlucas
Mlucas is a powerful Mersenne primality testing and factoring package which supports Lucas–Lehmer testing, Fermat probable primality testing, and Pollard p–1 factorisation; justly, most applications of the software have concentrated on examination of the Mersenne numbers, however Mlucas is also capable of testing the primality of Fermat numbers, testing the compositeness of Fermat cofactors, and running p–1 factorisation on Fermat numbers. This post and others hope to fill in several lacunae in the Mlucas documentation with respect to Fermat number testing.
Mlucas build notes
The Fermat modular squaring code is architecture-specific, ideally requiring Mlucas to be built with one of the SIMD modes (AVX, AVX2, SSE2, etc). This should be handled automatically by the
makemake.sh
script for most hardware, including x86, however if Fermat modular squaring is not natively supported, there may be workarounds available.For example, Apple Silicon is ASIMD, meaning a native Mlucas build cannot test Fermat numbers. The answer is to build an SSE2 Mlucas on an Intel-based Mac, which should be capable of running under Rosetta emulation on Apple Silicon, at the penalty of a significant performance hit for larger exponents.
(When compiling on Intel with the intention of building a compatible version for Apple Silicon, the Makefile requires static linking of the GMP library, by the following alteration to the linker flags:
LDLIBS = -Bstatic ${LD_ARGS[@]} -Bdynamic
Static linking of libraries is usually not recommended, and this may not be a solution in all possible cases.)Fermat self-testing and populating fermat.cfg
Assuming you have a working build, Mlucas allows any Fermat number Fm = 22m + 1 with exponent m in the range [14,63] to be entered as an argument for self-testing, by using the
-f
flag like so:./Mlucas -f <exponent> -iters <number> [-fft <fft_length>]
However, the software only provides fast Fourier transforms (FFTs) up to 512M in length, which further limits the largest Fermat number that may be tested, up to F33.
The
-iters
flag is mandatory and should be followed by a number less than a million; the recommended values for-iters
to be set to are 100, 1000, or 10000, since the self-testing code has correct residues for all Fermat numbers [14,33] saved, and any error in the test result will be immediately discovered.Mlucas does not support Fermat testing for all possible FFT lengths. Generally, FFTs for testing Mersenne numbers are available for any length k·2n K, where k = [8,15], n ≥ 0. In the case of Fermat numbers however, only the subset k = {8, 14, 15} supports Fermat testing, along with k = 63 when n ≥ 4.
When self-testing Mlucas for Mersenne numbers, a command such as
./Mlucas -s all
configures Mlucas for the majority of FFT lengths and saves the results inmlucas.cfg
. A similar process must be carried out for the Fermat numbers, using the./Mlucas -f
command above, which saves results in the filefermat.cfg
.A script
config-fermat.sh
has been written to automate the generation of thefermat.cfg
file using the set of possible FFT lengths. After the license, the next few lines declare some variables which may be customised as desired (for example, 10000 iterations will require significant runtime at the largest possible FFT sizes):The script should be invoked from the build directory, typically
obj
, with the following command, which may include as parameters any cpu-specific options, such as-core
or-cpu
:Once self-testing has occurred, production work on Fermat numbers may be carried out using a
worktodo
file (Mlucas 20.x:worktodo.ini
; Mlucas 21:worktodo.txt
) with entries in one of the several formats below.Only assignments for ECM factoring of Fermat numbers are distributed by Primenet, so work assignments cannot be obtained for Fermat numbers using the
primenet.py
script. However p–1 results may be submitted to mersenne.org as Manual results.Primality testing: Pépin’s test
For the Pépin test, the
worktodo
format is simply:Fermat numbers in the range [14,22] may select a non-optimal FFT length by default in production mode. If this is the case, the FFT may be overridden using the command:
The optimal FFT lengths vary from 4K up to 512M as shown in the table below.
Currently, all Fermat numbers up to F30 have received a Pépin test, and moreover F31 and F32 are known to be composite. However, the character of their cofactors is unknown; and as for F33, this is yet to be tested for primality.
The Pépin test uses Gerbicz error checking (GEC) to assure the reliability of the computation, however residue shifting is not implemented for Fermat modular squaring with GEC. Thus when invoking Mlucas the flag
-shift 0
must be added to the Mlucas command line:If non-zero residue shifting is used, the Pépin test will not be able to progress beyond the millionth iteration (the default interval for GEC) as error checking will be unable to confirm whether or not the computation is correct.
Cofactor compositeness testing: Suyama’s test
Suyama’s test is a Fermat probable primality test on the cofactor of a Fermat number. As a prerequisite, you must already have run a Pépin test, which should have saved a final residue as a file named e.g.
f23
for a Pépin test of F23. Do not try to run a Suyama test as a single work type incorporating the preceding Pépin test.The
worktodo
format for the Suyama test is:Note that this work format uses the numeric value of 2m of a Fermat number Fm, rather than just the exponent m. At least one known prime factor must be supplied; composite factors are disallowed.
Prior to testing, Mlucas tries a “sanity check” on each known factor, which has the effect of testing whether the Pépin residue R has been correctly calculated. This calculation can be performed at any point up to the final iteration. More information is available here.
Pollard p–1 factoring
Work entries for p–1 factoring have two variations, however only the
Pminus1
format is supported for Fermat numbers:The same note as regards the exponent for the Suyama test applies here also; supplying known factors is optional. The
-shift 0
flag is again a necessary addition to the Mlucas command line.The testable Fermat numbers Fm = 22m + 1
The following table lists the default FFT lengths selected by Mlucas for the Fermat numbers in the range [14,33] and the known factors, required for some of the test types above.
"116928085873074369829035993834596371340386703423373313"
"1214251009,2327042503868417,168768817029516972383024127016961"
"825753601,188981757975021318420037633"
"31065037602817,7751061099802522589358967058392886922693580423169"
"13631489,81274690703860512587777"
"70525124609,646730219521,37590055514133754286524446080499713"
"4485296422913"
"64658705994591851009055774868504577"
"167772161"
"25991531462657,204393464266227713,2170072644496392193"
"76861124116481"
"151413703311361,231292694251438081"
"1766730974551267606529"
"2405286912458753"
"640126220763137,1095981164658689"
"46931635677864055013377"
"25409026523137"
* These default FFTs selected by Mlucas have radices that cannot be used for Fermat modular squaring; thus F16 to F22 cannot be tested with Mlucas without overriding the default FFT selected, e.g.:
As mentioned under “Mlucas build notes” above, building Mlucas is highly architecture-specific, and some build types will not support all of the optimal FFT lengths. Finally, 4K appears to be the smallest usable FFT for the three smallest testable Fermat numbers.