Introduction to Fermat number testing with Mlucas

Mlucas is a powerful Mersenne primality testing and factoring package which supports Lucas–Lehmer testing, Fermat probable primality testing, and Pollard p–1 factorisation; justly, most applications of the software have concentrated on examination of the Mersenne numbers, however Mlucas is also capable of testing the primality of Fermat numbers, testing the compositeness of Fermat cofactors, and running p–1 factorisation on Fermat numbers. This post and others hope to fill in several lacunae in the Mlucas documentation with respect to Fermat number testing.

Mlucas build notes

The Fermat modular squaring code is architecture-specific, ideally requiring Mlucas to be built with one of the SIMD modes (AVX, AVX2, SSE2, etc). This should be handled automatically by the makemake.sh script for most hardware, including x86, however if Fermat modular squaring is not natively supported, there may be workarounds available.

For example, Apple Silicon is ASIMD, meaning a native Mlucas build cannot test Fermat numbers. The answer is to build an SSE2 Mlucas on an Intel-based Mac, which should be capable of running under Rosetta emulation on Apple Silicon, at the penalty of a significant performance hit for larger exponents.

(When compiling on Intel with the intention of building a compatible version for Apple Silicon, the Makefile requires static linking of the GMP library, by the following alteration to the linker flags: LDLIBS = -Bstatic ${LD_ARGS[@]} -Bdynamic Static linking of libraries is usually not recommended, and this may not be a solution in all possible cases.)

Fermat self-testing and populating fermat.cfg

Assuming you have a working build, Mlucas allows any Fermat number F_m = 2^{2^m} + 1 with exponent m in the range [14,63] to be entered as an argument for self-testing, by using the -f flag like so:

./Mlucas -f <exponent> -iters <number> [-fft <fft_length>]

However, the software only provides fast Fourier transforms (FFTs) up to 512M in length, which further limits the largest Fermat number that may be tested, up to F₃₃.

The -iters flag is mandatory and should be followed by a number less than a million; the recommended values for -iters to be set to are 100, 1000, or 10000, since the self-testing code has correct residues for all Fermat numbers [14,33] saved, and any error in the test result will be immediately discovered.

Mlucas does not support Fermat testing for all possible FFT lengths. Generally, FFTs for testing Mersenne numbers are available for any length k·2ⁿ K, where k = [8,15], n ≥ 0. In the case of Fermat numbers however, only the subset k = {8, 14, 15} supports Fermat testing, along with k = 63 when n ≥ 4.

When self-testing Mlucas for Mersenne numbers, a command such as ./Mlucas -s all configures Mlucas for the majority of FFT lengths and saves the results in mlucas.cfg. A similar process must be carried out for the Fermat numbers, using the ./Mlucas -f command above, which saves results in the file fermat.cfg.

A script config-fermat.sh has been written to automate the generation of the fermat.cfg file using the set of possible FFT lengths. After the license, the next few lines declare some variables which may be customised as desired (for example, 10000 iterations will require significant runtime at the largest possible FFT sizes):

# Mlucas
MLUCAS=./Mlucas

# Number of iterations
# use 100, 1000, or 10000 to match pre-computed values
ITERS=100

# Minimum Fermat number (14 or greater)
MIN=14

# Maximum Fermat number (33 or less)
MAX=29

The script should be invoked from the build directory, typically obj, with the following command, which may include as parameters any cpu-specific options, such as -core or -cpu:

../config-fermat.sh -cpu 0:3

Once self-testing has occurred, production work on Fermat numbers may be carried out using a worktodo file (Mlucas 20.x: worktodo.ini; Mlucas 21: worktodo.txt) with entries in one of the several formats below.

Only assignments for ECM factoring of Fermat numbers are distributed by Primenet, so work assignments cannot be obtained for Fermat numbers using the primenet.py script. However p–1 results may be submitted to mersenne.org as Manual results.

Primality testing: Pépin’s test

For the Pépin test, the worktodo format is simply:

Fermat,Test=<exponent>

Fermat numbers in the range [14,22] may select a non-optimal FFT length by default in production mode. If this is the case, the FFT may be overridden using the command:

./Mlucas -fft FFT_LENGTH -shift 0

The optimal FFT lengths vary from 4K up to 512M as shown in the table below.

Currently, all Fermat numbers up to F₃₀ have received a Pépin test, and moreover F₃₁ and F₃₂ are known to be composite. However, the character of their cofactors is unknown; and as for F₃₃, this is yet to be tested for primality.

The Pépin test uses Gerbicz error checking (GEC) to assure the reliability of the computation, however residue shifting is not implemented for Fermat modular squaring with GEC. Thus when invoking Mlucas the flag -shift 0 must be added to the Mlucas command line:

./Mlucas -shift 0

If non-zero residue shifting is used, the Pépin test will not be able to progress beyond the millionth iteration (the default interval for GEC) as error checking will be unable to confirm whether or not the computation is correct.

Cofactor compositeness testing: Suyama’s test

Suyama’s test is a Fermat probable primality test on the cofactor of a Fermat number. As a prerequisite, you must already have run a Pépin test, which should have saved a final residue as a file named e.g. f23 for a Pépin test of F₂₃. Do not try to run a Suyama test as a single work type incorporating the preceding Pépin test.

The worktodo format for the Suyama test is:

PRP=N/A,1,2,<2^m>,+1,99,0,3,5,"known_factor_1[,known_factor_2,...]"

Note that this work format uses the numeric value of 2^m of a Fermat number F_m, rather than just the exponent m. At least one known prime factor must be supplied; composite factors are disallowed.

Prior to testing, Mlucas tries a “sanity check” on each known factor, which has the effect of testing whether the Pépin residue R has been correctly calculated. This calculation can be performed at any point up to the final iteration. More information is available here.

Pollard p–1 factoring

Work entries for p–1 factoring have two variations, however only the Pminus1 format is supported for Fermat numbers:

Pminus1=N/A,1,2,<2^m>,+1,B1,B2[,trial_factoring_bits][,B2_start][,"known_factors"]

The same note as regards the exponent for the Suyama test applies here also; supplying known factors is optional. The -shift 0 flag is again a necessary addition to the Mlucas command line.

The testable Fermat numbers F_m = 2^{2^m} + 1

The following table lists the default FFT lengths selected by Mlucas for the Fermat numbers in the range [14,33] and the known factors, required for some of the test types above.

m	2^m	Default FFT	Optimal FFTs	known_factor(s)
14	16384	1K	4K	`"116928085873074369829035993834596371340386703423373313"`
15	32768	2K	4K	`"1214251009,2327042503868417,168768817029516972383024127016961"`
16	65536	3K*	4K	`"825753601,188981757975021318420037633"`
17	131072	6K*, 7K	7K, 8K	`"31065037602817,7751061099802522589358967058392886922693580423169"`
18	262144	13K*	14K, 15K, 16K	`"13631489,81274690703860512587777"`
19	524288	26K*	28K, 30K, 32K	`"70525124609,646730219521,37590055514133754286524446080499713"`
20	1048576	52K*	56K, 60K, 64K
21	2097152	104K*	112K, 120K, 128K	`"4485296422913"`
22	4194304	208K*	224K, 240K, 256K	`"64658705994591851009055774868504577"`
23	8388608	448K	448K, 480K, 512K	`"167772161"`
24	16777216	896K	896K, 960K, 1008K, 1M
25	33554432	1792K	1792K, 1920K, 2M	`"25991531462657,204393464266227713,2170072644496392193"`
26	67108864	3584K	3584K, 3840K, 4032K, 4M	`"76861124116481"`
27	134217728	7M, 7.5M	7M, 7.5M, 7.875M, 8M	`"151413703311361,231292694251438081"`
28	268435456	15M	14M, 15M, 15.75M, 16M	`"1766730974551267606529"`
29	536870912	30M	30M, 31.5M, 32M	`"2405286912458753"`
30	1073741824	60M	60M, 63M, 64M	`"640126220763137,1095981164658689"`
31	2147483648	120M	120M, 126M, 128M	`"46931635677864055013377"`
32	4294967296	256M	240M, 252M, 256M	`"25409026523137"`
33	8589934592	512M	504M, 512M

* These default FFTs selected by Mlucas have radices that cannot be used for Fermat modular squaring; thus F₁₆ to F₂₂ cannot be tested with Mlucas without overriding the default FFT selected, e.g.:

./Mlucas -fft 224K -shift 0

As mentioned under “Mlucas build notes” above, building Mlucas is highly architecture-specific, and some build types will not support all of the optimal FFT lengths. Finally, 4K appears to be the smallest usable FFT for the three smallest testable Fermat numbers.

This documentation is close to being ready, but I suspect more of the content will need to be refined as ongoing testing is still turning up a number of oddities.

As a result of this testing I can foresee a slew of issues to raise, and I am wondering whether you would like them to be raised as separate issues:

in production mode the FFT selection code for Fermat numbers 16 and [18,22] chooses unworkable lengths because of the limited radices supporting Fermats, which may also affect Mersennes of similarly small size (I can see reasons pro and contra); — issue #3 raised; the workaround seems to be successful for these Fermats. ETA: the documentation describes the workaround needed.
~~in production mode this default FFT then cannot be over-ridden in these cases (why?);~~ solved; mostly, except for F₁₄ which uses a ridiculous FFT of 7K on AVX2 (this is also the case for F₁₅ and F₁₆ but 7K is apparently workable). On SSE2 Mlucas can use 4K for these tiny exponents. ETA: the problem of the 7K FFT length is due to the fermat.cfg having a poor timing statistic for 4K relative to 7K. One gross workaround would be to simply change the data in fermat.cfg so that it is monotonically increasing, as described in the Mlucas README. Another issue, e.g. of the tiny 4K radix producing the CY routines with radix < 16 do not support shifted residues! error, is similarly worked around by editing the fermat.cfg file to use the preferred radix set (16 8 16).
again in production mode, the Pépin test for the cases of 17 and 23 doesn’t save the final residue at 2^m–1 iterations to the fm and qm save files, so the most recent checkpoint is unusable for further testing; — issue #4 raised; none of the completed Pépin tests for [15,20] and 23 saved. ETA: this is now fixed.
the various radix files have incorrect/misleading ‘7,8,9,15 and their multiples!’ errors about radices supporting Fermat modular squaring (the correct values are evidently 7,8,15,63);
Suyama test appears as though it cannot be carried out correctly, as a result of 3. — this will probably be trivial if issue #4 is solved; though it may be a separate bug relating to restarting an interrupted PRP test, so I’m waiting to see the outcome of solving one problem before raising this as well. ETA: there was one other bug preventing the Suyama test from correctly being carried out (the PRP base issue #5) which is now resolved.

There’s also a variety of errors that crop up during the running of config-fermat.sh which may be actual code errors versus numeric incompatibility of some FFT sizes with the number being tested.

In general, if you are able to, please do create a separate issue for each bug you find. That would be a huge help. Hopefully we will get volunteers to help debug and fix them all.

However, I believe I was able to address issues 2-3 and 5 (?) today on the forum. Issue 4 should be a trivial fix of just updating the message in those assert statements (assuming your findings are correct). That just leaves issue 1 as the remaining bug to be fixed. More testing would be needed to see if it affects Mersenne numbers. It may help to review the code and comments in get_fft_radices.c to determine how exactly the radices are currently selected.

I think this documentation is finally at the point where it can be added to the repository as docs/Fermat-testing.md, since subsequent updates should be very few in number.

Having dealt with most of the glaring bug-fixes, a few still remain and others possibly lurk undiscovered. Of the known issues: 1K and 2K FFTs generate a peculiar error in br.c about bit reversal and the size of the radix set being incorrect (when it doesn’t seem to be wrong, by the standards of other radix sets, at least to me).

ERROR: at line 257 of file ../src/br.c
Assertion failed: Exiting.
ERROR: product of radices [2] != vector length [1] in BIT_REVERSE_INT

Radix 8 causes problems with residue-shifted tests (and so occasionally do 7, 14, and 15), however Fermat testing shouldn’t be using non-zero residue shifts, as this effectively disables Gerbicz error checking.

CY routines with radix < 16 do not support shifted residues!

Amending the spuriously misleading ‘7,8,9,15 and their multiples!’ each of the 18 times it occurs in src is low priority.

ERROR: at line 256 of file ../src/radix12_ditN_cy_dif1.c
Assertion failed: radix12_ditN_cy_dif1: Fermat-mod only available for radices 7,8,9,15 and their multiples!

Other warnings are more likely to be actual problems of the computation being unworkable at the selected FFT.

fermat_mod_square: Warning: 26072 of 61440 out-of-range output digits detected.
100 iterations of F20 with FFT length 61440 = 60 K, final residue shift count = 465037

For instance despite all best intentions sometimes a computation will exceed the safe round-off error (ROE).

F24 Roundoff warning on iteration        6, maxerr =   0.499902486801

The “sanity check” of mod-squaring modulo prime factors, prior to running the cofactor test, only outputs to stdout, when it perhaps ought to be written to f_m.stat, since inequality is a definitive check that the Pépin residue is incorrect.

Computing 16383-squaring residue R (mod known prime q = 116928085873074369829035993834596371340386703423373313)
    A: R == 52412083880604209639887539420692164580669760318107863 (mod q)
    B: R == 52412083880604209639887539420692164580669760318107863 (mod q)

Having dealt with most of the glaring bug-fixes, a few still remain and others possibly lurk undiscovered.

Thanks again for testing everything and documenting all these remaining bugs. Free free to open issues for any of them you feel should be prioritized. It likely would require someone knowledgeable about Ernst's FFT and assembly code to fix them.

only outputs to stdout, when it perhaps ought to be written to f_m.stat, since inequality is a definitive check that the Pépin residue is incorrect.

There are several cases where it might be helpful to also write information to the *.stat file. As a workaround, one could use the tee command to save all the output to a file: ./Mlucas ... 2>&1 | tee -a Mlucas.out. Alternatively, if one does not want to display the output, they could use the nohup command as described in Ernst's README: nohup ./Mlucas ... >Mlucas.out &.

Let me know if you would like to create a PR with this documentation or if you want me to just commit it directly to the repository. We could also add a link to the README for it. Alternatively, you could add it to the Wiki, which would allow you to directly edit it at any time, but that may not be as discoverable for users.

BTW, in case you did not already notice, I added your requested feature to my GIMPS status script a couple of days ago.

Hi Teal, as you may have seen there are a few more issues in the thread over at the forum (thanks for raising #13!) and I will add some of these shortly. I have a few further tweaks to make the .md above, and then once these are done I think you should be fine committing it to the repository. The last edit was two or three weeks ago so I will not touch it until I’m satisfied with all of the changes.

Yes, thanks in advance for doing that. As time permits, I have been trying to create issues for some of the critical show stopping bugs users report, but I know there are a lot more important bugs (as you have noted) that still need issues and further debugging.

OK, sounds like a plan. Please feel free to take as much time as you need on your documentation.

Updated draft is now in the branch looking at the modular division bug #23, to be added to the main branch whenever that is resolved.

primesearch / Mlucas

Fermat-testing.md #1

Introduction to Fermat number testing with Mlucas

Mlucas build notes

Fermat self-testing and populating fermat.cfg

Primality testing: Pépin’s test

Cofactor compositeness testing: Suyama’s test

Pollard p–1 factoring

The testable Fermat numbers F_m = 2^{2^m} + 1

primesearch / Mlucas

Fermat-testing.md #1

Introduction to Fermat number testing with Mlucas

Mlucas build notes

Fermat self-testing and populating fermat.cfg

Primality testing: Pépin’s test

Cofactor compositeness testing: Suyama’s test

Pollard p–1 factoring

The testable Fermat numbers Fm = 22m + 1

The testable Fermat numbers F_m = 2^{2^m} + 1