Bdot42 / mfakto

Mersenne number trial factoring using OpenCL, primarily for GIMPS: Great Internet Mersenne Prime Search
http://mersennewiki.org/index.php/Mfakto
GNU General Public License v3.0

Add support for Radeon Pro 560X #43

Open proski opened 3 months ago

proski commented 3 months ago

Originally posted at https://www.mersenneforum.org/node/11037?p=1052608#post1052608

Moving here for a more focused discussion.

I was able to test the latest mfakto on an iMac (21.5 inch, 2019) with a "Radeon Pro 560X 4 GB", running the latest macOS Sonoma. From the mfakto output:

OpenCL device info
name AMD Radeon Pro 560X Compute Engine (AMD)
device (driver) version OpenCL 1.2 (1.2 (Jul 29 2024 21:13:29))
maximum threads per block 256
maximum threads per grid 16777216
number of multiprocessors 16 (1024 compute elements)
clock rate 1004 MHz

Note that the name is different from typical names seen in set_gpu_type() in src/mfakto.cpp - no code name, no gfx.

The testing below was done after applying #39, which might fix errors with double precision.

The following GPUType values pass the short self-test: VLIW5, GCN3, GCNF, APU, CPU, NVIDIA and INTEL. All others (VLIW4, GCN, GCN2, GCN4, GCN5, RDNA) fail with:

ERROR: self-test failed for M51916901 (cl_barrett15_74_gs)
no factor found

Commenting out BARRETT74_MUL15 in the corresponding section in src/mfaktc.c fixes the failure. It appears that BARRETT74_MUL15 is not working on that GPU.

Setting SieveOnGPU=0 makes no difference except that the kernel name is cl_barrett15_74 without _gs. SmallExp=1 makes no difference either.

Double precision is supported, but slow. With GPUType=GCNF factoring M149691821 from 2^76 to 2^77 gives 193.22 GHz-d/day. The same configuration with -DSLOW_DP added in OCLCompileOptions gives 201.94 GHz-d/day.
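To put the two throughput figures side by side, here is a small sketch that computes the relative gain from -DSLOW_DP. The numbers are the ones quoted above; the helper name relative_gain is just for illustration. This is a single measurement per configuration, so run-to-run noise is not accounted for.

```python
# Throughput figures quoted above, in GHz-d/day
# (GPUType=GCNF, M149691821, 2^76 to 2^77).
baseline = 193.22      # without -DSLOW_DP
with_slow_dp = 201.94  # with -DSLOW_DP added to OCLCompileOptions


def relative_gain(before: float, after: float) -> float:
    """Return the fractional change going from `before` to `after`."""
    return (after - before) / before


gain = relative_gain(baseline, with_slow_dp)
print(f"-DSLOW_DP gain: {gain:.1%}")  # prints "-DSLOW_DP gain: 4.5%"
```

So -DSLOW_DP is worth roughly 4.5% on this GPU in this one test.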

./mfakto -st passed successfully for GPUType=GCNF with and without -DSLOW_DP.

I'll post the results of ./mfakto -st2 when I have a chance to run it.

Should this GPU be added to an existing GPU type (like GCNF) or be a separate type? How much do we care about the BARRETT74_MUL15 failure? Should it be specifically excluded for that GPU?

proski commented 3 months ago

Results of ./mfakto -st2:

Self-test statistics
  number of tests           335250
  successful tests          303138
  no factor found           32112

self-test FAILED!

ERROR: self-test failed, exiting.

I see some failures from BARRETT74_MUL15, which is expected. Unfortunately, there is no per-kernel summary. I'll need to re-run the test and capture its output to make sure no other kernels fail.
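Since there is no per-kernel summary, one way to get it is to tally the failure lines from a captured log. The sketch below assumes the -st2 output reports each failure in the same format as the short self-test line quoted earlier ("ERROR: self-test failed for M... (kernel_name)"). The function name and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   ERROR: self-test failed for M51916901 (cl_barrett15_74_gs)
# Assumption: -st2 prints failures in this same format.
FAILURE_RE = re.compile(r"self-test failed for M(\d+) \((\w+)\)")


def failures_per_kernel(lines):
    """Count self-test failure lines per kernel name."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# Illustrative input only, not real captured output.
sample = [
    "ERROR: self-test failed for M51916901 (cl_barrett15_74_gs)",
    "no factor found",
    "ERROR: self-test failed for M51916901 (cl_barrett15_74_gs)",
]
print(failures_per_kernel(sample))
```

With the output redirected to a file (for example ./mfakto -st2 > st2.log 2>&1), the same function can be called on the open file object, since it only iterates over lines.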

proski commented 3 months ago

Confirmed that all errors are from cl_barrett15_74_gs. Interestingly, the number of "no factor found" errors is now 32100, 12 fewer than in the previous run. There were 78 successes from cl_barrett15_74_gs.
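For a sense of scale, here is a quick check of the numbers reported in this thread. It assumes that the 78 successes and the 32100 failures come from the same run and together make up every cl_barrett15_74_gs test in that run.

```python
# Figures reported in this thread.
total_tests = 335250
successful = 303138
no_factor_first_run = 32112
no_factor_second_run = 32100
barrett15_74_gs_successes = 78  # reported for the second run

# The first run's summary adds up.
assert successful + no_factor_first_run == total_tests

overall_failure_rate = no_factor_first_run / total_tests

# Assumption: all second-run failures and the 78 successes together
# are the full set of cl_barrett15_74_gs tests in that run.
kernel_tests = no_factor_second_run + barrett15_74_gs_successes
kernel_failure_rate = no_factor_second_run / kernel_tests

print(f"overall failure rate (first run): {overall_failure_rate:.2%}")
print(f"cl_barrett15_74_gs failure rate (second run): {kernel_failure_rate:.2%}")
```

Under that assumption, about 9.58% of all tests fail, while cl_barrett15_74_gs fails about 99.76% of its own tests. The kernel is almost entirely broken on this GPU rather than occasionally wrong.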