Closed tkoskela closed 2 years ago
qcd1
You will have to follow the instructions in Dump_memory_guide.txt to dump the memory before and after the single-precision Dirac-Wilson operator. I'm fairly sure that the line numbers I specify there apply to code compiled with the "-g" option and without any intrinsics, since compiling the CPU code without intrinsics strips all those preprocessor directives and macros. You will also need to specify the local lattice sizes in each dimension (L0, L1, L2, L3), and you need to run it on a single core.
I think I used the branch feature/library to get the binary files from the CPU version. From what I remember, one of the differences in this branch is that it lets you specify the size of the simulation at runtime rather than at compile time, via an input file (see DYNAMIC_SIZES). I've attached a sample compile_settings.txt and an input.in file here. You will have to manually create the log/, dat/ and cnfg/ directories in the path specified in the input.in file (the paths are either relative to input.in or absolute).
Then use gdb to dump the specified variables (piup, pidn, s, r, u, m). Save them as:
piup-L0-L1-L2-L3 pidn-L0-L1-L2-L3 sp-s-L0-L1-L2-L3 sp-r-L0-L1-L2-L3 sp-u-L0-L1-L2-L3 sp-m-L0-L1-L2-L3
r is the output, so you should dump it after the loop. Those files will be read in the CUDA version here.
For example, if you used L0=L1=L2=L3=16 the files should be:
piup-16-16-16-16 pidn-16-16-16-16 sp-s-16-16-16-16 sp-r-16-16-16-16 sp-u-16-16-16-16 sp-m-16-16-16-16
When you run the CUDA version, run it with:
executable L0 L1 L2 L3 /path/to/the/files
e.g. executable 16 16 16 16 /path/to/the/files
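As a quick cross-check of the naming convention above, a minimal Python sketch can generate the expected file names and the CUDA test command line for a given local lattice. The helper names reference_file_names and cuda_test_argv are made up for illustration, and "executable" stands in for the real binary name, as it does in the text.

```python
def reference_file_names(l0, l1, l2, l3):
    """Return the six file names expected for this local lattice size."""
    suffix = f"{l0}-{l1}-{l2}-{l3}"
    # piup and pidn carry no prefix; s, r, u and m are saved with an "sp-" prefix.
    stems = ["piup", "pidn", "sp-s", "sp-r", "sp-u", "sp-m"]
    return [f"{stem}-{suffix}" for stem in stems]


def cuda_test_argv(executable, l0, l1, l2, l3, data_dir):
    """Build the argument list: executable L0 L1 L2 L3 /path/to/the/files."""
    return [executable, str(l0), str(l1), str(l2), str(l3), data_dir]
```

Calling reference_file_names(16, 16, 16, 16) reproduces the 16-16-16-16 list given above, which makes it easy to spot a misnamed file before running the CUDA test.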
Done in https://gitlab.com/fastsum/openqcd-fastsum/-/blob/feature/cuda_tests/tests/cuda2/main.c#L65-81
Makis' input files: compile_settings.txt, input.in.txt
Set CFLAGS in compile_settings.txt to have at least -O0 -g and no intrinsics (AVX, QPX, etc.)
Use mpicc as the compiler
make all
Check that the cnfg, dat, and log directories exist in the run directory and are empty
Run with input.in
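The directory step can be scripted. The following is a minimal Python sketch, assuming the paths in input.in resolve to the run directory (passed in as base). The function name prepare_run_dirs is hypothetical and not part of the repository. It creates any missing directory and reports those that already contain files, rather than deleting anything.

```python
import os


def prepare_run_dirs(base, names=("log", "dat", "cnfg")):
    """Create the run directories under `base`; return those that are not empty."""
    non_empty = []
    for name in names:
        path = os.path.join(base, name)
        os.makedirs(path, exist_ok=True)  # no error if the directory already exists
        if os.listdir(path):
            non_empty.append(path)
    return non_empty
```

Calling prepare_run_dirs(".") from the run directory returns an empty list when everything is ready. Leftover files are only listed, so clearing them remains a deliberate manual step.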
(base) [dc-kosk1@login-e-14 run]$ gdb ../build/qcd1
(gdb) break Dw.c:1392
Breakpoint 1 at 0x48c64a: file ../modules/dirac/Dw.c, line 1392.
(gdb) break Dw.c:1466
Breakpoint 2 at 0x48d22d: file ../modules/dirac/Dw.c, line 1466.
(gdb) r -i input.in
Starting program: /home/dc-kosk1/git_repos/openqcd-fastsum/run/../build/qcd1 -i input.in
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Breakpoint 1, openqcd_dirac__Dw (mu=0, s=0x75bb80, r=0x767b80) at ../modules/dirac/Dw.c:1392
1392 if (((cpr[0] == 0) && (bc != 3)) || ((cpr[0] == (NPROC0 - 1)) && (bc == 0))) {
Missing separate debuginfos, use: debuginfo-install glibc-2.17-322.el7_9.x86_64 libgcc-4.8.5-44.el7.x86_64 numactl-devel-2.0.12-5.el7.x86_64
(gdb) dump binary value VOLUME.bin openqcd__VOLUME
(gdb) dump binary value mu.bin mu
(gdb) dump binary memory piup.bin piup piup+openqcd__VOLUME*2
(gdb) dump binary memory pidn.bin pidn pidn+openqcd__VOLUME*2
(gdb) dump binary memory s.bin s s+openqcd__VOLUME
(gdb) dump binary memory r.bin r r+openqcd__VOLUME
(gdb) dump binary memory u.bin u u+openqcd__VOLUME*4
(gdb) dump binary memory m.bin m m+openqcd__VOLUME
(gdb) c
Continuing.
Breakpoint 2, openqcd_dirac__Dw (mu=0, s=0x75bb80, r=0x767b80) at ../modules/dirac/Dw.c:1466
1466 cps_ext_bnd(0x1, r);
(gdb) dump binary memory r2.bin r r+openqcd__VOLUME
(gdb) q
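Before copying the dumps, it may be worth checking that they are consistent with the lattice size. The sketch below is an illustration, not something from the repository. It assumes a little-endian host (as on the x86_64 login node in the session above) and that, for a single-process run, openqcd__VOLUME equals L0*L1*L2*L3. It reads VOLUME.bin, compares it with the product of the sizes, and checks that each dump is a whole number of bytes per lattice site. The function names are hypothetical.

```python
import os


def read_dumped_int(path):
    """Read an integer written by gdb's `dump binary value` (raw little-endian bytes)."""
    with open(path, "rb") as f:
        data = f.read()
    return int.from_bytes(data, byteorder="little", signed=True)


def check_dumps(directory, sizes, names=("piup", "pidn", "s", "r", "u", "m", "r2")):
    """Check VOLUME.bin against `sizes` and return {name: bytes per lattice site}."""
    expected = sizes[0] * sizes[1] * sizes[2] * sizes[3]
    volume = read_dumped_int(os.path.join(directory, "VOLUME.bin"))
    if volume != expected:
        raise ValueError(f"VOLUME.bin holds {volume}, expected {expected}")
    per_site = {}
    for name in names:
        nbytes = os.path.getsize(os.path.join(directory, name + ".bin"))
        # Every dump spans a multiple of VOLUME elements, so the size must divide evenly.
        if nbytes % volume != 0:
            raise ValueError(f"{name}.bin: {nbytes} bytes is not a multiple of VOLUME")
        per_site[name] = nbytes // volume
    return per_site
```

The returned bytes-per-site values depend on the struct layouts in the build, so they are reported rather than compared against hard-coded numbers.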
cp *.bin ~/rds/rds-dirac-dr004/openqcd/reference_data/
cd ~/rds/rds-dirac-dr004/openqcd/reference_data/
rename-bin-files.sh (pass L0 L1 L2 L3 as command line arguments)
break Dw.c:1392
break Dw.c:1466
r -i input.in.txt
dump binary memory piup.bin piup piup+openqcd__VOLUME*2
dump binary memory pidn.bin pidn pidn+openqcd__VOLUME*2
dump binary memory s.bin s s+openqcd__VOLUME
dump binary memory u.bin u u+openqcd__VOLUME*4
dump binary memory m.bin m m+openqcd__VOLUME
c
dump binary memory r.bin r r+openqcd__VOLUME
q
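The command list above can be run non-interactively. As a minimal sketch, the Python below writes those commands to a file and builds a gdb invocation using the standard -batch and -x options. The file name dump_dw.gdb and the helper names are hypothetical, and the breakpoint line numbers are only valid for a build matching the one used in this issue.

```python
GDB_COMMANDS = """\
break Dw.c:1392
break Dw.c:1466
r -i input.in.txt
dump binary memory piup.bin piup piup+openqcd__VOLUME*2
dump binary memory pidn.bin pidn pidn+openqcd__VOLUME*2
dump binary memory s.bin s s+openqcd__VOLUME
dump binary memory u.bin u u+openqcd__VOLUME*4
dump binary memory m.bin m m+openqcd__VOLUME
c
dump binary memory r.bin r r+openqcd__VOLUME
q
"""


def write_gdb_script(path="dump_dw.gdb"):
    """Write the gdb command list to `path` and return the path."""
    with open(path, "w") as f:
        f.write(GDB_COMMANDS)
    return path


def gdb_argv(binary, script):
    """Build the command line; -batch makes gdb exit after running the -x file."""
    return ["gdb", "-batch", "-x", script, binary]
```

From the run directory, passing gdb_argv("../build/qcd1", write_gdb_script()) to subprocess.run would reproduce the session. In this variant r.bin holds the output spinor, because it is dumped only at the second breakpoint.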
Create a markdown doc in the repo