Closed mathiaswagner closed 8 years ago
Some insight from gdb, with a breakpoint at the checkGauge call:
#0 invertQuda (hp_x=0x2524b1b0, hp_b=0x2525a4c0, param=0x7fffffffcb90) at /home/mathias/qudagit/lib/interface_quda.cpp:2194
#1 0x00000000004698f0 in qudaInvert (external_precision=2, quda_precision=1, mass=0.0018, inv_args=..., target_residual=9.9999999999999995e-07, target_fermilab_residual=0, fatlink=0xce15300,
longlink=0xcecb710, tadpole=0.89000000000000001, source=0x2525a4c0, solution=0x2524b1b0, final_residual=0x7fffffffe270, final_fermilab_residual=0x7fffffffe278, num_iters=0x7fffffffe25c)
at /home/mathias/qudagit/lib/milc_interface.cpp:995
#2 0x00000000004551e7 in ks_congrad_parity_gpu ()
#3 0x0000000000457fda in ks_congrad_field ()
#4 0x0000000000457061 in mat_invert_uml_field ()
#5 0x000000000042bc24 in f_meas_imp_field ()
#6 0x0000000000405da9 in main ()
Looking into the milc_interface invert call
(gdb) print invalidate_quda_gauge
$1 = false
(gdb) print create_quda_gauge
$2 = true
So at line 977 we do not load a gauge field. However, since the resident gauge field is double precision and the request is for a single-precision inversion, we hit the error later. It looks like we need to add another check for reloading the gauge field, at least here and maybe in other places as well.
The error occurs when calculating pbp. Forcing an invalidateGauge in qudaInvert fixes the issue.
Not sure how we can best detect that this is needed.
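One possible way to detect it is to compare the precision of the resident gauge field with the precision requested for the solve, and to reload on a mismatch. The following is only a minimal sketch of that idea under stated assumptions: `Precision`, `ResidentGauge`, `needsGaugeReload` and `scenarioFromThisIssue` are hypothetical names made up for illustration, not QUDA or MILC types, and the real interface may track this state differently.

```cpp
#include <cassert>

// Hypothetical stand-in for a precision enum; not QUDA's actual type.
enum class Precision { Half, Single, Double };

// Hypothetical record of what is currently resident on the device.
struct ResidentGauge {
  bool loaded;          // has a gauge field been loaded at all?
  Precision precision;  // precision it was loaded with
};

// Decide whether the gauge field has to be (re)loaded before a solve.
// Reload if nothing is resident, if the caller invalidated the field,
// or if the resident precision differs from the requested one.
constexpr bool needsGaugeReload(const ResidentGauge &resident,
                                bool invalidated,
                                Precision requested) {
  if (!resident.loaded || invalidated) return true;
  return resident.precision != requested;
}

// Scenario from this issue: a double-precision field is resident, it has
// not been invalidated, and a single-precision inversion is requested.
inline void scenarioFromThisIssue() {
  const ResidentGauge resident{true, Precision::Double};
  assert(needsGaugeReload(resident, false, Precision::Single));   // mismatch: reload
  assert(!needsGaugeReload(resident, false, Precision::Double));  // match: reuse
}
```

With a check of this kind, the case described above would trigger a reload instead of reaching the later precision error, without having to force an invalidation on every call.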
Closing this now.
QUDA internal tracking of bug report by @lcosmai https://github.com/milc-qcd/milc_qcd/issues/5
MILC 7.7.13, and probably also 7.8.0, fail in RHMC with QUDA.
To reproduce, build MILC for double precision and enable all QUDA acceleration for HISQ:
run
error message: