Open lcosmai opened 8 years ago
Just to clarify: which MILC version did you use?
I think (https://github.com/milc-qcd/milc_qcd.git Branch:master) corresponds to MILC 7.7.13, not 7.8.0 as mentioned in the bug title.
7.8.0 might well not be compatible with quda 0.8 as there have been quite a few changes affecting MILC.
You are right. I have just changed 7.8.0 to 7.7.13 in the title.
Hi Mathias,
To provide a more stable definition of MILC code versions on github,
last week we created two new branches, milc_qcd-7.7.13 and
milc_qcd-7.8.0. They are supposed to be release versions of the code.
The branch 7.7.13 is closest to the one Leonardo was using, and the
branch 7.8.0 is close to the current master branch. The master branch
is the development branch, so it will continue to evolve. It is
unlikely we will make any changes to the milc_qcd-7.7.13 and
milc_qcd-7.8.0 branches unless they are to fix critical bugs.
Eventually, we will copy the master branch to a new release branch. (I
think this is the model you also prefer.)
Best, Carleton
Hi Carleton,
thanks for the correction. It looks like I was confused here. I will try to check QUDA 0.8 with
and try to reproduce the issue.
Sidenote: For QUDA we use a branch called develop for development and copy that over to a new release. We use master for the most recent release version (currently 0.8). This is to make sure that a git clone gives you a (hopefully) stable quda version.
Mathias
@lcosmai Can you share some more of the surrounding MILC output as well as your input file to MILC? This makes it easier to track down where the error was triggered.
@maddyscientist Not sure whether you are already reading so just wanted to make sure you are aware.
I shared the directory from which the job was launched (https://drive.google.com/folderview?id=0BxE4mI8SH7wsY2lianUyRDlCWG8&usp=sharing). The same directory also contains a README file with more details.
Ok. I managed to reproduce the issue by using the MILC provided sample input
~/milc_qcd/ks_imp_rhmc/test$ ../su3_rhmc_hisq su3_rhmc_hisq.2.sample-in
using quda 0.8 and MILC 7.7.13.
As this might be an issue in either MILC or QUDA, I also created https://github.com/lattice/quda/issues/439 to have a pointer in the QUDA issue tracker.
Setting
prec_pbp 2
seems to be a workaround. Still need to check why this worked with quda 0.7.2.
This will be fixed with quda 0.8.1. For now, please stick to the workaround; see lattice/quda#439.
I compiled the target su3_rhmc_hisq for ks_imp_rhmc in the last stable release of MILC (https://github.com/milc-qcd/milc_qcd.git Branch:master) with quda v0.8.0 (https://github.com/lattice/quda.git Branch:master) using the following Makefile: https://drive.google.com/file/d/0BxE4mI8SH7wsSnZEaDQyeEQzVVE/view?usp=sharing
Then I performed a short test using 1 GPU. The job aborted with the following error message:
ERROR: Solve precision 4 doesn't match gauge precision 8 (rank 0, host node496, interface_quda.cpp:1904 in checkGauge())
last kernel called was (name=N4quda22HeavyQuarkResidualNormI7double37double2S2_EE,volume=4x8x8x8,aux=vol=2048,stride=2304,precision=8)
Note that if I instead use quda v0.7.2, the same test job completes without errors.
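For readers trying to interpret the error message, here is a minimal sketch of the kind of consistency check that appears to be failing. It is not QUDA's or MILC's actual code: the function names and the mapping table are hypothetical. It assumes the numbers in the message are bytes per real number (4 for single, 8 for double) and that MILC's precision flags use 1 for single and 2 for double. Under those assumptions, a measurement requested in single precision against a double-precision gauge field would trip the check, while the `prec_pbp 2` workaround from this thread would not.

```python
# Illustrative sketch only: these names are hypothetical, not QUDA or MILC API.

# Assumption: a MILC-style precision flag (1 = single, 2 = double) is
# translated to a width in bytes before the solver is called.
MILC_PREC_TO_BYTES = {1: 4, 2: 8}


def check_gauge(solve_prec_bytes, gauge_prec_bytes):
    """Raise if the requested solve precision differs from the gauge field precision."""
    if solve_prec_bytes != gauge_prec_bytes:
        raise ValueError(
            f"Solve precision {solve_prec_bytes} doesn't match "
            f"gauge precision {gauge_prec_bytes}"
        )


def run_pbp(prec_pbp, gauge_prec_bytes=8):
    """Mimic a measurement that requests a solve at the prec_pbp precision.

    gauge_prec_bytes defaults to 8 (double), matching the value reported
    in the error message above.
    """
    solve_prec_bytes = MILC_PREC_TO_BYTES[prec_pbp]
    check_gauge(solve_prec_bytes, gauge_prec_bytes)
    return solve_prec_bytes
```

Calling `run_pbp(1)` raises an error with the same wording as the one reported, whereas `run_pbp(2)` passes the check. This only models the symptom; it does not explain why the same input worked with quda 0.7.2.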