etmc / tmLQCD

tmLQCD is a freely available software suite providing a set of tools for lattice QCD simulations. It is mainly an HMC implementation (including PHMC and RHMC) for Wilson, Wilson clover and Wilson twisted mass fermions, together with inverters for different versions of the Dirac operator. The code is fully parallelised and ships with optimisations for various modern architectures, such as commodity PC clusters and the Blue Gene family.
http://www.itkp.uni-bonn.de/~urbach/software.html
GNU General Public License v3.0

Quda work add actions bug measurements #519

Closed simone-romiti closed 2 years ago

simone-romiti commented 2 years ago

QUDA was not working with the following input (2kappaMu not specified):

BeginOperator CLOVER
  CSW = 1.76
  kappa = 0.15
  SolverPrecision = 1e-14
  MaxSolverIterations = 1000
  solver = mg
  UseEvenOdd = yes
  useexternalinverter = quda
  usesloppyprecision = single  
EndOperator

BeginMeasurement CORRELATORS
  MaxSolverIterations = 1000
  Frequency = 1
EndMeasurement

kostrzewa commented 2 years ago

Note that operator CLOVER with 2kappamu = 0.0 implies QUDA_TWIST_NO for the QUDA operator.
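
As a rough illustration of the note above, the choice of twist flavour could be sketched like this. It is a hypothetical helper, not the actual tmLQCD/QUDA interface code: the function name is made up, it assumes a single-flavour operator, it treats an omitted 2kappamu as 0.0, and it only uses the two flavour names that appear in this thread.

```python
def quda_twist_flavor(operator_type: str, two_kappa_mu: float = 0.0) -> str:
    """Return the name of the QUDA twist flavour implied by an operator block.

    Assumption: an omitted 2kappamu is treated as 0.0 (hence the default).
    """
    if operator_type.upper() == "CLOVER" and two_kappa_mu == 0.0:
        # Pure Wilson clover: no twisted mass term at all.
        return "QUDA_TWIST_NO"
    # Assumption: any non-zero twisted mass on a single-flavour operator.
    return "QUDA_TWIST_SINGLET"


if __name__ == "__main__":
    print(quda_twist_flavor("CLOVER"))                # 2kappamu omitted
    print(quda_twist_flavor("CLOVER", 0.0015846837))  # twisted clover
```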

Marcogarofalo commented 2 years ago

With QUDA_TWIST_NO, QUDA complains:

MG level 0 (GPU): ERROR: twist flavors do not match: 1 0 (rank 0, host lnode15.cluster.hiskp, color_spinor_field.cpp:707 in checkField())
MG level 0 (GPU):        last kernel called was (name=N4quda15CopyColorSpinorILi4ELi3ENS_11colorspinor11FloatNOrderIfLi4ELi3ELi4ELb0ELb0EEENS2_IdLi4ELi3ELi2ELb0ELb0EEESt5tupleIJRNS_16ColorSpinorFieldERKS6_19QudaFieldLocation_sPfPKdEEEE,volume=4x8x8x8,aux=GPU-offline,vol=2048,precision=8,order=2,Ns=4,Nc=3vol=2048,precision=4,order=4,Ns=4,Nc=3,PreserveBasis)

With QUDA_TWIST_SINGLET, on the other hand, it works fine and the online measurements agree with those of the host version.

QUDA is probably complaining because it is reusing an old MG setup with mu>0.

kostrzewa commented 2 years ago

> with QUDA_TWIST_NO quda is complaining
>
> Probably QUDA is complaining because it is using an old MG setup with mu>0

okay, but then we need to separate the cases:

1) pure Wilson clover HMC with Wilson clover online measurements
2) twisted mass clover HMC with twisted mass clover measurements

Any other combination does not make sense, although it is possible from the point of view of parameters.

Ideally, the MG setup would be forced to be reset when the operator type changes between HMC and online measurement (i.e., when we're running twisted clover HMC and perform a Wilson clover measurement or vice versa).
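
A minimal sketch of what such a forced reset could look like, assuming the setup is identified by kappa, csw and whether the operator is twisted. `OperatorKey` and `MGSetupCache` are hypothetical names for illustration only, and which parameters really invalidate a setup in tmLQCD/QUDA is an assumption here, not something taken from the code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OperatorKey:
    """Parameters assumed (hypothetically) to identify an MG setup."""
    kappa: float
    csw: float
    twisted: bool  # True when 2kappamu != 0


class MGSetupCache:
    """Tracks which operator the current MG setup was built for."""

    def __init__(self) -> None:
        self._key: Optional[OperatorKey] = None
        self.rebuilds = 0

    def ensure(self, key: OperatorKey) -> bool:
        """Rebuild the setup if the operator changed; return True if rebuilt."""
        if self._key is None or key != self._key:
            self._key = key
            self.rebuilds += 1  # the real setup would be regenerated here
            return True
        return False


if __name__ == "__main__":
    cache = MGSetupCache()
    hmc = OperatorKey(kappa=0.15, csw=1.74, twisted=True)
    meas = OperatorKey(kappa=0.15, csw=1.74, twisted=False)
    print(cache.ensure(hmc))   # first use builds the setup
    print(cache.ensure(hmc))   # same operator, setup is reused
    print(cache.ensure(meas))  # operator type changed, forced reset
```

Keying on the operator type rather than on the value of mu means that a change of mu alone would not trigger a reset in this sketch; only the switch between twisted and untwisted does.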

What I observed, however, is that twisted clover HMC + twisted clover online measurements don't work with the MG, is that correct?

Marcogarofalo commented 2 years ago

> What I observed, however, is that twisted clover HMC + twisted clover online measurements don't work with the MG, is that correct?

I did not see any problem with this combination. More specifically, I used the following input:

BeginOperator CLOVER
  CSW = 1.76
  kappa = 0.15
  2kappamu = 0.0015846837
  SolverPrecision = 1e-14
  MaxSolverIterations = 1000
  solver = mg
  UseEvenOdd = yes
  useexternalinverter = quda
  usesloppyprecision = single
EndOperator

BeginMeasurement CORRELATORS
  MaxSolverIterations = 1000
  Frequency = 1
EndMeasurement

BeginMonomial CLOVERDET
  Timescale = 1
  kappa = 0.15
  2KappaMu = 0.0015846837
  CSW = 1.74
  rho = 0.09353509
  MaxSolverIterations = 1000
  AcceptancePrecision =  1.e-19
  ForcePrecision = 1.e-15
  Name = cloverdetlight
  solver = mg
  useexternalinverter = quda
  usesloppyprecision = single
EndMonomial

BeginMonomial CLOVERDETRATIO
  Timescale = 1
  kappa = 0.15
  2KappaMu = 0.0015846837
  rho = 0.01039279
  rho2 = 0.09353509
  CSW = 1.74
  MaxSolverIterations = 1000
  AcceptancePrecision =  1.e-19
  ForcePrecision = 1.e-16
  Name = cloverdetratio1light
  solver = mg
  useexternalinverter = quda
  usesloppyprecision = single
EndMonomial

kostrzewa commented 2 years ago

> I did not see any problem with this combination, more specific

The residual check after the online measurement comes out correctly?

[...]
# Inversion done in N iterations, squared residue = 3.419632e+04!
# Inversion done in X sec. 
# : Time for correlators_measurement X s level: 1 proc_id: 0 /HMC/correlators_measurement
[...]
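
To check this automatically, one could scan the log for the residue lines shown above. The sketch below only assumes the `squared residue = <number>` format from this excerpt; the function names and the `target` threshold are hypothetical, and whether SolverPrecision should be compared to an absolute or a relative squared residue is left open.

```python
import re

# Matches e.g. "squared residue = 3.419632e+04!" and captures the number.
RESIDUE_RE = re.compile(r"squared residue = ([0-9.eE+-]+)")


def squared_residues(log_text: str) -> list:
    """Return all squared residues reported in the log, in order."""
    return [float(m.group(1)) for m in RESIDUE_RE.finditer(log_text)]


def residues_ok(log_text: str, target: float) -> bool:
    """True only if at least one residue was found and all are <= target."""
    values = squared_residues(log_text)
    return bool(values) and all(v <= target for v in values)


if __name__ == "__main__":
    sample = (
        "# Inversion done in N iterations, squared residue = 3.419632e+04!\n"
        "# Inversion done in X sec.\n"
    )
    print(squared_residues(sample))
    print(residues_ok(sample, target=1e-14))
```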

kostrzewa commented 2 years ago

About your input file: make sure that you have the same value of csw everywhere. I think the setup will always be rebuilt when you don't.
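
A quick way to spot such a mismatch is to collect the csw value from every Begin.../End... block of the input file. This is a small sketch, not part of tmLQCD: the function names are made up, the parser only handles the simple `key = value` layout seen in this thread, and it assumes keys can be compared case-insensitively.

```python
def csw_by_block(input_text: str) -> list:
    """Return (block header, csw) pairs for every block that sets csw."""
    values = []
    current = None
    for raw in input_text.splitlines():
        line = raw.strip()
        lower = line.lower()
        if lower.startswith("begin"):
            current = line[len("begin"):].strip()  # e.g. "Operator CLOVER"
        elif lower.startswith("end"):
            current = None
        elif current is not None and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key.lower() == "csw":
                values.append((current, float(value)))
    return values


def csw_is_consistent(input_text: str) -> bool:
    """True if all blocks that set csw use the same value."""
    return len({value for _, value in csw_by_block(input_text)}) <= 1


if __name__ == "__main__":
    sample = """
BeginOperator CLOVER
  CSW = 1.76
  kappa = 0.15
EndOperator

BeginMonomial CLOVERDET
  kappa = 0.15
  CSW = 1.74
EndMonomial
"""
    for block, csw in csw_by_block(sample):
        print(block, csw)
    print("consistent:", csw_is_consistent(sample))
```

Run on the input posted above, this would flag the operator block (CSW = 1.76) against the two monomials (CSW = 1.74).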