etmc / tmLQCD

tmLQCD is a freely available software suite providing a set of tools to be used in lattice QCD simulations. This is mainly a HMC implementation (including PHMC and RHMC) for Wilson, Wilson Clover and Wilson twisted mass fermions and inverter for different versions of the Dirac operator. The code is fully parallelised and ships with optimisations for various modern architectures, such as commodity PC clusters and the Blue Gene family.
http://www.itkp.uni-bonn.de/~urbach/software.html
GNU General Public License v3.0
32 stars 47 forks source link

Quda work add actions wflow #516

Closed Marcogarofalo closed 2 years ago

Marcogarofalo commented 2 years ago

Wflow is working but the output goes to stdout

Marcogarofalo commented 2 years ago

We do not know how to get in tmLQCD the variables like QudaGaugeObservableParam param computed in the Wflow

Marcogarofalo commented 2 years ago

This version works only with https://github.com/qcdcode/quda/commit/5028c3a8a3ea0655307339893084633de11c76ee

kostrzewa commented 2 years ago

@Marcogarofalo thanks. Any changes to QUDA should be based on the develop branch though (which contains all of feature/ndeg-twisted-clover), just to make merging them easier.

Besides this, I think it would be better if the array of QudaGaugeObservableParam were created in tmLQCD (or any other external program using the lib) and ~passed to performWFlownStep (with the existing behaviour the default when one passes nullptr)~. In any case I doubt that Kate would accept the use of new and delete.

kostrzewa commented 2 years ago

Ah, wait, I understand, sorry. Let's try it like this. I think, however, that newQudaGaugeParam should be called on all elements to make sure that any future changes don't break the initialisation.

Marcogarofalo commented 2 years ago

@Marcogarofalo thanks. Any changes to QUDA should be based on the develop branch though (which contains all of feature/ndeg-twisted-clover), just to make merging them easier.

ok I will try

Besides this, I think it would be better if the array of QudaGaugeObservableParam were created in tmLQCD (or any other external program using the lib). In any case I doubt that Kate would accept the use of new and delete.

Since we could not implement a dofoult value for a parameter in c we add a function in quda performWFlownStep_param that is the one really doing the computation and the old one performWFlownStep https://github.com/qcdcode/quda/blob/5028c3a8a3ea0655307339893084633de11c76ee/lib/interface_quda.cpp#L5597 is initializing a structure param and calling the other function performWFlownStep_param, so that we do not lose the backward compatibility

should we allocate the memory as we did in tmLQCD?

QudaGaugeObservableParam *param;
  param = (QudaGaugeObservableParam*) malloc(sizeof(QudaGaugeObservableParam) * (n_steps+1));
  for (int i=0;i<n_steps+1; i++){
    param[i] = newQudaGaugeObservableParam();   
    param[i].compute_plaquette = QUDA_BOOLEAN_TRUE;
    param[i].compute_qcharge = QUDA_BOOLEAN_TRUE; 
  }
  ...
  free(param)

Ah, wait, I understand, sorry. Let's try it like this. I think, however, that newQudaGaugeParam should be called on all elements to make sure that any future changes don't break the initialisation.

We add the initialization of the params with newQudaGaugeParam() in a following commit. https://github.com/qcdcode/quda/blob/5028c3a8a3ea0655307339893084633de11c76ee/lib/interface_quda.cpp#L5602

kostrzewa commented 2 years ago

Since we could not implement a dofoult value for a parameter in c we add a function in quda performWFlownStep_param that is the one really doing the computation and the old one performWFlownStep

Yes, I agree. I also agree with the new and delete pairing. I had misinterpreted your intent.

Having said that, the gradient flow (and other gauge smearing functions) in QUDA are quite odd in that they perform the measurements only as informative output. I think your miniminal modification proposal is a good way to see if Kate would be willing to accept further changes to make the output independent of the verbosity setting.

kostrzewa commented 2 years ago

Does this now produce the same output as tmLQCD's gradient flow? (up to the difference in the topological charge at small flow times discussed in https://github.com/lattice/quda/issues/959)

Marcogarofalo commented 2 years ago

Does this now produce the same output as tmLQCD's gradient flow? (up to the difference in the topological charge at small flow times discussed in lattice/quda#959)

Yes, also there is the addition of the QUDA output depending on its verbosity.

kostrzewa commented 2 years ago

Perfect, thank you. I'll merge this in once the corresponding commit lands in QUDA. Alternatively, you can also add some guards to enable/disable this at the configuration stage so being able to compile does not depend on the QUDA-PR being merged.

Marcogarofalo commented 2 years ago

I did not find I way to get the current verbosity of quda, that is why I add the function _setVerbosityQuda(). @kostrzewa @simone-romiti do you have something better?

simone-romiti commented 2 years ago

Maybe I misunderstood, but in quda/lib/util_quda.cpp you find the setVerbosity() and getVerbosity() functions. Is that what you're looking for?

kostrzewa commented 2 years ago

There's unfortunately no corresponding C interface function for getVerbosity(). Maybe this is something that we actually need to change in QUDA.

However, looking at our quda_interface.c, we set the general verbosity to QUDA_SUMMARIZE independently of the debug level. At this stage then, I think you can do what you do, although I would explicitly set QUDA_SILENT before the gradient flow if g_debug_level <= 2 and leave it at its default in all other cases.

Sorry, strike that, that was the case before quda_work_add_actions. I think what you're doing is fine.

kostrzewa commented 2 years ago

@Marcogarofalo sorry for taking so long... I'm now testing this (against the current develop branch which includes your modifications and the subsequent updates by Kate, Dean and Evan).

I see the following issue:

# TM_QUDA: Called _loadGaugeQuda for gauge_id: 0.051000
# TM_QUDA: Theta boundary conditions will be applied to gauge field
# TM_QUDA: Time for reorder_gauge_toQuda 1.085192e-02 s level: 3 proc_id: 0 /HMC/gradient_flow_measurement/compute_WFlow_quda/reorder_gauge_toQuda
# TM_QUDA: Time for loadGaugeQuda 1.625766e-02 s level: 3 proc_id: 0 /HMC/gradient_flow_measurement/compute_WFlow_quda/loadGaugeQuda
performWFlowQuda: ERROR: Invalid field location (rank 1, host cassiopeia, lattice_field.cpp:145 in void quda::LatticeField::create(const quda::LatticeFieldParam&)())
performWFlowQuda:        last kernel called was (name=N4quda9GaugePlaqIdLi3EL21QudaReconstructType_s18EEE,volume=16x16x16x20,aux=GPU-offline,nParity=2,vol=81920stride=40960precision=8geometry=4Nc=3r=0002)
performWFlowQuda: ERROR: Invalid field location (rank 0, host cassiopeia, lattice_field.cpp:145 in void quda::LatticeField::create(const quda::LatticeFieldParam&)())
performWFlowQuda:        last kernel called was (name=N4quda9GaugePlaqIdLi3EL21QudaReconstructType_s18EEE,volume=16x16x16x20,aux=GPU-offline,nParity=2,vol=81920stride=40960precision=8geometry=4Nc=3r=0002)
--------------------------------------------------------------------------

did you encounter this by any chance? I'll investigate until the end of the day now.

kostrzewa commented 2 years ago

It appears to be a regression in QUDA. I'll git-bisect and submit an issue.

kostrzewa commented 2 years ago

Works now thanks to dbc1e288315201f35f0069c60e4f936daad54d22 in QUDA.