Quda work add actions wflow

Marcogarofalo commented 2 years ago

Wflow is working but the output goes to stdout

Marcogarofalo commented 2 years ago

We do not know how to get in tmLQCD the variables like QudaGaugeObservableParam param computed in the Wflow

Marcogarofalo commented 2 years ago

This version works only with https://github.com/qcdcode/quda/commit/5028c3a8a3ea0655307339893084633de11c76ee

kostrzewa commented 2 years ago

@Marcogarofalo thanks. Any changes to QUDA should be based on the develop branch though (which contains all of feature/ndeg-twisted-clover), just to make merging them easier.

Besides this, I think it would be better if the array of QudaGaugeObservableParam were created in tmLQCD (or any other external program using the lib) and ~passed to performWFlownStep (with the existing behaviour the default when one passes nullptr)~. In any case I doubt that Kate would accept the use of new and delete.

kostrzewa commented 2 years ago

Ah, wait, I understand, sorry. Let's try it like this. I think, however, that newQudaGaugeParam should be called on all elements to make sure that any future changes don't break the initialisation.

Marcogarofalo commented 2 years ago

@Marcogarofalo thanks. Any changes to QUDA should be based on the develop branch though (which contains all of feature/ndeg-twisted-clover), just to make merging them easier.

ok I will try

Besides this, I think it would be better if the array of QudaGaugeObservableParam were created in tmLQCD (or any other external program using the lib). In any case I doubt that Kate would accept the use of new and delete.

Since we could not implement a dofoult value for a parameter in c we add a function in quda performWFlownStep_param that is the one really doing the computation and the old one performWFlownStep https://github.com/qcdcode/quda/blob/5028c3a8a3ea0655307339893084633de11c76ee/lib/interface_quda.cpp#L5597 is initializing a structure param and calling the other function performWFlownStep_param, so that we do not lose the backward compatibility

should we allocate the memory as we did in tmLQCD?

QudaGaugeObservableParam *param;
  param = (QudaGaugeObservableParam*) malloc(sizeof(QudaGaugeObservableParam) * (n_steps+1));
  for (int i=0;i<n_steps+1; i++){
    param[i] = newQudaGaugeObservableParam();   
    param[i].compute_plaquette = QUDA_BOOLEAN_TRUE;
    param[i].compute_qcharge = QUDA_BOOLEAN_TRUE; 
  }
  ...
  free(param)

Ah, wait, I understand, sorry. Let's try it like this. I think, however, that newQudaGaugeParam should be called on all elements to make sure that any future changes don't break the initialisation.

We add the initialization of the params with newQudaGaugeParam() in a following commit. https://github.com/qcdcode/quda/blob/5028c3a8a3ea0655307339893084633de11c76ee/lib/interface_quda.cpp#L5602

kostrzewa commented 2 years ago

Since we could not implement a dofoult value for a parameter in c we add a function in quda performWFlownStep_param that is the one really doing the computation and the old one performWFlownStep

Yes, I agree. I also agree with the new and delete pairing. I had misinterpreted your intent.

Having said that, the gradient flow (and other gauge smearing functions) in QUDA are quite odd in that they perform the measurements only as informative output. I think your miniminal modification proposal is a good way to see if Kate would be willing to accept further changes to make the output independent of the verbosity setting.

kostrzewa commented 2 years ago

Does this now produce the same output as tmLQCD's gradient flow? (up to the difference in the topological charge at small flow times discussed in https://github.com/lattice/quda/issues/959)

Marcogarofalo commented 2 years ago

Does this now produce the same output as tmLQCD's gradient flow? (up to the difference in the topological charge at small flow times discussed in lattice/quda#959)

Yes, also there is the addition of the QUDA output depending on its verbosity.

kostrzewa commented 2 years ago

Perfect, thank you. I'll merge this in once the corresponding commit lands in QUDA. Alternatively, you can also add some guards to enable/disable this at the configuration stage so being able to compile does not depend on the QUDA-PR being merged.

Marcogarofalo commented 2 years ago

I did not find I way to get the current verbosity of quda, that is why I add the function _setVerbosityQuda(). @kostrzewa @simone-romiti do you have something better?

simone-romiti commented 2 years ago

Maybe I misunderstood, but in quda/lib/util_quda.cpp you find the setVerbosity() and getVerbosity() functions. Is that what you're looking for?

kostrzewa commented 2 years ago

There's unfortunately no corresponding C interface function for getVerbosity(). Maybe this is something that we actually need to change in QUDA.

However, looking at our quda_interface.c, we set the general verbosity to QUDA_SUMMARIZE independently of the debug level. At this stage then, I think you can do what you do, although I would explicitly set QUDA_SILENT before the gradient flow if g_debug_level <= 2 and leave it at its default in all other cases.

Sorry, strike that, that was the case before quda_work_add_actions. I think what you're doing is fine.

kostrzewa commented 2 years ago

@Marcogarofalo sorry for taking so long... I'm now testing this (against the current develop branch which includes your modifications and the subsequent updates by Kate, Dean and Evan).

I see the following issue:

# TM_QUDA: Called _loadGaugeQuda for gauge_id: 0.051000
# TM_QUDA: Theta boundary conditions will be applied to gauge field
# TM_QUDA: Time for reorder_gauge_toQuda 1.085192e-02 s level: 3 proc_id: 0 /HMC/gradient_flow_measurement/compute_WFlow_quda/reorder_gauge_toQuda
# TM_QUDA: Time for loadGaugeQuda 1.625766e-02 s level: 3 proc_id: 0 /HMC/gradient_flow_measurement/compute_WFlow_quda/loadGaugeQuda
performWFlowQuda: ERROR: Invalid field location (rank 1, host cassiopeia, lattice_field.cpp:145 in void quda::LatticeField::create(const quda::LatticeFieldParam&)())
performWFlowQuda:        last kernel called was (name=N4quda9GaugePlaqIdLi3EL21QudaReconstructType_s18EEE,volume=16x16x16x20,aux=GPU-offline,nParity=2,vol=81920stride=40960precision=8geometry=4Nc=3r=0002)
performWFlowQuda: ERROR: Invalid field location (rank 0, host cassiopeia, lattice_field.cpp:145 in void quda::LatticeField::create(const quda::LatticeFieldParam&)())
performWFlowQuda:        last kernel called was (name=N4quda9GaugePlaqIdLi3EL21QudaReconstructType_s18EEE,volume=16x16x16x20,aux=GPU-offline,nParity=2,vol=81920stride=40960precision=8geometry=4Nc=3r=0002)
--------------------------------------------------------------------------

did you encounter this by any chance? I'll investigate until the end of the day now.

kostrzewa commented 2 years ago

It appears to be a regression in QUDA. I'll git-bisect and submit an issue.

kostrzewa commented 2 years ago

Works now thanks to dbc1e288315201f35f0069c60e4f936daad54d22 in QUDA.

etmc / tmLQCD

Quda work add actions wflow #516