QMCPACK / qmcpack

Main repository for QMCPACK, an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids with full performance portable GPU support
http://www.qmcpack.org
Other
298 stars 139 forks source link

Can we simplify SpaceGrid input? #5222

Open ye-luo opened 1 day ago

ye-luo commented 1 day ago

I'm going through the manual section for space grid. It is new to me and treat me as a user. Input needs to be designed simple, straightforward, as much self-explained as possible. In the case of Cartesian.

<estimator type="EnergyDensity" name="EDcell" dynamic="e" static="ion0">
   <spacegrid coord="cartesian">
     <origin p1="zero"/>
     <axis p1="a1" scale=".5" label="x" grid="-1 (.05) 1"/>
     <axis p1="a2" scale=".5" label="y" grid="-1 (.1) 1"/>
     <axis p1="a3" scale=".5" label="z" grid="-1 (.1) 1"/>
   </spacegrid>
</estimator>

I first hit this p1 attribute. Then I realized that I need to understand reference_points which provides a set of points for later use in specifying the origin and coordinate axes needed to construct a spatial histogramming grid. There are a few predefined points like "zero", "a1", "a2", "a3" which are defined as points not vectors.

back to p1, based on the documentation of origin

p1/p2/fraction: The location of the origin is set to p1+fraction*(p2-p1). If only p1 is provided, the origin is at p1

Q1 how to specify an arbitrary point simply via "x y z" as origin? Explicit reference_points can fix this issue and thus there is no need to "p1/p2/fraction". Can we simplify this?

next is axis, I assume axis being a vector. Based on the definition of p1 it is a point. Q2 Is the vector defined from origin to p1 or "zero" to p1 when origin is not "zero"?

I feel better to use

     <origin location="zero"/> ! this should be default and can be omitted
     <axis desitination="a1" label="x"/> ! we define one unit on the 'x' axis as from the specified origin to destination. 

grid and scale. manual says "The allowed grid points fall in the range [-1,1] for label=x/y/z or [0,1] for r/phi/theta." I found it [-1. 1) a bit unusual choice and it also requires additional scale=".5" to address the rescaling. Q3 Why not using [0, 1)? Q4 manual says "scale" is only used when p2 exists. "The axis vector is set to p1+scale*(p2-p1). If only p1 is provided, the axis vector is p1." however, I feel scale=0.5 is necessary make "[-1 1)" representing one unit cell. I feel a lot simpler with [0, 1) and scale can be dropped.

The manual says "A grid of 10 evenly spaced points between 0 and 1 can be requested equivalently by grid="0 (0.1) 1" or grid="0 (10) 1."". Q5 can we stick to one style instead of two confusing ways? My pick is

     <axis desitination="a1" grid="0 (0.1) 1" /> 

Manual also mentions "Piecewise uniform grids covering portions of the range are supported, e.g., grid="-0.7 (10) 0.0 (20) 0.5."" Q6 Is this supported?

Then the spherical example "Energy density estimator accumulated within spheres of radius 6.9 Bohr centered on the first and second atoms in the ion0 particleset."

<estimator type="EnergyDensity" name="EDatom" dynamic="e" static="ion0">
  <reference_points coord="cartesian">
    r1 1 0 0
    r2 0 1 0
    r3 0 0 1
  </reference_points>
  <spacegrid coord="spherical">
    <origin p1="ion01"/>
    <axis p1="r1" scale="6.9" label="r"     grid="0 1"/>
    <axis p1="r2" scale="6.9" label="phi"   grid="0 1"/>
    <axis p1="r3" scale="6.9" label="theta" grid="0 1"/>
  </spacegrid>
  <spacegrid coord="spherical">
    <origin p1="ion02"/>
    <axis p1="r1" scale="6.9" label="r"     grid="0 1"/>
    <axis p1="r2" scale="6.9" label="phi"   grid="0 1"/>
    <axis p1="r3" scale="6.9" label="theta" grid="0 1"/>
  </spacegrid>
</estimator>

I'm further confused, a) r1, r2, r3 doesn't seem to be points but more like vectors. b) What is the definition of "scale", why "phi", "Theta" needs it?

Can we put some efforts to well define our input and then we get less struggle with implementing the feature?

aannabe commented 1 day ago

I also recently went through this part of the docs and found it confusing.

prckent commented 1 day ago

Edit to add: Clearly the answer is "yes", but whether we should is a more important question.

===

Original:

I agree that the input is not ideal. Possibly the manual is not quite right as well. There could be "off by one" errors and similar ambiguities in the write up.

That said, do we have a specific examples where someone can not use this code? Unless this is blocking someone we have other more in-demand things to get on with, and we should not be modifying this input or code. If we do have an example or two of something that can not be done, then we plan a route to improve the manual, e.g. by including a working example for the problematic cases. Perhaps there are other recently used examples that can be added -- for sure regular and spherical grids are used. This might already be clarifying enough.

Probably best to have a discussion with the original authors on what was intended etc. and try to follow/fix the logic and code to match if that is the bottleneck. @ye-luo Were you looking at this because of #5212 or for other reasons?

ye-luo commented 13 hours ago

Since estimators are re-implemented from scratch, I feel it is chance to revisit and probably minimizing supported features.

@prckent yes. When reviewing #5212, I feel the need to understand what this feature is and found the manual difficult to follow. I raised above questions to figure out what is the intention of the original author. With better understand, we may find ways to cut corners when consolidating the implementation of new estimators. If we can settle a few core examples and write-up the manual, implementation can be added quickly.

PDoakORNL commented 3 hours ago

I didn't re-implement all this grid input parsing and handling I just ported and refactored it. I wrote some unit tests for it since it had none. I can't say I understand all the ins and outs the input. I just tried to the best of my ability to organize check for obviously invalid input and not degraded the legacy capabilities. @jtkrogel and the users currently getting started using this should get consulted on what functionality is really core.

I think this generally a better grid input and implementation than the grids in SpinDensity and One bodies uniform grid integration. But the input is clearly more work if all you want is just a uniform grid over the lattice cell. Still I think the verbose representation is better since it could be used uniformly and nexus is going to be emitting it anyway. So we don't get anything out of being able to write

<grid>10 10 10<grid>

instead of

<spacegrid coord="cartesian">
     <origin p1="zero"/>
     <axis p1="a1" label="x" grid="-0.5 (10) 0.5"/>
     <axis p1="a2" label="y" grid="-0.5 (10) 0.5"/>
     <axis p1="a3" label="z" grid="-0.5 (10) 0.5"/>
 </spacegrid>

Although maybe it is `grid="0 (10) 1.0".

Both SpinDensity and MagneticDensity use grid, dr, corner, center to specify the grid usedwhich I'm not fond of though since they scatter the grid spec across four loose tags in the estimator. OneBodyDensityMatrix usesscale, center, points` some to none which are used depending on the integrator.

I think its a useful capability to be able to accumulate on something other than a cartesian grid, its certainly convenient to have a set of reference points available. Unfortunately the notation for all this isn't 100% consistent and is different across fields so I think the documentation would really benefit from clarity. I think there are some conceptual collisions with respect to p1,p2,scaling that makes the whole thing extra confusing. But its not a trivial job to figure out the corner cases or document clearly. I would certainly take a pass over the documentation pre 4.0.0 to add what I've learned about this but I think the amount of time we spend on it needs to be carefully controlled. (looking at this all again and writing this cost 45 minutes)