The calculator, its CLI parsing handler is almost fully rewritten, with a bunch of new abstractions

The original intention of this branch is to refactor calculator.py, to achive a clearer style. However, the change grow big, they are introduced as follows:

Using 3 sets of classes for 3 stages of calculations

The calculation of QHA comprises of 3 stages:

The calculation of Helmholtz free energy, results in F(T, V)
Refine the grid, and achive a set of (T, V) functions
The conversion and calculation of addition (T, P) fields

However, the original calculation is mixed in a huge class, using {PROPERTY}_{TV/TP}_{UNIT} style to seperate different properties. What's more, the original code use inheritance for single configuration and multiple configuration calculations, which results in that properties that have same name in class and its sub class could have different dimensions, which is really error prone.

To address this issue, I propose that we could use 3 wrappers to represent each stage of calculation:

HelmholtzFreeEnergyCalculator for Helmholtz free energy calculation, which has 3 sub classes
- SingleConfigurationHelmholtzFreeEnergyCalculator for single configuration
- DiversePhDOSHolmholtzFreeEnergyCalculator for multiple configurations with diverse phonon DOS
- IdenticalPhDOSHolmholtzFreeEnergyCalculator for multiple configurations with identical phonon DOS, as a sub class of DiversePhDOSHolmholtzFreeEnergyCalculator
TemperatureVolumeFieldCalculator for second stage, the calculation of (T, V) field
TemperaturePressureFieldAdapter for the third stage, the calculation of (T, P) field

All these classes has a similar set of property API, namely

All thermo function fields are named as its plural form, e.g. helmholtz_free_energies for Helmholtz free energy field, and isothermal_bulk_moduli for isothermal bulk modulus field ($\beta_t$)
All 1D arrays are named XXX_array where XXX is singular form, e.g. volume_array, pressure_array and temperature_array, original disired_pressure properties are renamed this way.

I have also removed the sampled pressure and sampled temperature because this is actually for output, this should not be mixed in the calculation code. These are implemented in the TP/TV field writter mentioned in latter sections.

Use classes for handling input reading

To address the issue of sub class could have different dimensions, I propose to use classes for clarify the abstraction. Therefore I introduce StructureConfiguration for each configuration and PressureSpecificData for each pressure under each configuration. Therefore,

Each single configuration Helmholtz free energy calculator store one StructureConfiguration object
Each multiple configuration Helmholtz free energy calculator store a list of StructureConfiguration objects
Each StructureConfiguration object stores a set of PressureSpecificData objects.

Using field writers to handle TP/TV field writing

Because we having the above abstractions, we are able to wrap the TP and TV field writing process in several classes, with accept the calculator instance only, and write to file with given unit, variable name, filename and sample rate.

These are the ResultsWriter, as the parent class, FieldResultsWriter to handle field properties, TVFieldResultsWriter and TPFieldResultsWriter for TV and TP fields respectively.

Using the `Pint` package for unit handling

All properties mentioned above are now wrapped in the the Pint unit wrapper. To get the original value, one needs to invoke .magnitude. The use of Pint could be a little wierd, because I actually use a Singleton design pattern because Pint allows only unit comparison within the same instance of UnitRegistry. This is implemented in the QHAUnit.

Internally, the units are base on Rydberg / bohr system, so there should be no need for unit conversion, getting the magnitude of data could be achieve by invoking .magnitude property.

When doing output, one needs to specify the unit by calling .to("m") or .to(units.m) first then call .magnitude to do the unit conversion first.

Pint by default does not define the Rydberg unit, so I tried to define them. There are two ways of defining it: in terms of eV and in terms of J, the achive different results (as is discussed in the following sections). But, because the original precision is not very high, since that Pint follows the standard given by NIST, I also need to define eV of J. We should double check the precision, because it is either high precision or maintain a consistent presicion. There is still space for discussion.

`PerFormulaUnit` and `PerMole` for per formula unit and per mole properties

In original code, heat capacities have J/(mol·K) unit, however, other energies could be calculate in a per mole or per formula unit manner. So it doesn't make sense to give only the heat capacities have this previlage, I therfore propose to use a PerFormulaUnit and PerMole for per formula unit and per mole properties.

Performance

I have not observed significant increase in calculation time after adding the abstraction layers. It could even be speeding up a little bit.

Difference in results

The difference in result is cause by the differnce in defination of Rydberg unit. For the calculation of the silicon example:

When the Rydberg unit is defined interms of electron volt:
- $F$, $G$, $H$, $U$, $\alpha$ remains the same
- $\beta_T$ and $C_V$, $p$, $V$ could remains the same in the first seven digits
- $C_p$, $\beta_s$, $\gamma$ remains the same in the first five digits.
If we define Rydberg in terms of joule:
- $G$, $H$, $U$, $\alpha$, $\beta_T$ and $C_V$, $p$, $V$ remains the same
- $F$, could remains the same in the first seven digits
- $C_p$, $\beta_s$, $\gamma$ remains the same in the first five digits.

I guess the results are acceptable.

However, for Ice VII example, $C_p$, $\beta_s$, $\gamma$ varies too much, other stuff remains OK. You should definitely try these out.

Change in `settings.yaml`

I propose that output filename now should be specified in settings.yaml to allow more flexibility. Units of output could be specified alongside.

Further work

We could rewrite plot part of the program.
Validation part of code is implemented, but should be double checked.
Units conversion, precision?
Comments and docline

MineralsCloud / qha