This branch was originally intended to refactor calculator.py for a clearer style. However, the change grew large; its parts are introduced as follows:
Using 3 sets of classes for 3 stages of calculations
The QHA calculation comprises 3 stages:
The calculation of the Helmholtz free energy, which results in F(T, V)
The refinement of the grid, which yields a set of (T, V) functions
The conversion to, and calculation of, additional (T, P) fields
However, the original calculation is mixed into one huge class that uses a {PROPERTY}_{TV/TP}_{UNIT} naming style to separate different properties. What's more, the original code uses inheritance for single-configuration and multiple-configuration calculations, so properties with the same name in a class and its subclass can have different dimensions, which is really error-prone.
To address this issue, I propose using 3 wrappers, one for each stage of the calculation:
HelmholtzFreeEnergyCalculator for the Helmholtz free energy calculation, which has 3 subclasses
SingleConfigurationHelmholtzFreeEnergyCalculator for a single configuration
DiversePhDOSHolmholtzFreeEnergyCalculator for multiple configurations with diverse phonon DOS
IdenticalPhDOSHolmholtzFreeEnergyCalculator for multiple configurations with identical phonon DOS, as a subclass of DiversePhDOSHolmholtzFreeEnergyCalculator
TemperatureVolumeFieldCalculator for the second stage, the calculation of the (T, V) field
TemperaturePressureFieldAdapter for the third stage, the calculation of the (T, P) field
All these classes have a similar property API, namely:
All thermodynamic function fields are named in the plural form, e.g. helmholtz_free_energies for the Helmholtz free energy field and isothermal_bulk_moduli for the isothermal bulk modulus field ($\beta_T$)
All 1D arrays are named XXX_array, where XXX is the singular form, e.g. volume_array, pressure_array and temperature_array; the original disired_pressure properties are renamed this way.
I have also removed the sampled pressure and sampled temperature, because they exist only for output and should not be mixed into the calculation code. They are now implemented in the TP/TV field writers described in a later section.
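To make the three-stage layering and the naming convention concrete, here is a minimal sketch in plain Python. It is not the code in this branch: ToyHelmholtzFreeEnergyCalculator, the constructor signatures, the pressures and volumes properties, and the _interpolate helper are all assumptions made for illustration, and the free energy is a toy function rather than a real QHA result. Only the class names TemperatureVolumeFieldCalculator and TemperaturePressureFieldAdapter and the *_array / plural-field naming come from the description above.

```python
from typing import Callable, List

Field = List[List[float]]  # field[i][j]: i indexes the first axis, j the second


class ToyHelmholtzFreeEnergyCalculator:
    """Stage 1 stand-in (hypothetical): evaluates F(T, V) on the input grid."""

    def __init__(self, temperature_array: List[float], volume_array: List[float],
                 free_energy: Callable[[float, float], float]):
        self.temperature_array = temperature_array
        self.volume_array = volume_array
        self._free_energy = free_energy

    @property
    def helmholtz_free_energies(self) -> Field:
        return [[self._free_energy(t, v) for v in self.volume_array]
                for t in self.temperature_array]


class TemperatureVolumeFieldCalculator:
    """Stage 2 stand-in: derives further (T, V) fields from F(T, V)."""

    def __init__(self, helmholtz: ToyHelmholtzFreeEnergyCalculator):
        self.temperature_array = helmholtz.temperature_array
        self.volume_array = helmholtz.volume_array
        self.helmholtz_free_energies = helmholtz.helmholtz_free_energies

    @property
    def pressures(self) -> Field:
        # p = -(dF/dV)_T, estimated with finite differences along each row
        # (central in the interior, one-sided at the two ends of the grid).
        v = self.volume_array
        last = len(v) - 1
        return [[-(row[min(j + 1, last)] - row[max(j - 1, 0)])
                 / (v[min(j + 1, last)] - v[max(j - 1, 0)])
                 for j in range(len(v))]
                for row in self.helmholtz_free_energies]


def _interpolate(xs: List[float], ys: List[float], x: float) -> float:
    """Linear interpolation of y(x); xs must be monotonic and bracket x."""
    for x0, y0, x1, y1 in zip(xs, ys, xs[1:], ys[1:]):
        if min(x0, x1) <= x <= max(x0, x1):
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("pressure outside the range covered by the volume grid")


class TemperaturePressureFieldAdapter:
    """Stage 3 stand-in: re-expresses (T, V) fields on a (T, P) grid."""

    def __init__(self, tv: TemperatureVolumeFieldCalculator,
                 pressure_array: List[float]):
        self._tv = tv
        self.temperature_array = tv.temperature_array
        self.pressure_array = pressure_array

    @property
    def volumes(self) -> Field:
        # For each temperature, find V at every requested pressure.
        return [[_interpolate(p_row, self._tv.volume_array, p)
                 for p in self.pressure_array]
                for p_row in self._tv.pressures]


# Toy input: F = (V - 10)^2 / 2, independent of temperature.
stage1 = ToyHelmholtzFreeEnergyCalculator(
    [300.0], [8.0, 9.0, 10.0, 11.0, 12.0], lambda t, v: 0.5 * (v - 10.0) ** 2)
stage2 = TemperatureVolumeFieldCalculator(stage1)
stage3 = TemperaturePressureFieldAdapter(stage2, pressure_array=[0.5, 0.0])
```

Each stage only reads the public properties of the previous one, so a field always has the dimensions implied by the axes of the object it lives on.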
Use classes for handling input reading
To address the issue that a subclass could have different dimensions, I propose using classes to clarify the abstraction. I therefore introduce StructureConfiguration for each configuration and PressureSpecificData for each pressure under each configuration, so that:
Each single-configuration Helmholtz free energy calculator stores one StructureConfiguration object
Each multiple-configuration Helmholtz free energy calculator stores a list of StructureConfiguration objects
Each StructureConfiguration object stores a set of PressureSpecificData objects.
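The containment described above could look roughly like the following sketch. The two class names come from this PR, but every field name (volume, static_energy, frequencies, degeneracy, pressure_specific_data) and the volume_array property are assumptions for illustration and may not match the actual attributes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PressureSpecificData:
    """Data for one pressure of one configuration (field names are hypothetical)."""
    volume: float
    static_energy: float
    frequencies: List[List[float]] = field(default_factory=list)  # one row per q-point


@dataclass
class StructureConfiguration:
    """One configuration, holding its per-pressure data (field names are hypothetical)."""
    degeneracy: int = 1
    pressure_specific_data: List[PressureSpecificData] = field(default_factory=list)

    @property
    def volume_array(self) -> List[float]:
        # A 1D array in the XXX_array naming style, built from the per-pressure data.
        return [data.volume for data in self.pressure_specific_data]


configuration = StructureConfiguration(
    degeneracy=1,
    pressure_specific_data=[
        PressureSpecificData(volume=120.0, static_energy=-10.0),
        PressureSpecificData(volume=110.0, static_energy=-9.5),
    ],
)
```

A single-configuration calculator would then hold one such object and a multiple-configuration calculator a list of them, so the number of configurations is visible in the type rather than hidden in an array dimension.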
Using field writers to handle TP/TV field writing
With the above abstractions in place, we can wrap the TP and TV field writing process in several classes, which accept only the calculator instance and write to a file with a given unit, variable name, filename and sample rate.
These are ResultsWriter as the parent class, FieldResultsWriter to handle field properties, and TVFieldResultsWriter and TPFieldResultsWriter for TV and TP fields respectively.
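As a rough illustration of how such a writer hierarchy could be organised, the sketch below uses the four class names from this PR, but the write signature, the row_axis / column_axis attributes and the tab-separated layout are assumptions. It also writes to a stream instead of a filename, skips unit conversion, and applies the sample rate to the temperature axis only, all to keep the example short.

```python
import io
from types import SimpleNamespace
from typing import TextIO


class ResultsWriter:
    """Parent class: holds only the calculator whose results are written."""

    def __init__(self, calculator):
        self.calculator = calculator


class FieldResultsWriter(ResultsWriter):
    """Writes one 2D field; subclasses name the two axes on the calculator."""

    row_axis = ""     # calculator attribute holding the row coordinates
    column_axis = ""  # calculator attribute holding the column coordinates

    def write(self, stream: TextIO, variable_name: str, sample_rate: int = 1) -> None:
        # Keep every sample_rate-th row; columns are written in full.
        rows = getattr(self.calculator, self.row_axis)[::sample_rate]
        columns = getattr(self.calculator, self.column_axis)
        values = getattr(self.calculator, variable_name)[::sample_rate]
        stream.write("\t".join([self.row_axis] + [str(c) for c in columns]) + "\n")
        for coordinate, row in zip(rows, values):
            stream.write("\t".join(str(x) for x in [coordinate, *row]) + "\n")


class TVFieldResultsWriter(FieldResultsWriter):
    row_axis, column_axis = "temperature_array", "volume_array"


class TPFieldResultsWriter(FieldResultsWriter):
    row_axis, column_axis = "temperature_array", "pressure_array"


# A stand-in calculator exposing the property names used in this PR.
calculator = SimpleNamespace(
    temperature_array=[0.0, 100.0, 200.0, 300.0],
    volume_array=[10.0, 11.0],
    helmholtz_free_energies=[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]],
)
buffer = io.StringIO()
TVFieldResultsWriter(calculator).write(buffer, "helmholtz_free_energies", sample_rate=2)
```

Because sampling happens inside the writer, the calculator no longer needs separate sampled pressure or sampled temperature properties.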
Using the Pint package for unit handling
All properties mentioned above are now wrapped in the Pint unit wrapper. To get the original value, one needs to access .magnitude. The use of Pint may look a little weird, because I use a Singleton design pattern: Pint only allows units to be compared within the same instance of UnitRegistry. This is implemented in QHAUnit.
Internally, the units are based on the Rydberg / bohr system, so there should be no need for unit conversion; the magnitude of the data can be obtained through the .magnitude property.
When doing output, one needs to specify the unit by calling .to("m") or .to(units.m) first to do the unit conversion, and then access .magnitude.
Pint by default does not define the Rydberg unit, so I tried to define it. There are two ways of defining it, in terms of eV and in terms of J, and they give different results (as discussed in the following sections). Moreover, because the original precision is not very high, whereas Pint follows the standard given by NIST, I also need to define eV in terms of J. We should double check the precision, because we must either use high precision or maintain a precision consistent with the original code. There is still room for discussion.
PerFormulaUnit and PerMole for per formula unit and per mole properties
In the original code, heat capacities have the unit J/(mol·K); however, other energies could also be calculated per mole or per formula unit. It doesn't make sense for only the heat capacities to have this privilege, so I propose using PerFormulaUnit and PerMole for per-formula-unit and per-mole properties.
Performance
I have not observed a significant increase in calculation time after adding the abstraction layers. It may even have sped up a little.
Difference in results
The differences in the results are caused by the difference in the definition of the Rydberg unit. For the calculation of the silicon example:
When the Rydberg unit is defined in terms of electron volt:
$F$, $G$, $H$, $U$, $\alpha$ remain the same
$\beta_T$, $C_V$, $p$, $V$ remain the same in the first seven digits
$C_p$, $\beta_s$, $\gamma$ remain the same in the first five digits.
When the Rydberg unit is defined in terms of joule:
$G$, $H$, $U$, $\alpha$, $\beta_T$, $C_V$, $p$, $V$ remain the same
$F$ remains the same in the first seven digits
$C_p$, $\beta_s$, $\gamma$ remain the same in the first five digits.
I think these results are acceptable.
However, for the Ice VII example, $C_p$, $\beta_s$, $\gamma$ vary too much, while the other quantities remain OK. You should definitely try these out.
Change in settings.yaml
I propose that the output filenames should now be specified in settings.yaml to allow more flexibility. The units of the output could be specified alongside them.
Further work
We could rewrite the plotting part of the program.
The validation part of the code is implemented, but should be double checked.