wolverton-research-group / qmpy

A suite of computational materials science tools.
https://oqmd.org
MIT License
129 stars 45 forks source link

Hotfix/duplicated atoms #110

Closed tachyontraveler closed 3 years ago

tachyontraveler commented 3 years ago

This very long explanation for the bugfix is required here because the bug addressed here have been under the debugging stage for the past 1.5 years. Some of the unusual scenarios mentioned here can be useful for future bugfixes.

Fix for issues with

  1. duplicate atoms being added to the qmpy.calculation.input and qmpy.calculation.ouput structures, and
  2. calculation of stability for new calculations

Issue 1:

Several structures in OQMD had all or few of the atoms included twice in the structure.atoms.

Underlying Problem:

  1. Adding atoms to Structure.atom through Django related object atom_set.add() without rechecking for duplicates in the structure posed the potential for errors. This was called while saving the Structure. It may not have been an issue before with Django 1.6 but behavioral changes for Django.Model in Django 1.8 and 2.2 made it important. In addition to it, the bug wasn't captured while printing len(calculation.input.atoms) during the calculations because of the lazy execution nature of atom_site.add() function. Because, as mentioned before, it is a default related object made with QuerySets and associated functions (See here)

  2. In addition to that, the duplicated atoms had slight differences in coordinate values at their least significant decimal places (~ 6th to 15th decimal places for the situation in consideration here). So the Atom.__eq__ function determined them to be different.

Solution:

  1. Added a duplicate check before atom_set.add() and site.set.add()
  2. Modified Atom.__eq__ function to ignore very small differences on coords (if norm of differences < 1e-4)

Issue 2:

The stability value was computed for each converged static calculation but not saved in the DB.

Underlying Problem:

The stability value was first stored in the DB while calling ps.compute_stabilities. But they were overwritten later by calculation.formation object's stability value created inside calculation.get_formation. These overwriting calls came predominantly while saving the calculation object, which calls for self.formation.save() The difficulties in debugging during the past 1.5 years happened mainly because, the errors in the static calculation scripts in scripts.py did not print any exceptions even while encountering errors. It must've been due to the suppression of error messages on try-except with no error logging/printing actions.

Solution:

  1. Calculation.formation is no longer a class variable. Instead, it's a property whose values can be altered only by calling the related FormationEnergy object explicitly. This does not change the DB schema as the calculation.formation was not written previously as a column within the Calculation table anyway.
  2. FormationEnergy object(s) linked to Calculation is(are) not saved once again when the Calculation.save method is called. It doesn't have to be as the FormationEnergy objects are saved while calling ps.compute_stabilites(save=True)

Other commits

Fix to an issue with parsing output from ssh queries in Python3

tachyontraveler commented 3 years ago

Reformatted all the six files involved in this PR