Quantum-Accelerators / quacc

quacc is a flexible platform for computational materials science and quantum chemistry that is built for the big data era.
https://quantum-accelerators.github.io/quacc/
BSD 3-Clause "New" or "Revised" License
176 stars 48 forks

[Proposal] store a unique hash to identify calculations #2409

Open tomdemeyere opened 2 months ago

tomdemeyere commented 2 months ago

What new feature would you like to see?

This aims to implement a unique hash that can be used to identify calculations. The hash should depend on things like:

  1. The job being run.
  2. Rounded attributes of the Atoms object.
  3. All the parameters used to run the calculations.

The hash would then be stored in the results dict. The idea is that users should be able to do things like this:


@subflow
def run_all_calculations(atoms_list):
    for atoms in atoms_list:
        # Skip any calculation whose hash has already been recorded
        if unique_hash(my_job, atoms, parameters) not in already_done:
            my_job(atoms, parameters)...

This is based on discussion from #2399
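
For illustration, here is a minimal sketch of what such a hash could look like, using only the Python standard library. It is not quacc's implementation. The function name `unique_hash` comes from the snippet above, but its signature here (plain lists for the structure instead of an Atoms object, a `decimals` argument) is an assumption made so that the example is self-contained.

```python
import hashlib
import json


def unique_hash(job_name, symbols, positions, cell, pbc, parameters, decimals=4):
    """Hypothetical helper: hash a job name, rounded structure data, and parameters."""

    def rounded(rows):
        # Round to suppress floating-point noise; "+ 0.0" turns -0.0 into 0.0
        return [[round(float(x), decimals) + 0.0 for x in row] for row in rows]

    payload = {
        "job": job_name,
        "symbols": list(symbols),
        "positions": rounded(positions),
        "cell": rounded(cell),
        "pbc": [bool(p) for p in pbc],
        "parameters": parameters,
    }
    # sort_keys gives a canonical string, so dict ordering does not change the hash
    encoded = json.dumps(payload, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Example: an H2 molecule in a 5 Å box with illustrative parameters
h2_hash = unique_hash(
    "relax_job",
    symbols=["H", "H"],
    positions=[[0, 0, 0], [0, 0, 0.74]],
    cell=[[5, 0, 0], [0, 5, 0], [0, 0, 5]],
    pbc=[False, False, False],
    parameters={"xc": "PBE"},
)
```

Rounding makes the hash tolerant of small numerical noise, but two values that sit on either side of a rounding boundary will still hash differently, so the choice of `decimals` is a trade-off rather than a guarantee.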

Andrew-S-Rosen commented 2 months ago

I think this is an interesting idea (even though workflow tools often have similar hashing).

We already have hashing of the Atoms object. It would be possible to do the same for a collection of entries in the DB that define the job (e.g. parameters).
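
As a rough sketch of that second idea, the snippet below hashes only a chosen subset of fields from a stored result document. The field names `name` and `parameters`, the constant `JOB_DEFINING_KEYS`, and the function `job_definition_hash` are all assumptions for illustration, not part of quacc's API.

```python
import hashlib
import json

# Hypothetical choice of fields that "define" a job in a stored result document
JOB_DEFINING_KEYS = ("name", "parameters")


def job_definition_hash(doc, keys=JOB_DEFINING_KEYS):
    """Hash only the chosen fields of a result document, ignoring its outputs."""
    subset = {key: doc.get(key) for key in keys}
    # Canonical JSON so that key order in the document does not matter
    encoded = json.dumps(subset, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Two documents with the same job definition but different outputs
doc_a = {"name": "static_job", "parameters": {"xc": "PBE"}, "results": {"energy": -1.0}}
doc_b = {"parameters": {"xc": "PBE"}, "name": "static_job", "results": {"energy": -2.0}}
same_job = job_definition_hash(doc_a) == job_definition_hash(doc_b)
```

Combining this value with the existing Atoms hash would give one identifier per (structure, job definition) pair.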