theochem / AtomDB

An Extended Periodic Table of Neutral and Charged Atomic Species
http://atomdb.qcdevs.org/
GNU General Public License v3.0

[Release] Update RUN/COMPILE functions for the datasets #18

Open gabrielasd opened 8 months ago

gabrielasd commented 8 months ago

A) The "run" functions for each dataset (especially HCI) should be checked and brought up to date, and it should be possible to run each one as a script on Compute Canada. The "compile" functions should also be kept up to date, so that they compute all of the available properties from the raw data.

B) Finally, once the API and the list of properties are finalized, and before the release, all of the currently available datasets should be run and compiled, and the resulting .msg files included in the GitHub repo and in the library itself.

gabrielasd commented 8 months ago

This was issue 37 ported from the QuantumElephant repo.

Part B) of this issue overlaps with issue #8.

@msricher, for part A) my question is whether this is a feature for the current version of atomdb, where the compile and run functions are more or less the same thing, or for the future version described in the GSoC proposal. Specifically, I mean the part about being able to run it as a script on Compute Canada.

My confusion comes from remembering that initially, for the HCI database, we had a script function (run) that ran the jobs on Compute Canada, and then a compile function that processed the data, or something along those lines. Now there is only the task of processing the raw data, where the compile function calls the run function.

At some point we removed the requirement that atomdb run the jobs that generate the raw data. I think there were two reasons: one was that the databases project had this purpose, and the other was that, of the 5 datasets, only the HCI jobs would have this separation of tasks (although we could do the same for the Gaussian ones). I think there was also a discussion about whether we would keep track of different versions/updates of the datasets.
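
To make the two layouts concrete, here is a minimal sketch of what a run/compile separation could look like. Everything in it is an assumption for illustration only: the names `run` and `compile_species`, the `call_run` flag, the file naming, and the placeholder energy are hypothetical and are not atomdb's actual API or data. JSON stands in for the .msg files so the sketch needs only the standard library.

```python
import json
from pathlib import Path


def run(elem: str, charge: int, raw_dir: Path) -> Path:
    """Hypothetical 'run' step: produce the raw data for one species.

    A real version would execute or submit the calculation (e.g. on a
    cluster). Here it only writes a placeholder so the sketch is runnable.
    """
    raw_path = raw_dir / f"{elem}_{charge:+d}.raw.json"
    if not raw_path.exists():  # skip species whose raw data already exists
        # Placeholder value, not a real energy.
        raw = {"elem": elem, "charge": charge, "energy": -1.0}
        raw_path.write_text(json.dumps(raw))
    return raw_path


def compile_species(elem: str, charge: int, raw_dir: Path, out_dir: Path,
                    call_run: bool = True) -> Path:
    """Hypothetical 'compile' step: turn raw data into a database record.

    call_run=True mimics the layout where compile calls run.
    call_run=False mimics the separated layout, where the raw data must
    already have been generated by an earlier, independent run step.
    """
    raw_path = raw_dir / f"{elem}_{charge:+d}.raw.json"
    if call_run:
        raw_path = run(elem, charge, raw_dir)
    elif not raw_path.exists():
        raise FileNotFoundError(f"no raw data for {elem} {charge:+d}; run it first")

    raw = json.loads(raw_path.read_text())
    # Only copy fields here; a real compile step would derive more properties.
    record = {"elem": raw["elem"], "charge": raw["charge"], "energy": raw["energy"]}
    out_path = out_dir / f"{elem}_{charge:+d}.json"
    out_path.write_text(json.dumps(record))
    return out_path
```

With `call_run=False`, the run step can be launched separately as a batch script and compile only fails if the raw file is missing. With `call_run=True`, the behaviour matches the layout where compile calls run.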