petosa / mongo_qcdb

MongoDB backend for storing quantum chemical databases

Databases #2

Open dgasmith opened 7 years ago

dgasmith commented 7 years ago

Let's first focus on the interior spec:

    {
      "name": "cool reaction",
      "molecules": ["8e102b34c4441c4b164a7d678591df550c90de74", "dbbacd78247e7b39ee5cb8e78d74423e98639203"],
      "coefficients": [1.0, 1.2]
    },
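
A minimal sketch of how a reaction entry like this might be evaluated, assuming `molecules` and `coefficients` are parallel lists and that a per-molecule quantity is available by hash. The `energies` lookup, its values, and the `reaction_value` helper are hypothetical and not part of the repository:

```python
# Hypothetical per-molecule values keyed by molecule hash (illustrative numbers only).
energies = {
    "8e102b34c4441c4b164a7d678591df550c90de74": -1.5,
    "dbbacd78247e7b39ee5cb8e78d74423e98639203": 2.0,
}

reaction = {
    "name": "cool reaction",
    "molecules": [
        "8e102b34c4441c4b164a7d678591df550c90de74",
        "dbbacd78247e7b39ee5cb8e78d74423e98639203",
    ],
    "coefficients": [1.0, 1.2],
}


def reaction_value(reaction, energies):
    """Sum coefficient * value over the reaction's molecules (assumed parallel lists)."""
    molecules = reaction["molecules"]
    coefficients = reaction["coefficients"]
    if len(molecules) != len(coefficients):
        raise ValueError("molecules and coefficients must have the same length")
    return sum(c * energies[m] for m, c in zip(molecules, coefficients))


value = reaction_value(reaction, energies)
```

The length check matters with this layout because nothing in the JSON itself ties a coefficient to its molecule other than position.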

We will need at least the following:

Lori, should we split methods out into methods and basis? I hope we can keep everything in pages completely separate (e.g. no CBS computations there), but I'm not quite sure that's possible.

petosa commented 7 years ago

Added in 9c46631b36be2fe62bb8be75d763fc80de87a39b

loriab commented 7 years ago

Right now, subsets as strings are members of reactions. Did you have that inversion in mind, @dgasmith, or should subsets be defined directly in the database as below (or through hashes of rxns)? The second example is for a subset with numerical instead of categorical indexing.

    {
      "name": "HB",
      "reactions": ["1", "3", "22"]
    },
    {
      "name": "equil",
      "reactions": {"2": 1.0, "4": 1.2}
    }

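
A rough sketch of how code reading these subsets could treat both forms uniformly, assuming a list means plain (categorical) membership and a dict maps each reaction to a numerical value. The `split_subset` helper is hypothetical:

```python
hb = {"name": "HB", "reactions": ["1", "3", "22"]}
equil = {"name": "equil", "reactions": {"2": 1.0, "4": 1.2}}


def split_subset(subset):
    """Return (member reaction names, per-reaction numerical values).

    A list of reactions yields an empty value mapping; a dict yields
    its keys as members and a copy of itself as the value mapping.
    """
    reactions = subset["reactions"]
    if isinstance(reactions, dict):
        return list(reactions), dict(reactions)
    return list(reactions), {}


hb_members, hb_values = split_subset(hb)
equil_members, equil_values = split_subset(equil)
```
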
loriab commented 7 years ago

Realized my numerical index is what Daniel was calling attributes. Can leave that unresolved for now.

Maybe "stoichiometry" instead of "coefficients", just to be less generic.

Here is a slightly more complicated coeff/stoich layout, for the case where we plan on several molecules being generated at database build time:

    "stoichiometry": {
      "default": {"mol1hash": 1, "mol2hash": -1, "mol3hash": -1},
      "cp": {"mol1hash": 1, "mol2Mhash": -1, "mol3Mhash": -1},
      "sapt": {"mol1hash": 1}
    }

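
A minimal sketch of evaluating a reaction under each stoichiometry option, assuming every option maps molecule hashes to coefficients. The `mol...hash` keys are the placeholders from the layout above; the `energies` values and the `evaluate` helper are hypothetical:

```python
# Hypothetical per-molecule values keyed by the placeholder hashes.
energies = {
    "mol1hash": -10.0,
    "mol2hash": -4.0,
    "mol3hash": -5.0,
    "mol2Mhash": -4.5,
    "mol3Mhash": -5.25,
}

stoichiometry = {
    "default": {"mol1hash": 1, "mol2hash": -1, "mol3hash": -1},
    "cp": {"mol1hash": 1, "mol2Mhash": -1, "mol3Mhash": -1},
    "sapt": {"mol1hash": 1},
}


def evaluate(stoich, energies):
    """Sum coefficient * value for one stoichiometry option."""
    return sum(coeff * energies[mol] for mol, coeff in stoich.items())


results = {mode: evaluate(stoich, energies) for mode, stoich in stoichiometry.items()}
```

Keying each option by hash lets `default` and `cp` point at different molecules without relying on a shared ordering.
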
petosa commented 7 years ago

@loriab Why not use a vector for each field of stoichiometry? Like below:

    "stoichiometry": {
      "default": [1, -1, 1],
      "cp": [1, -1, 1],
      "sapt": [1]
    }

--EDIT-- In the meantime, I replaced coefficients with stoichiometry as per Lori's suggestion: 33ec7a736bad6859aaeb2200ccc55e852f08c61e

loriab commented 7 years ago

Because they don't necessarily map to the same molecules, e.g., mol2hash vs. mol2Mhash.

petosa commented 7 years ago

Ok, in that case I will leave it as is. Just to confirm, the correct spelling is stoichiometry, correct? Also, is the first key of the cp field supposed to be mol1hash or mol1Mhash?

loriab commented 7 years ago

Spelling is correct. I just meant the mol...hash as placeholders, though usually all the stoich options will share the same first item. Do we need to do something extra for those dicts to be ordered dicts? Or is that a given in JSON?

petosa commented 7 years ago

Yes, in order to load the data into an ordered dict some work has to be done on the Python side. It's this line from main.py:

    data = json.loads(json_data, object_pairs_hook=OrderedDict)

Here, the JSON is loaded as an OrderedDict. There is nothing we need to explicitly say in the JSON to enforce the ordered dict.
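
A self-contained sketch of that loading step, assuming a small inline JSON string in place of the real database file (the `json_data` contents here are illustrative only):

```python
import json
from collections import OrderedDict

# Illustrative stand-in for the database JSON; keys are placeholders.
json_data = """
{
  "stoichiometry": {
    "default": {"mol1hash": 1, "mol2hash": -1, "mol3hash": -1},
    "sapt": {"mol1hash": 1}
  }
}
"""

# object_pairs_hook receives each JSON object's (key, value) pairs in
# document order, so every nested object becomes an OrderedDict.
data = json.loads(json_data, object_pairs_hook=OrderedDict)

default = data["stoichiometry"]["default"]
keys = list(default)
```

On Python 3.7 and later a plain `dict` also preserves insertion order, so `json.loads` without the hook keeps the document order too; passing `OrderedDict` makes the intent explicit and covers older interpreters. JSON itself defines objects as unordered, so other tools that read or rewrite the file may not preserve key order.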

loriab commented 7 years ago
petosa commented 7 years ago

Made changes: c90536d6b604dd43dd33208405272c4eca055f06