petosa / mongo_qcdb

MongoDB backend for storing quantum chemical databases

Molecule #5

Open loriab opened 7 years ago

loriab commented 7 years ago

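Questions for @dgasmith,

1) When the natural/common state of an atom is real, what do you say to changing "ghost": [false, false, true] to "real": [true, true, false]? Or "isreal" or "corporeal"?
2) Right now fragments are 1-indexed, e.g., (CO_2)_2 is "fragments": [[1, 2, 3], [4, 5, 6]]. I'm ok with that (and prefer it for user-facing). But shall we decide now whether the mongoDB layer is user-facing/1-indexed or internal-facing/0-indexed? I slightly favor the latter.
3) There's a third state for fragments in Psi4 Molecules: Absent. As in molpart = trimer.extract_subsets(1, 3) has fragment states ['Real', 'Absent', 'Ghost']. Such absent atoms are currently filtered out of the geometries, fragments, etc. before they hit mongo_qcdb. Agreed to remain on that course?

Changes for @petosa,

1) Need a field "fragment_charges" that is a list of floats of length length-of-"fragments".
2) Ditto "fragment_multiplicities".
3) Add those fields to the molecule hash.

General

1) Convention for multiword keys like the above "fragment_charges"? This is our first. Stick with snake_case?
2) Below are json for a couple molecules. The hashes won't match until some of the above questions get resolved. Then I'll regenerate them, and we can try this out.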

JSONEncoded S22-16-monoB-CP as 3314127543f6297942311fef82fc368747700ed8
{"comment": "Ethine from Ethene-Ethine Complex", "ghost": [true, true, true, true, true, true, false, false, false, false], "name": "S22-16-monoB-CP", "geometry": [[3.86235105939066e-17, -1.2615395923395314, -4.021632563049252], [-3.86235105939066e-17, 1.2615395923395314, -4.021632563049252], [1.7453907405819704, -2.328620696426732, -4.024516285128035], [-1.7453907405819702, -2.328620696426732, -4.024516285128035], [-1.7453907405819704, 2.328620696426732, -4.024516285128035], [1.7453907405819702, 2.328620696426732, -4.024516285128035], [0.0, 0.0, 5.4745473903346324], [0.0, 0.0, 3.193150949968712], [0.0, 0.0, 1.1789145416394995], [0.0, 0.0, 7.48413129292468]], "multiplicity": 1, "masses": [12, 12, 1.00782503207, 1.00782503207, 1.00782503207, 1.00782503207, 12, 12, 1.00782503207, 1.00782503207], "symbols": ["C", "C", "H", "H", "H", "H", "C", "C", "H", "H"], "charge": 0, "fragment_multiplicities": [1, 1], "fragments": [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10]], "fragment_charges": [0, 0]}

JSONEncoded S22-16-monoB-unCP as bad99347a34c0715d3fdd9dc1e83e9eac365acf3
{"comment": "Ethine from Ethene-Ethine Complex", "ghost": [false, false, false, false], "name": "S22-16-monoB-unCP", "geometry": [[0.0, 0.0, 1.140878454454801], [0.0, 0.0, -1.140517985911119], [0.0, 0.0, -3.1547543942403315], [0.0, 0.0, 3.150462357044849]], "multiplicity": 1, "masses": [12, 12, 1.00782503207, 1.00782503207], "symbols": ["C", "C", "H", "H"], "charge": 0, "fragment_multiplicities": [1], "fragments": [[1, 2, 3, 4]], "fragment_charges": [0]}
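
The per-fragment fields requested in this comment ("fragment_charges" and "fragment_multiplicities", each with one entry per fragment) can be checked with a few lines of Python. This is only a minimal sketch: the function name check_fragment_fields is hypothetical and is not part of mongo_qcdb, and it assumes a molecule is a plain dict shaped like the json above.

```python
def check_fragment_fields(mol):
    """Return a list of problems found in the per-fragment fields of a molecule dict.

    Hypothetical helper for illustration; not part of mongo_qcdb.
    """
    problems = []
    nfrag = len(mol["fragments"])
    for key in ("fragment_charges", "fragment_multiplicities"):
        if key not in mol:
            problems.append(f"missing field: {key}")
        elif len(mol[key]) != nfrag:
            # Each per-fragment list needs exactly one entry per fragment.
            problems.append(f"{key} has {len(mol[key])} entries, expected {nfrag}")
    return problems


# Trimmed-down version of the unCP monomer above (one fragment of four atoms).
example = {
    "symbols": ["C", "C", "H", "H"],
    "fragments": [[1, 2, 3, 4]],
    "fragment_charges": [0],
    "fragment_multiplicities": [1],
}
print(check_fragment_fields(example))
```

An empty list means the per-fragment fields are present and have the right length.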
dgasmith commented 7 years ago

1) "real" is fine.
2) I'm very much in favor of 0-indexing to align with most Python/C++ array operations.
3) Agreed, mongo_qcdb should not know about "Absent" atoms.

I'm partial to snake case myself. i_really_dont_like_camel_case

Cheers, Daniel Smith

On Dec 13, 2016, at 18:59, Lori A. Burns notifications@github.com wrote:

Questions for @dgasmith https://github.com/dgasmith,

1) When the natural/common state of an atom is real, what do you say to changing "ghost": [false, false, true] to "real": [true, true, false]? Or "isreal" or "corporeal"?
2) Right now fragments are 1-indexed, e.g., (CO_2)_2 is "fragments": [[1, 2, 3], [4, 5, 6]]. I'm ok with that (and prefer it for user-facing). But shall we decide now whether the mongoDB layer is user-facing/1-indexed or internal-facing/0-indexed? I slightly favor the latter.
3) There's a third state for fragments in Psi4 Molecules: Absent. As in molpart = trimer.extract_subsets(1, 3) has fragment states ['Real', 'Absent', 'Ghost']. Such absent atoms are currently filtered out of the geometries, fragments, etc. before they hit mongo_qcdb. Agreed to remain on that course?

Changes for @petosa https://github.com/petosa,

1) Need a field "fragment_charges" that is a list of floats of length length-of-"fragments".
2) Ditto "fragment_multiplicities".
3) Add those fields to the molecule hash.

General

1) Convention for multiword keys like the above "fragment_charges"? This is our first. Stick with snake_case?
2) Below are json for a couple molecules. The hashes won't match until some of the above questions get resolved. Then I'll regenerate them, and we can try this out.


loriab commented 7 years ago

Ok, here are the json and hashes from the above decisions.

JSONEncoded S22-16-monoB-CP as ee8a945588420d82a4d14e677ee16af39bbe782d
{"comment": "Ethine from Ethene-Ethine Complex", "real": [false, false, false, false, false, false, true, true, true, true], "name": "S22-16-monoB-CP", "geometry": [[3.86235105939066e-17, -1.2615395923395314, -4.021632563049252], [-3.86235105939066e-17, 1.2615395923395314, -4.021632563049252], [1.7453907405819704, -2.328620696426732, -4.024516285128035], [-1.7453907405819702, -2.328620696426732, -4.024516285128035], [-1.7453907405819704, 2.328620696426732, -4.024516285128035], [1.7453907405819702, 2.328620696426732, -4.024516285128035], [0.0, 0.0, 5.4745473903346324], [0.0, 0.0, 3.193150949968712], [0.0, 0.0, 1.1789145416394995], [0.0, 0.0, 7.48413129292468]], "multiplicity": 1, "masses": [12, 12, 1.00782503207, 1.00782503207, 1.00782503207, 1.00782503207, 12, 12, 1.00782503207, 1.00782503207], "symbols": ["C", "C", "H", "H", "H", "H", "C", "C", "H", "H"], "charge": 0, "fragment_multiplicities": [1, 1], "fragments": [[0, 1, 2, 3, 4, 5], [6, 7, 8, 9]], "fragment_charges": [0, 0]}

JSONEncoded S22-16-monoB-unCP as 5706ace4ec052046fc7038a7b9f074f61967c593
{"comment": "Ethine from Ethene-Ethine Complex", "real": [true, true, true, true], "name": "S22-16-monoB-unCP", "geometry": [[0.0, 0.0, 1.140878454454801], [0.0, 0.0, -1.140517985911119], [0.0, 0.0, -3.1547543942403315], [0.0, 0.0, 3.150462357044849]], "multiplicity": 1, "masses": [12, 12, 1.00782503207, 1.00782503207], "symbols": ["C", "C", "H", "H"], "charge": 0, "fragment_multiplicities": [1], "fragments": [[0, 1, 2, 3]], "fragment_charges": [0]}
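
For reference, the two schema changes agreed on in this thread ("ghost" flags inverted into "real" flags, and 1-indexed fragments shifted to 0-indexed) amount to a small transformation of the molecule dict. The sketch below is an illustration only: the function name ghost_to_real_schema is hypothetical, and this is not necessarily how the json above was regenerated.

```python
import copy


def ghost_to_real_schema(old):
    """Convert an old-style molecule dict to the schema agreed on in this thread.

    Hypothetical helper for illustration; not part of mongo_qcdb.
    """
    new = copy.deepcopy(old)  # leave the caller's dict untouched
    # "ghost": [false, false, true] becomes "real": [true, true, false].
    new["real"] = [not g for g in new.pop("ghost")]
    # Fragment atom indices go from 1-indexed to 0-indexed.
    new["fragments"] = [[i - 1 for i in frag] for frag in new["fragments"]]
    return new


# Only the two affected fields of S22-16-monoB-CP, for brevity.
old = {
    "ghost": [True] * 6 + [False] * 4,
    "fragments": [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10]],
}
new = ghost_to_real_schema(old)
print(new["real"])
print(new["fragments"])
```

All other fields (geometry, masses, symbols, charges) pass through unchanged.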
petosa commented 7 years ago

08b6d15b3831bbe9170b0c9bc9e9c0291ab98912 Added those two fields to the molecule json and to its hash calculation. (Unrelated: am I supposed to add provenance to the databases json?)
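
To illustrate why adding fields to the hash calculation changes every molecule hash, here is a rough sketch of a content hash over selected molecule fields. Everything in it is an assumption made for illustration: the name molecule_hash, the HASH_FIELDS tuple, and the choice of SHA-1 over a sorted-key JSON dump are not taken from the mongo_qcdb source, so it will not reproduce the hashes quoted in this thread.

```python
import hashlib
import json

# Assumed set of hashed fields; the real list in mongo_qcdb may differ.
HASH_FIELDS = (
    "symbols", "masses", "name", "charge", "multiplicity", "real",
    "geometry", "fragments", "fragment_charges", "fragment_multiplicities",
)


def molecule_hash(mol, fields=HASH_FIELDS):
    """Return a 40-character hex digest over the selected fields of a molecule dict.

    Hypothetical helper for illustration; not the mongo_qcdb implementation.
    """
    subset = {key: mol[key] for key in fields}
    # Sorting keys makes the digest independent of dict insertion order.
    payload = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()
```

Because the per-fragment fields are part of the hashed payload, two molecules that differ only in "fragment_charges" get different hashes under this scheme.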

loriab commented 7 years ago

am I supposed to add provenance to the databases json?

yes, please