doyle-lab-ucla / auto-qchem

Auto-QChem is an automated workflow for the generation and storage of DFT calculations for organic molecules.
https://doyle-lab-ucla.github.io/auto-qchem/
GNU General Public License v3.0
88 stars, 18 forks

How to get all the features provided in your previous paper #31

Closed a09358872999 closed 6 months ago

a09358872999 commented 7 months ago

Our lab has Gaussian 16, so I ran opt, freq, and TD calculations using the Gaussian gjf file settings you provided (I changed only the molecule and left the settings unchanged).

So far, I have used the gaussian_log_extractor you provided and successfully obtained an intermediate result.

What else needs to be done to convert it into a complete feature DataFrame (or CSV)?

I would like the molecule features to look like this file: https://github.com/b-shields/edbo/blob/master/experiments/data/aryl_amination/aryl_halide_dft.csv

My current result is a dict like the one below:

{
    'descriptors': {'number_of_atoms': 12, 'charge': 0, 'multiplicity': 1......},
    'atom_descriptors': {'X': [-2.241815, -0.719692, 0.0, 1.460496, 2.......},
    'modes': {'Frequencies': [99.7815, 166.9938, 244.151, 347.9461, 393.1......},
    'mode_vectors': {'mode_number': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,......},
    'transitions': {'ES_transition': [161.05, 154.99, 154.72, 134.54, 132.51, 127.49, 127.4......},
    'labels': ['C', 'C', 'C', 'C', 'N', 'H', 'H', 'H', 'H', 'H', 'H', 'H']
}

Previous paper by your group: Bayesian reaction optimization as a tool for chemical synthesis

Corresponding GitHub: https://github.com/b-shields/edbo

Thank you for your kind assistance.

beef-broccoli commented 6 months ago

What you are asking is more of a pandas question. There are lots of ways to do it, but something like this would probably work: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.from_dict.html
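
As a rough illustration of the `DataFrame.from_dict` route, here is a minimal sketch. It assumes the extractor output has the shape shown in the question: a `descriptors` dict of scalars, an `atom_descriptors` dict of equal-length per-atom lists, and a `labels` list. The sample values, the molecule name, and the helper names `molecule_row` and `atom_table` are made up for the example and are not part of auto-qchem. How the columns in the target CSV were selected is not covered in this thread, so that step is left out.

```python
import pandas as pd

# Hypothetical, shortened stand-in for the extractor output shown in the question.
data = {
    "descriptors": {"number_of_atoms": 3, "charge": 0, "multiplicity": 1},
    "atom_descriptors": {"X": [0.0, 0.76, -0.76], "Y": [0.0, 0.59, 0.59]},
    "labels": ["O", "H", "H"],
}


def molecule_row(name, data):
    """Return a one-row DataFrame of molecule-level descriptors, indexed by name."""
    return pd.DataFrame.from_dict({name: data["descriptors"]}, orient="index")


def atom_table(data):
    """Return one row per atom, with the element label as the first column."""
    df = pd.DataFrame.from_dict(data["atom_descriptors"])
    df.insert(0, "label", data["labels"])
    return df


mol_df = molecule_row("water", data)   # "water" is an illustrative name
atom_df = atom_table(data)

# For several molecules, the one-row frames can be stacked and written out, e.g.:
# pd.concat([molecule_row(n, d) for n, d in results.items()]).to_csv("features.csv")
```

Molecule-level descriptors end up as one row per molecule, which is the general layout of a feature CSV. Atom-level values would still need to be reduced to named columns (for example, by picking specific atoms) before they fit into that single row.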