OpenChemistry / chemicaljson

Development of the Chemical JSON data representation

Addition of benzeneselenol g09 output cjson #1

Closed Schamnad closed 7 years ago

Schamnad commented 8 years ago

I have created a cjson file from the output file of a Gaussian09 opt-freq job of a Benzeneselenol input molecule. All the output data has been parsed using the cclib library script 'ccget'.

stuchalk commented 8 years ago

I like the molecule data in the file but not the CC processing info. Including a reference to where the CC processing info is would be appropriate but not the data itself... Thoughts?

ghutchis commented 8 years ago

Overall, I think this is a good start, but there are definitely some tweaks needed.

cryos commented 8 years ago

The Avogadro CJSON writer outputs some of this data. I will add an example of its output so we can see where we got to there too. It was used in the web-based work we did last year, which also offered cubes in JSON that were visualized in 3Dmol.js.

cryos commented 8 years ago

So pull request #2 shows some of our previous work, although I should find a smaller example molecule as it is a little large. This has been used in a functional web application, and was the result of work we did last year

Schamnad commented 8 years ago

Thank you for your comments. The point I forgot to mention is that this benzeneselenol.cjson file was created as a proof of concept (POC) for writing the attributes that cclib could parse from the sample benzeneselenol.out file into the cjson format. This was done with the GSoC project of integrating cclib with Avogadro in mind, where the data would flow from cclib to Avogadro. The same benzeneselenol.out file was used to write the examples documentation in the cclib.github.io repo.

The workflow followed was as follows: Output file -> cclib -> find the different attributes that cclib can parse -> parse the data using ccget -> write the cjson file manually.
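The last step of that workflow (writing the cjson file by hand) could be sketched in Python roughly as below. This is only an illustration under stated assumptions: the `parsed` dictionary stands in for whatever ccget returned, its values are placeholders rather than numbers from benzeneselenol.out, and the helper names `to_jsonable` and `write_cjson_text` are hypothetical and not part of cclib or Avogadro.

```python
import json


def to_jsonable(value):
    """Convert array-like values (anything exposing .tolist()) to plain lists."""
    if hasattr(value, "tolist"):
        return value.tolist()
    return value


def write_cjson_text(parsed):
    """Serialize parsed attributes to JSON text, using the attribute names as keys."""
    jsonable = {name: to_jsonable(value) for name, value in parsed.items()}
    return json.dumps(jsonable, indent=2, sort_keys=True)


# Placeholder values for illustration only; NOT taken from benzeneselenol.out.
parsed = {"natom": 2, "charge": 0, "mult": 1, "atomnos": [1, 1]}

cjson_text = write_cjson_text(parsed)
print(cjson_text)
```

The `.tolist()` check is there because parsed numerical data is often held in array objects that the standard `json` module cannot serialize directly.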

The different attributes that can be parsed can be found by using the ccwrite script in cclib.

This list indicates which of the available attributes have been added and which have been omitted:

atomcharges - done
atomcoords - done
atomnos - done
charge - done
coreelectrons - not
enthalpy - done
geotargets - not
geovalues - not
grads - not
homos - done
moenergies - not
mosyms - not
mult - done
natom - done
nbasis - not
nmo - not
optdone - done
optstatus - done
scfenergies - done
scftargets - not
temperature - done
vibdisps - not
vibfreqs - done
vibirs - not
vibsyms - done

As seen from the above list, I haven't included all the properties that are available. The ccwrite script of cclib can't write all the attributes that are available, so I parsed them using the ccget script and wrote the cjson file manually. The names of the properties, e.g. "atomcharges", have been kept the same as the "attribute" used to parse the data, so that there is a 1-1 correspondence between cclib and the cjson format.
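To make the 1-1 correspondence concrete, here is a minimal Python sketch of selecting only the attributes marked "done" in the list, with the key names left unchanged. The `INCLUDED` set is copied from that list; the function name `select_included` and the sample values are hypothetical and only for illustration.

```python
# Attributes marked "done" in the list; every other attribute was omitted.
INCLUDED = {
    "atomcharges", "atomcoords", "atomnos", "charge", "enthalpy", "homos",
    "mult", "natom", "optdone", "optstatus", "scfenergies", "temperature",
    "vibfreqs", "vibsyms",
}


def select_included(parsed):
    """Keep only the included attributes; key names stay identical to cclib's."""
    return {name: value for name, value in parsed.items() if name in INCLUDED}


# Placeholder values for illustration only; NOT taken from benzeneselenol.out.
parsed = {"natom": 2, "nbasis": 10, "charge": 0, "grads": [[0.0, 0.0, 0.0]]}

selected = select_included(parsed)
print(sorted(selected))
```

Because the keys are not renamed, a reader of the file can look up each entry directly under the same name in the cclib attribute documentation.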

If the benzeneselenol.cjson file needs to be rewritten so that it is independent of cclib, I can update the pull request accordingly.

cryos commented 8 years ago

I think having a mapping is ideal, but we likely wouldn't keep the same key names. The vib-prefixed names would quite happily live in a vibrations object, for example, with displacements, frequencies, etc. below it.
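A mapping of that kind might look like the Python sketch below. It is an assumption-laden illustration, not an agreed schema: only the `vibrations` object with `frequencies` and `displacements` comes from the comment above, while the `intensities` and `symmetries` key names, the `KEY_MAP` table, and the `remap` function are hypothetical.

```python
# Flat cclib-style attribute name -> path of keys in the nested document.
KEY_MAP = {
    "vibfreqs": ("vibrations", "frequencies"),
    "vibdisps": ("vibrations", "displacements"),
    "vibirs": ("vibrations", "intensities"),   # hypothetical key name
    "vibsyms": ("vibrations", "symmetries"),   # hypothetical key name
}


def remap(flat):
    """Move mapped attributes under nested objects; leave unmapped names as-is."""
    nested = {}
    for name, value in flat.items():
        path = KEY_MAP.get(name, (name,))
        node = nested
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return nested


# Placeholder values for illustration only; NOT taken from benzeneselenol.out.
flat = {"natom": 2, "vibfreqs": [100.0], "vibsyms": ["A"]}

nested = remap(flat)
print(nested)
```

Keeping the table separate from the traversal means the key names can be changed later without touching the code that builds the nested object.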

Schamnad commented 8 years ago

I have updated the pull request with a new commit. Features of the commit are

For more information on the data types included, you can look at the descriptions here

cryos commented 7 years ago

Closing this pull request, will concentrate on the second as an example from GSoC.