Closed joegair closed 1 year ago
Yes, all_geom does not include the last iteration. I think it would be easy enough to change it to include all geometries.
We'd like to do more with the documentation. Whether we change it or keep it how it is, it should be explained better.
The file parsers are not written with compound jobs in mind. Compound job outputs are generally more difficult to parse, and it is harder to figure out which level of theory was used to compute what. For example, in the job you provided you likely have fr["frequency"], even though the frequencies were computed for job 1. We did not want to spend time figuring out how compound job output files should be parsed for every QM software we support. Instead, we wanted to automate these types of jobs in a more software-agnostic way. For example:
from AaronTools.geometry import Geometry
from AaronTools.theory import Theory, OptimizationJob, FrequencyJob
from AaronTools.job_control import SubmitProcess
# different levels of theory to use
theory1 = Theory(
    method="M06-2X",
    basis="6-31+G**",
    job_type=[OptimizationJob(), FrequencyJob()],
    processors=4,
    memory=4,
)
theory2 = Theory(
    method="M06-2X",
    basis="6-311+G**",
    job_type="energy",  # optional, as ORCA calculates energy by default
    processors=4,
    memory=4,
)
theory3 = Theory(
    method="B3LYP",
    basis="def2-TZVP",
    job_type="energy",
    processors=4,
    memory=4,
)
mol = Geometry.from_string("water", form="iupac")
# run optimization job and wait for it to finish
mol.write(outfile="opt-freq.inp", theory=theory1)
proc1 = SubmitProcess("opt-freq.inp", 2, 4, 4)
print("submitting opt+freq")
proc1.submit(wait=30)
print("opt+freq done")
# read the output structure and run single points
optimized_mol = Geometry("opt-freq.out")
print("submitting SPs")
optimized_mol.write(outfile="M06-2X_SP.inp", theory=theory2)
proc2 = SubmitProcess("M06-2X_SP.inp", 2, 4, 4)
proc2.submit()
optimized_mol.write(outfile="B3LYP_SP.inp", theory=theory3)
proc3 = SubmitProcess("B3LYP_SP.inp", 2, 4, 4)
proc3.submit()
This will run three separate jobs, one for each level of theory. The last two use the optimized structure from the first. There were plans to make this look less cumbersome, but I'm not sure where that's at.
I think all_geom should include the last geometry.
Thanks so much for providing the example. That helps a lot.
I've added the final structure and data to fr.all_geom, but I also changed all_geom items from a list to a dictionary. So if you wanted to get all energies and structures, it could look like this:
for item in fr.all_geom:
    print(item["data"]["energy"])
    geom = Geometry(item["atoms"])
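As a rough sketch of how the dictionary layout could be used to pull out every energy in order, here is a small helper that runs without AaronTools. The name collect_energies and the mock data are hypothetical. The only thing taken from the snippet above is that each item has a "data" dictionary with an "energy" key. Skipping items that lack an energy is a defensive assumption on my part, not documented behavior.

```python
def collect_energies(all_geom):
    """Return the energy of every step that reports one, in file order.

    Assumes each item is a dict with a "data" dict, as in the loop above.
    Items without an "energy" entry are skipped (an assumption, in case
    some steps carry no energy).
    """
    energies = []
    for item in all_geom:
        data = item.get("data", {})
        if "energy" in data:
            energies.append(data["energy"])
    return energies


# Hypothetical stand-in for fr.all_geom; real items would hold Atom lists
mock_all_geom = [
    {"atoms": [], "data": {"energy": -76.40}},
    {"atoms": [], "data": {"energy": -76.42}},
    {"atoms": [], "data": {}},
    {"atoms": [], "data": {"energy": -76.43}},
]

energies = collect_energies(mock_all_geom)
print(energies)
```

With a real output file you would pass fr.all_geom instead of the mock list. The last entry of the result would then be the final energy, now that the final structure is included.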
Does fr.all_geom access all geometries except the final geometry? The following ORCA job has 5 geometries and fr.all_geom accesses four of them.
This became confusing when parsing outputs of compound jobs, e.g.
In this example:
It would have helped me if the documentation said something like fr.all_geom returns the attributes of all geometries except the final geometry in the output.
(In the %compound job above, this is complicated by the fact that the last three single point energies in the file all correspond to different calculations on the same geometry.)