SEDenmarkLab / molli_firstgen

In silico library generation tool box

Creating a More Robust Orca Driver/Interface for Workstations #25

Open blakeo2 opened 2 years ago

blakeo2 commented 2 years ago

The goal of this pull request is to make Orca easy to run on the workstations. It makes four major changes:

  1. A class that handles Orca outputs, including the ".out" and ".hess" files (found in molecule.py).

  2. A flexible function that submits Orca jobs to the command line through the creation of a full Orca output file that includes the xyz coordinates (found in orca.py).

  3. A dataclass that stores only the minimum amount of data from the output file, to avoid taking up too much memory, and that can recognize whether Orca terminated normally (found in _core.py).

  4. A system for recognizing whether a calculation has already been completed (i.e. the backup system). It checks whether an output file "terminated normally" or needs to be resubmitted. It also checks whether a hessian exists in the folder and, if not, resubmits the calculation. This needs additional work: it only checks whether the molecule name is in the backup directory, not whether the same calc type has been submitted, which could cause problems if people submit different types of jobs to the same backup folder.
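One way to address the limitation noted in item 4 is to key the backup lookup on both the molecule name and the calc type. The following is only a rough sketch, not the code in this pull request: the function names, the `<mol_name>_<calc_type>` file naming scheme, and the `require_hess` flag are all hypothetical. It relies on the "ORCA TERMINATED NORMALLY" marker that Orca prints near the end of a successful output.

```python
from pathlib import Path

# Marker Orca prints near the end of a successful run.
NORMAL_TERMINATION = "ORCA TERMINATED NORMALLY"


def terminated_normally(out_file: Path, tail_lines: int = 11) -> bool:
    """Return True if the marker appears in the last lines of the output file."""
    if not out_file.is_file():
        return False
    lines = out_file.read_text(errors="replace").splitlines()
    return any(NORMAL_TERMINATION in line for line in lines[-tail_lines:])


def needs_resubmission(backup_dir, mol_name, calc_type, require_hess=False) -> bool:
    """Decide whether a job has to be (re)submitted.

    Hypothetical naming scheme: files are stored as <mol_name>_<calc_type>.out
    and <mol_name>_<calc_type>.hess, so different calc types do not collide.
    """
    backup_dir = Path(backup_dir)
    stem = f"{mol_name}_{calc_type}"
    if not terminated_normally(backup_dir / f"{stem}.out"):
        return True
    if require_hess and not (backup_dir / f"{stem}.hess").is_file():
        return True
    return False
```

With this scheme, a finished "freq" job for a molecule does not prevent an "opt" job for the same molecule from being submitted to the same backup folder.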

The calculation returns a dataclass with five attributes: "out_name" (absolute path to the successful out file), "failed" (boolean indicating whether Orca failed), "calc_type" (most recent type of calculation; may need rework for files pulled from backup), "end_lines" (the final 11 lines of the file as a string), and "hess_file_name" (None if the hess file is not found, otherwise the absolute path to the hess file).
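For readers who want a concrete picture, the returned object could look roughly like the sketch below. The attribute names come from the description above; the class name `OrcaResult`, the helper `summarize_output`, and the assumption that the hess file sits next to the out file with the same stem are illustrative and are not taken from the pull request.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class OrcaResult:  # hypothetical name
    out_name: str                  # absolute path to the out file
    failed: bool                   # True if Orca did not terminate normally
    calc_type: str                 # most recent type of calculation
    end_lines: str                 # final lines of the out file
    hess_file_name: Optional[str]  # absolute path to the hess file, or None


def summarize_output(out_path, calc_type, n_tail=11) -> OrcaResult:
    """Keep only the tail of the output instead of the whole file."""
    out_path = Path(out_path).resolve()
    tail = out_path.read_text(errors="replace").splitlines()[-n_tail:]
    end_lines = "\n".join(tail)
    # Assumption: the hess file shares the stem of the out file.
    hess = out_path.with_suffix(".hess")
    return OrcaResult(
        out_name=str(out_path),
        failed="ORCA TERMINATED NORMALLY" not in end_lines,
        calc_type=calc_type,
        end_lines=end_lines,
        hess_file_name=str(hess) if hess.is_file() else None,
    )
```

Only the tail is stored on the result, so holding many results in memory stays cheap even when the output files themselves are large.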

esalx commented 2 years ago

Hardcoding paths that are only relevant to a few computers is not how we should go about this. Moreover, since ORCA likes to be called with its full path, that path is the argument we should use to instantiate an ORCA driver.
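To illustrate the suggestion, a driver that takes the full path as its constructor argument might look something like this. It is a sketch under assumptions, not the actual molli OrcaDriver: the class name `OrcaDriverSketch` and its methods are hypothetical. The only ORCA behaviour it relies on is that the executable is invoked with the input file as its argument and writes its output to stdout.

```python
import os
import subprocess
from pathlib import Path


class OrcaDriverSketch:  # hypothetical name, not the molli OrcaDriver
    def __init__(self, orca_path):
        path = Path(orca_path).expanduser()
        # Require a full path instead of relying on machine-specific defaults.
        if not path.is_absolute():
            raise ValueError(f"ORCA must be given as a full path, got: {orca_path}")
        if not (path.is_file() and os.access(path, os.X_OK)):
            raise FileNotFoundError(f"No executable found at: {path}")
        self.orca_path = path

    def command(self, input_file):
        """Build the argument list: <full path to orca> <input file>."""
        return [str(self.orca_path), str(input_file)]

    def run(self, input_file, output_file):
        """Run ORCA and redirect stdout (and stderr) to the output file."""
        with open(output_file, "w") as fh:
            proc = subprocess.run(
                self.command(input_file),
                stdout=fh,
                stderr=subprocess.STDOUT,
                check=False,
            )
        return proc.returncode
```

Each workstation then supplies its own path at construction time, for example from a config file or environment variable, and nothing machine-specific lives in the library.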

blakeo2 commented 2 years ago

Alright, I've added the init code for the OrcaDriver and also changed the way Orca is called!

blakeo2 commented 2 years ago

blake_orca_test_script.txt

testing_orca_mol2s.zip

blakeo2 commented 2 years ago

I found an issue with what is being written to the .hess files. DO NOT CLOSE THE CURRENT PULL REQUEST UNTIL I GIVE THE ALL CLEAR.

esalx commented 1 year ago

Any news?

blakeo2 commented 1 year ago

I haven't gotten a chance to make sure it writes the .hess file correctly, but the rest of the changes work, including recognizing the .hess file and the .out file. It will hopefully take only a couple of minor changes to get the correct hess file. I'll try to fix this up by the end of the week.

blakeo2 commented 1 year ago

Alright, I have altered the OrcaJobDescriptor so that the mol_name is saved as part of it. I have also changed the dtypes `__init__.py` and the molli `__init__.py` to make the Orca_Out_Recognize class available as "ml.Orca_Out_Recognize". Here is a preview of how I am using the class:

a1_freq_calc_script.py.txt basic_test_mol2s.zip