duartegroup / autodE

automated reaction profile generation
https://duartegroup.github.io/autodE/
MIT License
173 stars 52 forks source link

Minor improvements for generation of dataset of reaction profiles. #359

Closed javialra97 closed 2 weeks ago

javialra97 commented 1 month ago

Checklist

t-young31 commented 1 month ago

Thanks for the PR @javialra97

Reading the output.csv as a data frame (with pandas) can be problematic because of the first line, removing the line you will have a proper data frame with well-structured columns. Also, that information is printed in the methods.txt file.

This would be a breaking change, and pandas supports skipping rows natively

import pandas as pd
pd.read_csv("<filename>", skiprows=1)

so I'm not sure about that.

For specific purposes, it is not necessary a final single point, add it as a boolean

Interesting. I wouldn't be that adverse to this functionality, but can't think of a scenario where this would be useful. What did you have in mind?

The convergence criterion for the optimization with Gaussian should be set to Tight.

On this I don't agree I'm afraid. I've not come across a situation where tight optimisations are worthwhile. Maybe if you're wanting super accurate frequencies and you've got a load of compute to burn then maybe, in my view.

Implemented bug https://github.com/duartegroup/autodE/issues/358.

Thanks! πŸ˜„

javialra97 commented 1 month ago

Hi @t-young31

Interesting. I wouldn't be that adverse to this functionality, but can't think of a scenario where this would be useful. What did you have in mind?

I think the simplest case is when you have the resources to compute the full profile at a TZ/QZ basis set, in those cases a final single point is unnecessary. I have used it for benchmarking purposes, I want the geometries (with all the autodE checks) along the full profile to compute single points with N different functionals.

On this I don't agree I'm afraid. I've not come across a situation where tight optimisations are worthwhile. Maybe if you're wanting super accurate frequencies and you've got a load of compute to burn then maybe, in my view.

Well, I would prefer it to use the tight criterion, and I couldn't configure it with ade.Config, I will give it another try.

Thanks! πŸ˜„

Thanks to you for autodE!

t-young31 commented 3 weeks ago

Sorry for the slow reply – work took over last week

I think the simplest case is when you have the resources to compute the full profile at a TZ/QZ basis set, in those cases a final single point is unnecessary.

Cool – sounds good πŸ‘πŸΌ

Well, I would prefer it to use the tight criterion, and I couldn't configure it with ade.Config, I will give it another try.

Hopefully the second try was successful! The following I think should work

import autode as ade

ade.Config.G09.keywords.opt.remove('Opt')
ade.Config.G09.keywords.opt.append('Opt=Tight')
# likewise for ade.Config.G09.keywords.opt_ts

To get this merged would you mind

Thanks πŸ˜„

javialra97 commented 3 weeks ago

Hi,

I made the pertinent changes. Hope everything is fine

Thanks!

t-young31 commented 3 weeks ago

Awesome – thanks @javialra97 πŸ˜„

It looks like there's a conflict on your branch, but looks like an easy one to resolve

javialra97 commented 3 weeks ago

Hi @t-young31,

I hope that is solved now πŸ™‚

t-young31 commented 2 weeks ago

Looks like some Guassian frequency calculation tests are failing due to allowing the Standard orientation coordinates to be extracted. I've not had time to dig into it but I'm guessing it's a limitation of the frequency calculation from the Hessian, making it unstable when the principal axis lies along one of the cartesian axes (I guess that's 'standard orientation'?). A quick fix might be to do

            if "Input orientation" in line or ("Standard orientation" in line and len(coords) == 0):

but we should fix this properly so that frequency calculation is okay irrespective of the orientation

javialra97 commented 2 weeks ago

Hi @t-young31,

In the Standard orientation, Gaussian reorders the molecule so the principal axis will be the z-axis and the principal plane of symmetry is located on the yz-plane. In the first test with the H2 molecule, the h2.hessian.atoms will look like as Atoms(n_atoms=2, [Atom(H, 0.0000, 0.0000, 0.3804), Atom(H, 0.0000, 0.0000, -0.3804)]) and not in the x-axis as is defined in the input.

As you said, the problem is in the hessian, the output of _tr_vecs() returns different vectors and consequently the _proj_mass_weighted has completely different values. Honestly, I can't help you any further at this point. I would suggest to rise a warning. Do the thermal corrections depend on that, because the extracted frequencies are fine?

t-young31 commented 2 weeks ago

Honestly, I can't help you any further at this point

Sorry you hit this 😞. Have created #362 to track outside of this PR

Do the thermal corrections depend on that, because the extracted frequencies are fine?

I'm not sure the frequencies are fine.. https://github.com/duartegroup/autodE/actions/runs/11589659415/job/32265741467#step:6:205

codecov[bot] commented 2 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 97.43%. Comparing base (8284a7c) to head (52aa45a). Report is 1 commits behind head on v1.4.5.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## v1.4.5 #359 +/- ## ======================================= Coverage 97.43% 97.43% ======================================= Files 204 204 Lines 23758 23759 +1 ======================================= + Hits 23148 23150 +2 + Misses 610 609 -1 ``` | [Flag](https://app.codecov.io/gh/duartegroup/autodE/pull/359/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=duartegroup) | Coverage Ξ” | | |---|---|---| | [unittests](https://app.codecov.io/gh/duartegroup/autodE/pull/359/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=duartegroup) | `97.43% <100.00%> (+<0.01%)` | :arrow_up: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=duartegroup#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

javialra97 commented 2 weeks ago

I'm not sure the frequencies are fine.. https://github.com/duartegroup/autodE/actions/runs/11589659415/job/32265741467#step:6:205

Yes, is true, I only checked the one for H2 but the acetylene frequencies are wrong πŸ˜”

Maybe instead of taking the geometries from the standard orientation, we can take them directly from the input instructions, after charge and multiplicity, what do you think?

The error now is also because of taking a different geometry?

t-young31 commented 2 weeks ago

Maybe instead of taking the geometries from the standard orientation, we can take them directly from the input instructions, after charge and multiplicity, what do you think?

We could, but then it'd still potentially be broken for Opt+Freq calculations that don't print the input orientation coordinates. I'll find a way to fix it properly, or at least try!

The error now is also because of taking a different geometry?

I don't think so. @shoubhikraj have you got any ideas? it's Windows+TRM so your world. Python 3.8 looks like it's just end of life, so shall we just drop support? @javialra97 very happy to merge irrespective of that failing test, if you're happy πŸ˜„

javialra97 commented 2 weeks ago

We could, but then it'd still potentially be broken for Opt+Freq calculations that don't print the input orientation coordinates. I'll find a way to fix it properly, or at least try!

It seems that Gaussian is a little messy with the outputs. But I was checking my outputs and the one that you have in test and for the Opt+Freq of the TS it prints it

@javialra97 very happy to merge irrespective of that failing test, if you're happy πŸ˜„

Great! Glad to help a little bit.

shoubhikraj commented 2 weeks ago

@t-young31 Honestly I am not sure where the error is coming from... The hessian and gradient are loaded from stored matrices, and I have previously checked that it works. It does succeed with python 3.13, so maybe some issue with numpy or scipy versions. If it comes up later, then I might check in detail. I am fine with removing python 3.8 support, maybe in the next version - it was just made eol.