nomad-coe / electronic-parsers

Apache License 2.0

Fix gaussian basis set #223

Closed ladinesa closed 1 month ago

ladinesa commented 1 month ago

A parsing error for Gaussian has been reported for this upload. The basis set cannot be resolved, so I simply set it to None, but @ndaelman-hu, you should know more.

The method normalizer also fails so I put up a fix in nomad.

coveralls commented 1 month ago

Pull Request Test Coverage Report for Build 9285765274

Details


| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| electronicparsers/gaussian/parser.py | 6 | 7 | 85.71% |
| Total | 6 | 7 | 85.71% |

Totals Coverage Status
Change from base Build 9252035458: 0.0%
Covered Lines: 35712
Relevant Lines: 38384

💛 - Coveralls
JosePizarro3 commented 1 month ago

Hi there,

I cannot see this upload, but is this the last one by Akseli Mansikkamäki (https://nomad-lab.eu/prod/v1/gui/upload/id/MASTxxsySg-SkFffP9j2Zw)? Do you know details on the data? I think he is studying magnetism in charged molecules, but I'd like to make sure.

ndaelman-hu commented 1 month ago

> Hi there,
>
> I cannot see this upload, but is this the last one by Akseli Mansikkamäki (https://nomad-lab.eu/prod/v1/gui/upload/id/MASTxxsySg-SkFffP9j2Zw)? Do you know details on the data? I think he is studying magnetism in charged molecules, but I'd like to make sure.

I can't see the upload either. @ladinesa Could you ask for permission to share the raw files, or have us added to the upload?

ndaelman-hu commented 1 month ago

@ladinesa Okay, I got access to the failing upload. You made the right call: there simply isn't any basis set mentioned. This is likely because:

  1. the calculation starts from a previous SCF (not uploaded).
  2. it is a frequency calculation, with a different emphasis.

Still, this is pretty bad practice on Gaussian's side, as SCF routines shouldn't tacitly assume the basis set. For the long term, I'll see if I can hunt down their default, though Gaussian is likely to have some complex logic for assigning it.

ndaelman-hu commented 1 month ago

@JosePizarro3 Regarding the upload you linked, all the Gaussian files parsed well there. The issue seems to be with Orca, as we don't support CASSCF yet. I have this on the planned list, but at quite low priority.

We either add a schema there, or we harden the parsing not to fail for unrecognized methods.
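
To illustrate the second option, here is a minimal sketch of a lookup that logs and returns None for an unrecognized method instead of raising. The names `KNOWN_METHODS` and `resolve_method`, and the mapping itself, are hypothetical and not taken from the actual Orca parser.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# Hypothetical mapping from raw method labels to normalized method names.
# The real parser's table will look different; this is for illustration only.
KNOWN_METHODS = {
    "hf": "HF",
    "b3lyp": "DFT",
    "mp2": "MP2",
}


def resolve_method(raw_label: str) -> Optional[str]:
    """Return the normalized method name, or None if the label is unknown."""
    method = KNOWN_METHODS.get(raw_label.strip().lower())
    if method is None:
        # Do not raise: an unsupported method (e.g. CASSCF) should not
        # abort parsing of the rest of the entry.
        logger.warning("Unrecognized method %r; leaving it unset", raw_label)
    return method
```

With this shape, `resolve_method("CASSCF")` yields None and a warning, so downstream code only has to handle a missing value rather than an exception.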

ndaelman-hu commented 1 month ago

Digging further, the default is STO-3G. I can look into improving the current fix, then.
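
A rough sketch of what that improvement could look like, assuming the basis set is read from a `method/basis` token in the route line. The function name, the regex, and the sample route lines are hypothetical, not the current code in `electronicparsers/gaussian/parser.py`. Note that falling back to STO-3G would be wrong for runs that inherit the basis set from a previous calculation, so callers can pass `default=None` to keep the current behaviour.

```python
import re
from typing import Optional

# Gaussian falls back to STO-3G when the route section names no basis set.
GAUSSIAN_DEFAULT_BASIS = "STO-3G"

# Matches tokens such as "B3LYP/6-31G(d)". The part before the slash may not
# contain "=", "(" or ")", which excludes options like "IOp(3/33=1)".
_METHOD_BASIS = re.compile(r"^([^\s/=()]+)/(\S+)$")


def basis_set_from_route(
    route: str, default: Optional[str] = GAUSSIAN_DEFAULT_BASIS
) -> Optional[str]:
    """Return the basis set named in a route line, or `default` if none is."""
    for token in route.lstrip("#").split():
        match = _METHOD_BASIS.match(token)
        if match:
            # Drop a trailing density-fitting set, e.g. "TZVP/TZVPFit".
            return match.group(2).split("/")[0]
    return default
```

For example, `basis_set_from_route("# B3LYP/6-31G(d) opt freq")` returns `"6-31G(d)"`, while a route with no `method/basis` token returns the default.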

ladinesa commented 1 month ago

@ndaelman There is a problematic test. Could you look at it so we can merge this? Thanks.

ndaelman-hu commented 1 month ago

So, some final comments here that will lead to new issues:

  1. we need to add missing basis set specs to our tests
  2. we will have to extend "GEN" (or "Gen") parsing, as this indicates a user-defined basis set
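
As a starting point for those issues, here is a small sketch of how the three cases (named, user-defined, missing) could be told apart and covered by a table of test cases. `classify_basis_set` and the category strings are hypothetical names, and the sketch only recognizes the label itself; it does not parse the user-defined basis set block that follows a "Gen" route.

```python
from typing import Optional


def classify_basis_set(label: Optional[str]) -> str:
    """Classify a basis set label as 'missing', 'user-defined' or 'named'."""
    if label is None or not label.strip():
        return "missing"
    # "GEN" and "Gen" both indicate a user-defined basis set,
    # so compare case-insensitively.
    if label.strip().lower() == "gen":
        return "user-defined"
    return "named"


# Table of cases that a parametrized test could iterate over.
CASES = [
    ("6-31G(d)", "named"),
    ("GEN", "user-defined"),
    ("Gen", "user-defined"),
    (None, "missing"),
]

for label, expected in CASES:
    assert classify_basis_set(label) == expected
```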