nomad-coe / electronic-parsers

CP2K parser crashes when Kerker mixing is used #228

Open behnle opened 1 month ago

behnle commented 1 month ago

The CP2K parser dies when Kerker type mixing is applied in the following way:

...
      &MIXING T
        ALPHA 0.5
        METHOD KERKER_MIXING
        NPULAY 5
      &END MIXING
...

The resulting SCF iteration printout will then look as follows

...
 SCF WAVEFUNCTION OPTIMIZATION

  Step     Update method      Time    Convergence         Total energy    Change
  ------------------------------------------------------------------------------
     1 NoMix/Diag. 0.50E+00    0.1     1.14126075       -17.0091284464 -1.70E+01
     2 Kerker/Diag.0.50E+00    0.2     0.37692456       -15.9319515815  1.08E+00
     3 Kerker/Diag.0.50E+00    0.2     0.07946669       -17.4521851252 -1.52E+00
     4 Kerker/Diag.0.50E+00    0.2     0.01982280       -17.1238625953  3.28E-01
     5 Kerker/Diag.0.50E+00    0.2     0.00666165       -17.1890345016 -6.52E-02
     6 Kerker/Diag.0.50E+00    0.2     0.00175010       -17.1662289745  2.28E-02
     7 Kerker/Diag.0.50E+00    0.2     0.00105658       -17.1679074106 -1.68E-03
     8 Kerker/Diag.0.50E+00    0.2     0.00058192       -17.1653986064  2.51E-03
     9 Kerker/Diag.0.50E+00    0.2     0.00039267       -17.1648914956  5.07E-04
    10 Kerker/Diag.0.50E+00    0.2     0.00028211       -17.1644076629  4.84E-04
    11 Kerker/Diag.0.50E+00    0.2     0.00022288       -17.1641960167  2.12E-04
...

as opposed to the output with default mixing:

 SCF WAVEFUNCTION OPTIMIZATION

  Step     Update method      Time    Convergence         Total energy    Change
  ------------------------------------------------------------------------------
     1 P_Mix/Diag. 0.40E+00    1.4     1.14126075       -17.0091284464 -1.70E+01
     2 P_Mix/Diag. 0.40E+00    0.4     0.67283851       -17.0722459725 -6.31E-02
     3 P_Mix/Diag. 0.40E+00    0.2     0.40411370       -17.1096833471 -3.74E-02
     4 P_Mix/Diag. 0.40E+00    0.2     0.24038623       -17.1315581070 -2.19E-02
     5 P_Mix/Diag. 0.40E+00    0.2     0.14319378       -17.1444920051 -1.29E-02
     6 P_Mix/Diag. 0.40E+00    0.2     0.08527791       -17.1521913517 -7.70E-03
     7 DIIS/Diag.  0.74E-03    0.2     0.05028819       -17.1567912554 -4.60E-03
     8 DIIS/Diag.  0.21E-04    0.2     0.00008805       -17.1636670847 -6.88E-03
     9 DIIS/Diag.  0.36E-05    0.2     0.00000592       -17.1636670852 -4.92E-10

In essence, the parser crashes on the output produced with Kerker mixing, with the following error:

(.pyenv) stefan@localhost:~/NOMAD/electronic-parsers$ nomad parse /home/stefan/sampledata/CP2K/cp2k/H2O_regprint.out
WARNING  nomad.datamodel      2024-06-04T16:34:16 Schema is deprecated, use plugins.
  - nomad.commit: 
  - nomad.deployment: devel
  - nomad.service: cli
  - nomad.version: 1.2.2.dev465+gc6aff391
ERROR    nomad.client         2024-06-04T16:34:21 parsing was not successful
  - exception: Traceback (most recent call last):
      File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/nomad/parsing/parsers.py", line 215, in run_parser
        parser.parse(mainfile_path, entry_archive, logger=logger, **kwargs)
      File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/nomad/parsing/parser.py", line 460, in parse
        self.mainfile_parser.parse(mainfile, archive, logger)
      File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/electronicparsers/cp2k/parser.py", line 2301, in parse
        self.parse_configurations_quickstep()
      File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/electronicparsers/cp2k/parser.py", line 1953, in parse_configurations_quickstep
        parse_calculations(calculations)
      File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/electronicparsers/cp2k/parser.py", line 1898, in parse_calculations
        sec_scc = self.parse_scc(scf)
      File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/electronicparsers/cp2k/parser.py", line 1630, in parse_scc
        sec_scf.time_physical = val + time_initial
      File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/pint/quantity.py", line 1078, in __add__
        return self._add_sub(other, operator.add)
      File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/pint/quantity.py", line 115, in wrapped
        return f(self, *args, **kwargs)
      File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/pint/quantity.py", line 984, in _add_sub
        raise DimensionalityError(self._units, "dimensionless")
    pint.errors.DimensionalityError: Cannot convert from 'second' to 'dimensionless'
  - exception_hash: i3B-ltAn730Yo3wnNtCsYbHTPAPV
  - nomad.client.parser: parsers/cp2k
  - nomad.commit: 
  - nomad.deployment: devel
  - nomad.service: cli
  - nomad.version: 1.2.2.dev465+gc6aff391
Traceback (most recent call last):
  File "/home/stefan/NOMAD/.pyenv/bin/nomad", line 8, in <module>
    sys.exit(run_cli())
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/nomad/cli/cli.py", line 75, in run_cli
    return cli(obj=POPO())  # pylint: disable=E1120,E1123
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/stefan/LISAPLUS/NOMAD/.pyenv/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/nomad/cli/parse.py", line 83, in _parse
    entry_archives = parse(mainfile, **kwargs)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/nomad/client/processing.py", line 66, in parse
    entry_archives = parsers.run_parser(
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/nomad/parsing/parsers.py", line 219, in run_parser
    raise e
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/nomad/parsing/parsers.py", line 215, in run_parser
    parser.parse(mainfile_path, entry_archive, logger=logger, **kwargs)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/nomad/parsing/parser.py", line 460, in parse
    self.mainfile_parser.parse(mainfile, archive, logger)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/electronicparsers/cp2k/parser.py", line 2301, in parse
    self.parse_configurations_quickstep()
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/electronicparsers/cp2k/parser.py", line 1953, in parse_configurations_quickstep
    parse_calculations(calculations)
  File "/home/stefan/LISAPLUS/NOMAD/.pyenv/lib/python3.9/site-packages/electronicparsers/cp2k/parser.py", line 1898, in parse_calculations
    sec_scc = self.parse_scc(scf)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/electronicparsers/cp2k/parser.py", line 1630, in parse_scc
    sec_scf.time_physical = val + time_initial
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/pint/quantity.py", line 1078, in __add__
    return self._add_sub(other, operator.add)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/pint/quantity.py", line 115, in wrapped
    return f(self, *args, **kwargs)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/pint/quantity.py", line 984, in _add_sub
    raise DimensionalityError(self._units, "dimensionless")
pint.errors.DimensionalityError: Cannot convert from 'second' to 'dimensionless'

I tracked this down to time_initial: it is first initialized correctly, but while parsing the second SCF iteration it is set to None, because sec_scc.scf_iteration does exist but sec_scc.scf_iteration[-1].time_physical is empty: https://github.com/nomad-coe/electronic-parsers/blob/e7d1ffe0615d5eb68955705cbed5809b9dca71d1/electronicparsers/cp2k/parser.py#L1582-L1584

This propagates to lines 1630/1631, where a value in seconds is added to None, causing the parser to crash: https://github.com/nomad-coe/electronic-parsers/blob/e7d1ffe0615d5eb68955705cbed5809b9dca71d1/electronicparsers/cp2k/parser.py#L1629-L1631
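To illustrate the failure mode described above, here is a minimal sketch of accumulating per-iteration times while tolerating a missing value. It is not the code from electronicparsers/cp2k/parser.py: the function name cumulative_times and the use of plain floats instead of pint quantities are assumptions made purely for illustration.

```python
from typing import List, Optional


def cumulative_times(
    durations: List[Optional[float]], start: float = 0.0
) -> List[Optional[float]]:
    """Accumulate per-iteration durations (hypothetical helper).

    A missing duration (None) leaves a gap in the result instead of
    poisoning every later sum, which is roughly what happens when the
    previous iteration's time is empty and gets added to a new value.
    """
    last_known = start
    result: List[Optional[float]] = []
    for duration in durations:
        if duration is None:
            # Keep the last known cumulative time and record the gap.
            result.append(None)
            continue
        last_known += duration
        result.append(last_known)
    return result
```

With this approach an iteration whose time could not be read does not stop the remaining iterations from being processed. Whether a gap, a warning, or a hard failure is the right behaviour for the real parser is a separate design decision.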

Complete input and output that crash the parser: H2O_regprint.inp.txt H2O_regprint.out.txt

Input and output that are processed without problems, for comparison: H2O_regprint_nokerker.inp.txt H2O_regprint_nokerker.out.txt (strip the .txt extension for testing). That the final energies with Kerker mixing and with default mixing differ significantly from each other is a different story...

The root cause of the issue is probably that, in the failing case, the second and third columns of the SCF iteration output run together (e.g. Kerker/Diag.0.50E+00, with no space between the update method and the value that follows it), so the parser sees one column fewer than it expects.

Proposed fix: make the parser robust against columns that touch each other directly. Easier said than done, I know :-)
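A minimal sketch of what such a tolerant line pattern could look like, assuming the column layout shown in the printouts above. The names SCF_LINE and parse_scf_line and the dictionary keys are hypothetical and do not come from the existing parser; the idea is to match the update method lazily and let the number that follows it start immediately, with or without a separating space.

```python
import re
from typing import Dict, Optional, Union

# Decimal or exponent-style number, e.g. 0.2, -17.0091284464, 0.50E+00.
FLOAT = r"[-+]?\d+\.\d+(?:E[-+]?\d+)?"

# Hypothetical pattern: the method is matched lazily and may be followed
# by zero or more spaces, so "Kerker/Diag.0.50E+00" still splits correctly.
SCF_LINE = re.compile(
    rf"^\s*(?P<step>\d+)\s+"
    rf"(?P<method>\S.*?)\s*"
    rf"(?P<mixing>{FLOAT})\s+"
    rf"(?P<time>{FLOAT})\s+"
    rf"(?P<convergence>{FLOAT})\s+"
    rf"(?P<energy>{FLOAT})\s+"
    rf"(?P<change>{FLOAT})\s*$"
)


def parse_scf_line(line: str) -> Optional[Dict[str, Union[int, float, str]]]:
    """Return the fields of one SCF iteration line, or None if it does not match."""
    match = SCF_LINE.match(line)
    if match is None:
        return None
    fields: Dict[str, Union[int, float, str]] = {
        "step": int(match["step"]),
        "method": match["method"],
    }
    for key in ("mixing", "time", "convergence", "energy", "change"):
        fields[key] = float(match[key])
    return fields
```

Applied to the lines above, both the Kerker line (touching columns) and the P_Mix line (separated columns) yield the same seven fields, while header and separator lines return None.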

NOMAD version: 1.2.2.dev465+gc6aff391 in standalone parser development mode
Electronic parser: latest from GitHub
CP2K version: 2024.1

JosePizarro3 commented 1 month ago

Hi there @behnle,

Thank you very much for this report and #227 (I will answer there separately). We will take a look; indeed, it seems that the touching columns might be the problem.

I just wanted to let you know that we are also in the process of transferring the parsers to individual repositories, as part of turning them into individual plugins. This will happen during the summer.

I think we could transfer the CP2K parser to its own plugin repository and fix your issues there. Is this OK for you, or are you on a deadline for which you need the parser fix as soon as possible?

behnle commented 1 month ago

Hi @JosePizarro3, thanks for the quick response. No, I personally do not have any deadline. It was one of my users who ran into this; I am just responsible for operating our NOMAD OASIS and giving user support. For me it is not super urgent. OTOH, if this could be included in the NOMAD release container sooner or later, I would not complain.

JosePizarro3 commented 1 month ago

Cool, then I will put CP2K as one of the first parsers to transfer to its own repo plugin 👍🏻

As you operate the OASIS, one of the good things about this new plugin mechanism is that you can directly "plug in" the specific parser into your oasis without having to wait for a container release.