nomad-coe / electronic-parsers

Apache License 2.0
18 stars 7 forks source link

CP2K parser crashes when printlevel was reduced #227

Open behnle opened 1 month ago

behnle commented 1 month ago

The CP2K parser crashed when calculations were run with

&GLOBAL
  PRINT_LEVEL LOW
&END GLOBAL

This causes the complete Quickstep output block (lines starting with QS|) to be omitted. The parser then fails to read the plane wave cutoff from the output file and subsequently crashes:

(.pyenv) stefan@localhost:~/LISAPLUS/NOMAD$ nomad parse /home/stefan/sampledata/CP2K/cp2k/H2O.out --show-archive
WARNING  nomad.datamodel      2024-06-04T13:17:25 Schema is deprecated, use plugins.
  - nomad.commit: 
  - nomad.deployment: devel
  - nomad.service: cli
  - nomad.version: 1.2.2.dev465+gc6aff391
ERROR    nomad.client         2024-06-04T13:17:30 parsing was not successful
  - exception: Traceback (most recent call last):
      File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/nomad/parsing/parsers.py", line 215, in run_parser
        parser.parse(mainfile_path, entry_archive, logger=logger, **kwargs)
      File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/nomad/parsing/parser.py", line 460, in parse
        self.mainfile_parser.parse(mainfile, archive, logger)
      File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/electronicparsers/cp2k/parser.py", line 2300, in parse
        self.parse_method_quickstep()
      File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/electronicparsers/cp2k/parser.py", line 2039, in parse_method_quickstep
        basis_set=self._parse_basis_set(),
      File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/electronicparsers/cp2k/parser.py", line 2024, in _parse_basis_set
        cutoff=self.settings.get('qs', {}).get('planewave_cutoff', None)
      File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/pint/unit.py", line 187, in __mul__
        return self._REGISTRY.Quantity(1, self._units) * other
    TypeError: unsupported operand type(s) for *: 'Quantity' and 'NoneType'
  - exception_hash: rc-YL5xWeMqCMqQKzi4-XMIs2eGx
  - nomad.client.parser: parsers/cp2k
  - nomad.commit: 
  - nomad.deployment: devel
  - nomad.service: cli
  - nomad.version: 1.2.2.dev465+gc6aff391
Traceback (most recent call last):
  File "/home/stefan/NOMAD/.pyenv/bin/nomad", line 8, in <module>
    sys.exit(run_cli())
  File "/home/stefan/LISAPLUS/NOMAD/.pyenv/lib/python3.9/site-packages/nomad/cli/cli.py", line 75, in run_cli
    return cli(obj=POPO())  # pylint: disable=E1120,E1123
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/nomad/cli/parse.py", line 83, in _parse
    entry_archives = parse(mainfile, **kwargs)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/nomad/client/processing.py", line 66, in parse
    entry_archives = parsers.run_parser(
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/nomad/parsing/parsers.py", line 219, in run_parser
    raise e
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/nomad/parsing/parsers.py", line 215, in run_parser
    parser.parse(mainfile_path, entry_archive, logger=logger, **kwargs)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/nomad/parsing/parser.py", line 460, in parse
    self.mainfile_parser.parse(mainfile, archive, logger)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/electronicparsers/cp2k/parser.py", line 2300, in parse
    self.parse_method_quickstep()
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/electronicparsers/cp2k/parser.py", line 2039, in parse_method_quickstep
    basis_set=self._parse_basis_set(),
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/electronicparsers/cp2k/parser.py", line 2024, in _parse_basis_set
    cutoff=self.settings.get('qs', {}).get('planewave_cutoff', None)
  File "/home/stefan/NOMAD/.pyenv/lib/python3.9/site-packages/pint/unit.py", line 187, in __mul__
    return self._REGISTRY.Quantity(1, self._units) * other
TypeError: unsupported operand type(s) for *: 'Quantity' and 'NoneType'

Relevant source code: https://github.com/nomad-coe/electronic-parsers/blob/e7d1ffe0615d5eb68955705cbed5809b9dca71d1/electronicparsers/cp2k/parser.py#L2022-L2026 Getting planewave_cutoff fails, and the attempt to convert the empty result to hartree crashes the parser. Proposed fix: Enclose in Try/Except block, as this is a perfectly fine CP2K calculation which should not crash the parser.

CP2K version: 2024.1, but problem is also present with CP2K 9.1 nomad.version: 1.2.2.dev465+gc6aff391 NOMAD parser version: current github repo

Attachments: complete input and output files for reproducing the issue (strip the .txt extension!, This was added for Github to accept the upload). H2O.inp.txt H2O.out.txt

JosePizarro3 commented 1 month ago

Hi again! This is an easy fix, but please, check my comment in #228 first (I think we can solve these in one go when transferring the CP2K to its own repository).

And thanks again for delving in the parser! This feedback is very much appreciated 🙂

In this case, the issue is that cutoff in BasisSet() has some units assigned, but when it fails to get from self.settings and returns None, this has some conflict with the pint units (i.e., with the ureg.hartree multiplication). There are two potential solutions:

  1. Set a default value to planewave_cutoff instead of None. I will check the documentation for this, but if you are aware already, let us know :-)
  2. Add an if self.settings.get('qs', {}).get('planewave_cutoff') is not None: before instantiating bs_pw = BasisSet(...)
behnle commented 1 month ago

Thanks for the answer. I personally would strongly advocate for solution 2. If the value is not present in the input or output file, it should not occur in the parsed data. I am extremely wary to hard-code program defaults in a third-party parser, as such parameters may simply change over time and program version, and all of a sudden, your parsed data is simply wrong, or you have to keep track of potentially hundreds of parameters for every version of a software package.

Of course there has to be some default value for the planewave cutoff, it is an essential ingredient of every plane wave calculation, but in case of CP2K, i don't know whether it is a constant or whether it is calculated from system parameters (cell size, k points, involved elements e.g.)

JosePizarro3 commented 1 month ago

Great, we go with the if then. Thanks for the detailed answer, that was very helpful!

mm-may commented 1 month ago

Hi, thanks for looking at this issue so quickly!

Yes, there is indeed a default cutoff value of 280, see https://manual.cp2k.org/trunk/CP2K_INPUT/FORCE_EVAL/DFT/MGRID.html#CP2K_INPUT.FORCE_EVAL.DFT.MGRID.MULTIGRID_CUTOFF

It would indeed be great if the input file is also checked before setting the default value.

JosePizarro3 commented 1 month ago

Great, thanks @mm-may, I have to definitely check the documentation, we can go with option 1 anyways and set the default value as you suggest.

It would indeed be great if the input file is also checked before setting the default value.

This is a bit more complicated. We can do it, but there is always the risk that input files are being modified by users and calculations are not referring to the actual input file being uploaded; this is one of the reasons we like to keep out input files as much as possible out of parsing.