nomad-coe / electronic-parsers

CP2K parser yields trajectory errors for finished trajectories #253

Open ndaelman-hu opened 2 months ago

ndaelman-hu commented 2 months ago

Parsing the CP2K geometry-optimization output in the example set yields errors and warnings that indicate an attempt to extract a non-existent frame. I suspect a bug in the frame counting.

Example set: https://nomad-lab.eu/prod/v1/develop/gui/user/uploads/upload/id/Arb2fKCQT_uPOUriPyWKFw

Logs:

"root":{
  "data":string"{'frame': 101}"
  "event":string"Error reading trajectory for the specific frame."
  "proc":string"Entry"
  "process":string"process_entry"
  "process_worker_id":string"ALgrZWZwSYSaMFc_Zobd4g"
  "parser":string"parsers/cp2k"
  "step":string"parsers/cp2k"
  "logger":string"nomad.processing"
  "timestamp":string"2024-09-07 13:29.50"
  "level":string"ERROR"
}
"root":{
  "data":string"{'frame': 101}"
  "event":string"Could not parse system information for the last frame. We will attempt to parse the system information from (frame + 1)."
  "proc":string"Entry"
  "process":string"process_entry"
  "process_worker_id":string"ALgrZWZwSYSaMFc_Zobd4g"
  "parser":string"parsers/cp2k"
  "step":string"parsers/cp2k"
  "logger":string"nomad.processing"
  "timestamp":string"2024-09-07 13:29.50"
  "level":string"WARNING"
}
"root":{
  "data":string"{'frame': 102}"
  "event":string"Error reading trajectory for the specific frame."
  "proc":string"Entry"
  "process":string"process_entry"
  "process_worker_id":string"ALgrZWZwSYSaMFc_Zobd4g"
  "parser":string"parsers/cp2k"
  "step":string"parsers/cp2k"
  "logger":string"nomad.processing"
  "timestamp":string"2024-09-07 13:29.50"
  "level":string"ERROR"
}
"root":{
  "normalizer":string"MetainfoNormalizer"
  "event":string"Energy not reported for an calculation that is part of a geometry optimization"
  "proc":string"Entry"
  "process":string"process_entry"
  "process_worker_id":string"ALgrZWZwSYSaMFc_Zobd4g"
  "parser":string"parsers/cp2k"
  "step":string"MetainfoNormalizer"
  "logger":string"nomad.processing"
  "timestamp":string"2024-09-07 13:29.52"
  "level":string"WARNING"
}
"root":{
  "event":string"processing error"
  "level":string"ERROR"
  "timestamp":string"2024-09-07T13:29:41.219000+00:00"
  "processing_errors":[
    0:
    string"Error reading trajectory for the specific frame."
    1:
    string"Error reading trajectory for the specific frame."
  ]
}
ndaelman-hu commented 2 months ago

A 1st, short debug attempt shows the following structure:

* `CP2KParser.parse_calculations` enumerates over all extracted `calculations`, where `frame` is the running index.

* `CP2KParser.parse_system` calls `TrajParser.get_trajectory(frame)`, where `frame` is reinterpreted as a trajectory object.

* `TrajParser.get_trajectory` simply regulates the extraction. Any kind of failure (triggered in this case) is handled by `CP2KParser.parse_system`.

My impression is that the index is not properly updated. The starting (and ending) frames receive special treatment, such as not being counted by the index. The index in geom. opt. seems to include the regular index, though, causing an IndexError.
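
To make the suspected mismatch concrete, here is a minimal sketch in plain Python. It does not use the real parser code: `get_frame`, `trajectory` and `calculations` are hypothetical stand-ins, and the counts (3 frames vs. 5 calculations) are made up for illustration. It only shows how a running index over calculations can run past the end of a shorter trajectory, and how a bounds check would surface the offending frames instead of raising an IndexError.

```python
from typing import Optional, Sequence


def get_frame(frames: Sequence[dict], frame: int) -> Optional[dict]:
    """Return the trajectory frame at `frame`, or None when it is out of range."""
    if 0 <= frame < len(frames):
        return frames[frame]
    return None


# Hypothetical data: 3 frames in the trajectory file, but 5 calculations
# extracted from the .out file (e.g. start/end frames counted separately).
trajectory = [{"step": i} for i in range(3)]
calculations = ["scf"] * 5

# The running index over calculations is reused as the trajectory index.
missing = [i for i, _ in enumerate(calculations) if get_frame(trajectory, i) is None]
print(missing)  # [3, 4]
```

If the real indices behave like this, the frames reported in the logs would be exactly the ones beyond the trajectory length.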

@ladinesa or @aalbino2 could either of you, as the original patcher and a code expert respectively, verify my understanding of the indexing here?

aalbino2 commented 2 months ago

Hi @ndaelman-hu,

I have no access to the upload you posted above. From the geo_opt point of view you can take a look at this published upload: https://nomad-lab.eu/prod/v1/develop/gui/search/entries/entry/id/awW9DHiMBwa4uz8eD7hbItgrxoOJ

It shows a log error saying that the .out and the trajectory file have different numbers of steps. Two main issues are stacked here:

I don't know if those bugs are already fixed in develop. Let me try to reprocess from there, as you suggested to Pierre-Adrienne.
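
A quick way to check the step mismatch mentioned above, independent of the parser, is to count the frames in the trajectory file directly. The sketch below assumes the trajectory is written in plain multi-frame XYZ format (atom count, comment line, then one line per atom). `count_xyz_frames`, the sample data and `n_out_steps` are hypothetical and only illustrate the comparison.

```python
def count_xyz_frames(text: str) -> int:
    """Count frames in a multi-frame XYZ string (atom count, comment, atoms)."""
    lines = text.splitlines()
    count = 0
    pos = 0
    while pos < len(lines):
        if not lines[pos].strip():
            pos += 1  # tolerate blank lines between frames
            continue
        n_atoms = int(lines[pos].split()[0])
        pos += n_atoms + 2  # atom-count line + comment line + atom lines
        count += 1
    return count


# Made-up two-frame sample; the comment-line content is illustrative only.
sample = """2
 i = 0, E = -1.0
H 0.0 0.0 0.0
H 0.0 0.0 0.7
2
 i = 1, E = -1.1
H 0.0 0.0 0.0
H 0.0 0.0 0.74
"""
n_frames = count_xyz_frames(sample)
n_out_steps = 3  # hypothetical number of steps found in the .out file
print(n_frames, n_out_steps, n_frames == n_out_steps)  # 2 3 False
```

Comparing this count with the number of optimization steps in the .out file shows whether the two files really disagree, or whether only the parser's indexing is off.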