awslabs / palace

3D finite element solver for computational electromagnetics
https://awslabs.github.io/palace/dev
Apache License 2.0

Dry run does not detect duplicate entries in config #271

Closed parrangoiz closed 2 months ago

parrangoiz commented 3 months ago

When duplicate entries exist in a config file (e.g., config["Boundaries"]["Impedance"] appears twice), the JSON schema validator does not seem to detect them, resulting in a later error in Palace, for example boundaries with missing boundary conditions. This should be fixable with minor changes to the schema.

Reproduction: run the spheres capacitance matrix example in /examples/spheres, but break up the two terminals in the config into two separate "Terminal" entries:

"Terminal":
[
  {
    "Index": 1,
    "Attributes": [3]  // Sphere A
  }
],
"Terminal": [
  {
    "Index": 2,
    "Attributes": [4]  // Sphere B
  }
]

The JSON validation script runs without error, but the electrostatic solution is only computed for one boundary instead of two.

hughcars commented 3 months ago

I believe this isn't possible with JSON schema alone: https://stackoverflow.com/questions/26259902/writing-json-schema-to-detect-objects-with-duplicate-names. But with some slightly advanced usage of the JSON parser we have in Palace, we should be able to combine the two duplicates into a single Terminal entry, so that config file would become a valid (if confusingly written) one. See https://github.com/nlohmann/json/discussions/3392 and https://json.nlohmann.me/features/parsing/parser_callbacks/
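
Something like the following sketch shows what I have in mind with the parser callback (untested; the function name and error message are just for illustration, not anything that exists in Palace today). It throws on a duplicate key, but the same hook is where a merge could happen instead:

#include <set>
#include <stack>
#include <stdexcept>
#include <string>

#include <nlohmann/json.hpp>

using json = nlohmann::json;

// Parse a config string, throwing if any object contains a duplicate key.
// One set of seen keys is maintained per currently open JSON object.
json ParseCheckingDuplicates(const std::string &input)
{
  std::stack<std::set<std::string>> seen;
  json::parser_callback_t cb = [&seen](int, json::parse_event_t event, json &parsed) -> bool
  {
    switch (event)
    {
      case json::parse_event_t::object_start:
        seen.emplace();  // Entering a new object: start a fresh key set
        break;
      case json::parse_event_t::object_end:
        seen.pop();
        break;
      case json::parse_event_t::key:
        // insert(...).second is false if the key was already present.
        if (!seen.top().insert(parsed.get<std::string>()).second)
        {
          throw std::runtime_error("Duplicate key in configuration file: " +
                                   parsed.get<std::string>());
        }
        break;
      default:
        break;
    }
    return true;  // Keep every parsed element
  };
  return json::parse(input, cb);
}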

sebastiangrimberg commented 3 months ago

Agreeing with Hugh here, and I would add, philosophically, that the JSON Schema validation is meant to validate the config file according to the internal JSON parser, which handles repeated entries as most parsers do (it just replaces the field with the last parsed value). In that sense the Schema and parser are consistent, and I would not call this a bug.
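
As a minimal demonstration of that last-value-wins behavior (a standalone snippet, not from the Palace sources):

#include <iostream>

#include <nlohmann/json.hpp>

int main()
{
  // The default parser silently collapses the duplicate "Terminal" key,
  // keeping only the last value.
  auto j = nlohmann::json::parse(R"({"Terminal": [1], "Terminal": [2]})");
  std::cout << j.dump() << std::endl;  // Prints: {"Terminal":[2]}
}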

The fact that an error in the config file can lead to errors in Palace later on, because of a misconfigured model, is really separate from config validation (indeed, a config file with invalid attribute numbers will be "correct" according to the Schema but will fail later on). If you do want to solve this, it might be complicated to do with both the JSON Schema and the JSON parser, and might be easier to tack on as a separate validation step for correctness.

parrangoiz commented 3 months ago

That's fair, this isn't really a bug I guess. Regardless, I do think it's important to address, since it can lead to silent "errors" such as the one in my reproduction. (I say "errors" because yes, technically this is a user error.) So it sounds like we have at least two options:

  1. Merge entries with duplicate keys as @hughcars suggests, or
  2. Emit a warning saying repeated entries were found and that only the last one will be used

I'm not familiar with this JSON library -- which of these two seems easier to implement? I'd actually be OK with sticking to (2) for now and tackling (1) later, assuming that's the behavior we ultimately want.

hughcars commented 3 months ago

I suspect that 2 is probably going to be marginally easier than 1, once the docs for the JSON package have been read closely enough. Reason being, to do 2 I think you'd need a callback that checks for the existence of each key as it is read, at which point appending to that key (option 1) is nearly as easy as emitting an error (option 2).

sebastiangrimberg commented 3 months ago

I think 2 makes the most sense (either warn, or just throw an error that the file is invalid), and I'm fairly sure it can be implemented with a callback, based on my reading of the link Hugh gave.

EDIT: Actually, this is already implemented, see https://github.com/awslabs/palace/blob/main/palace/utils/iodata.cpp#L164. Probably it just isn't catching duplicates correctly at a certain nesting level?

parrangoiz commented 2 months ago

Managed to track down the "bug" to an internal script that was preprocessing the JSON and silently deleting the duplicate keys. I have tested this thoroughly by now and can confirm that the code @sebastiangrimberg pointed at does indeed catch duplicate keys, when present, at any depth.