Semantic & Syntactic Checking

s9105947 commented 2 years ago

PICMI objects do not validate itself. I propose adding methods to each object along the lines of:

class UniformDistribution:
    def check(self):
        """checks this object, passes silently if ok, else raises"""
        assert self.density > 0, "density must be non-zero positive number"

This method is mainly intended to ensure the integrity of an object, and could be called in the __init__() method automatically or (additionally) manually by an implementation/the user themselves.

Potentially it could also handle type checking, which would typically done by using properties instead of a dedicated method. See also #58.

Such a check would guarantee a certain structure towards implementations and lift them of the burden of having to re-check the given python objects for errors.

dpgrote commented 2 years ago

From my perspective, this kind of argument checking should not be done in PICMI. Presumably the implementing code already does checking of the input it is getting. Implementing this in PICMI would only duplicate these checks. When developing the WarpX implementation, I have intentionally avoided adding such checks since I know that WarpX already has many checks of the input. Many checks are more complicated than the example you give and it becomes unwieldy to duplicate the checks (and error prone when the code changes and the checks need to be updated).

s9105947 commented 2 years ago

Presumably the implementing code already does checking of the input it is getting. Implementing this in PICMI would only duplicate these checks.

This is correct to a certain degree, but I'd like to raise three points:

These are different checks: PICMI should check "Is this conforming to the PICMI definition" while an implementation should check "Can I make sense of this input?". The latter is more restrictive than the former.
The PICMI structure is user input and thereby must be validated. We'd all rewrite the same checks in our respective codes.
(most importantly) Fail Fast: I think we should follow the concept of failing as early as possible. While the linked article outlines a couple of advantages, here the one for users is most important: ValueError: ratio must not be zero (b/c value is not conforming to PICMI) is much clearer than a ZeroDivisionError: division by zero (b/c the implementation expected the input to be conforming to PICMI). We'd not only use a common input format, we'd also get common error diagnostics.

All this assumes that implementing PICMI is more than search-replacing the parameters in some input files -- for simple search-replace such thorough checking appears to be overkill.

Put into different terms: I will include checks of the PICMI structure (at least on a shallow level) in the PIConGPU PICMI implementation, b/c it is the prime use-case for "fail fast". Are these beneficial upstream (here)?

dpgrote commented 2 years ago

Here are my comments:

Yes, true. PICMI already does a fair amount of argument checking. In the Grid classes for example, it does checking to make sure that the input parameters are consistent and checks if the ones accepting sequences are the correct length. There is more checking of this type that can be done, for instance there are other places where lengths of sequences can be checked. But as you say, these types of check are making sure that the input conforms to PICMI. However, the example check you give, checking the density, would be in class of does this input make sense.
True, but all of the codes have presumably written all of the same "does this make sense" checks already since they take input independent of PICMI. There's no reason to write those checks yet one more time (especially since the checks might be subtly different among codes or complicated).
I agree that fail fast is a good thing, with good error messages. PICMI does already do a number of checks that will catch many problems and fail fast. I do agree that some more conformation checking can be done. For the other checks, the codes are presumably check things at the start of the simulation so it will still fail fast, and you would also get the presumably nice error messages from the codes.

Overall, I would comment that PICMI is not a very large code and will likely be very stable. I think it will be much more maintainable to have any checks written out explicitly rather than introducing a new code architecture.

ax3l commented 2 years ago

Just my 2 cents: I think super-classes that know the physical meaning of their parameters, such as the mentioned UniformDistribution, should add first checks.

An implementation can then call the superclass checks and add their own. I see no reason why every implementation needs to implement the same checks again for things we already know in PICMI. That does not mean PICMI needs to check all parameters, just what is reasonable (and the example is).

This would make PICMI more useful. This allows us also to write correctness checks on general PICMI input files without invoking (or even installing) a code. This is very useful in my opinion, e.g., when adding inputs files to data archives and thinking about science gateways. We want to do such checks easy and lightweight (which the plain picmi dependency is: pure python package) and independent of a concrete implementation (which can be heavy).

Also, adding such check interfaces automatically unifies how people implement input parameter checks downstream. That brings some order into implementations, too.

Action items I would propose:

Add check interfaces in PICMI.
- Checks can be run at the constructor level (that's just user friendly), but:
- most importantly they need to run immediately before we do the "code" (input files) generation in the "backend"
Populate them over time with reasonable parameter checks that are common to all codes.
Use those in unit tests of PICMI itself, e.g., validating example scripts we have in the repo (and in other repos).

picmi-standard / picmi

Semantic & Syntactic Checking #59