Open s9105947 opened 2 years ago
From the description, it sounds like most of the issue is not the PICMI code itself, but uncertainty about what data is available in the class instances and how to access it. Instead of rewriting the PICMI classes, an alternative solution is to provide documentation for the implementers describing which attributes are available and what they are (including type information). This would be a set of guidelines to follow. For instance, regarding the add_species method, the guidelines would state that the list of species set up by the user can be found in the Simulation class in the attribute species, which is a Python list. It sounds like documentation like this would go a long way toward solving the issues you bring up.
I agree that there can be an issue with having some data stored in two different ways, e.g. the number of grid cells. I have long learned that there is only so much that can be done to limit the damage when users abuse code since it is impossible to predict what users will do. In this case, though, perhaps a small cleanup can be done so that only one of the two alternatives is saved internally, for example only storing number_of_cells but not the individual nx etc. The interface was written to allow both options in order to provide flexibility for users.
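To illustrate the cleanup suggested above, here is a minimal sketch in which only number_of_cells is stored and nx, ny are derived from it on access. GridSketch and from_components are hypothetical names made up for this example; they are not part of PICMI.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GridSketch:
    """Hypothetical stand-in for a grid class; not the actual PICMI class."""

    # The only stored representation of the cell counts.
    number_of_cells: List[int]

    @property
    def nx(self) -> int:
        # Derived on access, so it cannot drift away from number_of_cells.
        return self.number_of_cells[0]

    @property
    def ny(self) -> int:
        return self.number_of_cells[1]

    @classmethod
    def from_components(cls, nx: int, ny: int) -> "GridSketch":
        # Convenience constructor: users may still pass the individual counts.
        return cls(number_of_cells=[nx, ny])


grid = GridSketch.from_components(nx=64, ny=32)
grid.number_of_cells[0] = 128  # a later modification is visible through nx as well
```

With this layout users keep both input options, while an implementing code only ever has to look at one attribute.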
> It sounds like documentation like this would go a long way toward solving the issues you bring up.
Yes. Currently there are two interfaces to PICMI:
Both are close, but subtly different. For implementations this would be acceptable, but:
One of the highly regarded workflows for Smilei is referencing the input parameters when processing the output. In this case the user works with the same data structure as the implementing simulation, i.e. PICMI-out. So the user writes PICMI-in, works with PICMI-out, and may not even notice the difference until one of the subtle differences bites them. Reducing PICMI to a set of arguments plus auxiliary convenience functions to modify them ensures that such differences cannot occur.
Note: This would also make serialization very simple, which would make it possible to read the PICMI input while processing the output even when not working from a single Python context.
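As a rough sketch of the serialization point, assuming a species were reduced to a plain set of attributes: the class SpeciesParams and the helpers dump and load below are hypothetical and only illustrate the round trip. They are not existing PICMI API.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SpeciesParams:
    """Hypothetical attribute-only species description (not the PICMI class)."""

    name: str
    particle_type: Optional[str] = None
    charge_state: Optional[int] = None


def dump(params: SpeciesParams) -> str:
    # A plain set of attributes maps directly onto a JSON object.
    return json.dumps(asdict(params), sort_keys=True)


def load(text: str) -> SpeciesParams:
    # Reconstruct the object in another Python process, e.g. during post-processing.
    return SpeciesParams(**json.loads(text))


original = SpeciesParams(name="electrons", particle_type="electron")
restored = load(dump(original))
```

Since the object holds nothing but its attributes, the restored instance compares equal to the original.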
Another short argument: The constructors currently contain a lot of boilerplate code. (Admittedly, it is currently at a somewhat acceptable level.)
I believe that such a restructuring could be fully compatible with the current usage. If you're not opposed to the general idea, I can hack together a fork for demonstration during the next week.
> I have long learned that there is only so much that can be done to limit the damage when users abuse code since it is impossible to predict what users will do.
The README of PICMI states that "The goal of the standard is to propose a set (or dictionary) of names and definitions [...]". However, the documentation at no point specifies sets or dictionaries; it always specifies __init__() methods. Which variables are set by __init__() is entirely unspecified (and has to be looked up in the code).
In most cases this does not matter, e.g. the attribute name of a Species is later accessible as species.name. In other cases it is not as clear: What properties does Simulation.add_species() modify? In which way? I am wary of just reading the code, because PICMI makes no guarantees that this code will be stable.
In addition, the behavior of the __init__() method is not clear: Most only copy their parameters to the attributes, but some perform more operations, e.g.:
Or, put another way: PICMI currently specifies a user interface, but the object that is passed to the implementing PIC simulation is unspecified.
There are a couple of approaches to this. I want to propose these steps:
[...] (__init__(), __eq__(), __hash__(), ...)
[...] add_species() [...]
[...] None by default values where applicable, etc.
I have attached the species definition as an example of how I imagine the specification would look, along with the test case that sketches its behavior.
Implementation:
Test case:
Notably, this would be very close to how openPMD is specified, as purely a list of attributes. (Which would then allow separating the reference implementation and the specification more clearly, as mentioned in #3)
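For readers without access to the attachments, here is a minimal sketch of what an attribute-only specification could look like. It is not the attached implementation: the class name SpeciesSpec, the selection of attributes, and the default values are assumptions made up for illustration. It relies on Python's standard dataclasses module, which generates __init__(), __eq__() and (for frozen classes) __hash__() from the attribute list.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SpeciesSpec:
    """Hypothetical specification of a species as a pure list of attributes."""

    name: str
    # Concrete defaults instead of None; both values are assumed for this sketch.
    particle_type: str = "electron"
    density_scale: float = 1.0


# __init__() is generated from the attribute list above.
a = SpeciesSpec(name="electrons")
b = SpeciesSpec(name="electrons")

# The specification itself can be inspected without reading any constructor code.
attribute_names = [f.name for f in fields(SpeciesSpec)]
```

Here a == b holds and both instances hash identically, without any hand-written constructor or comparison code.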