Codecov report: Base: 100.00% // Head: 99.97% // Decreases project coverage by -0.03%. Coverage data is based on head (097954d) compared to base (17990f1). Patch coverage: 99.94% of modified lines in the pull request are covered. Note: the current head 097954d differs from the pull request's most recent head e99e600; consider uploading reports for commit e99e600 to get more accurate results.
A few lines of missing coverage, will fix up next, then start building AST -> tree.
So far, I know I need to implement:
- A `Break` Node subclass to store `break`.
- A `Return` Node subclass to store `return`.
- `ScalarType.Intrinsic.CHARACTER` to store strings.
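Minimal sketches of what those subclasses could look like (the `Node` base class here is a stand-in for illustration, not the real Particle_IR base class):

```python
# Hypothetical sketch: minimal PIR node subclasses for break/return.
# The real Particle_IR Node base class will have more machinery.
class Node:
    def __init__(self, children=None):
        self.children = children if children is not None else []

class Break(Node):
    """Stores a break statement; has no children."""

class Return(Node):
    """Stores a return statement, optionally with a returned expression."""
    def __init__(self, expression=None):
        super().__init__(children=[expression] if expression is not None else [])
```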
Still not implemented:
Next up is to create kernel nodes from python AST.
Still need to test the following for the PIR visitors for CabanaPIR:
Once that is all working correctly, we can implement the main Cabana PIR backend and test that.
We also haven't yet tested `reset_type_mapping_str` in datatype.py.
The base pir_visitor is also missing some testing:
- `__call__`
- the fail condition of `_visit`
- the fail condition of `_visit_emptystatement_node`
The `ArraySymbol` variable declaration is not quite correct: it currently emits `int[3][3] f;`, which I'm fairly sure isn't valid.
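Indeed, in C/C++ the array extents belong after the variable name, i.e. `int f[3][3];`. A hedged sketch of a small helper that emits the valid form (the function name and signature are illustrative, not the real visitor API):

```python
def array_decl(base_type: str, name: str, extents) -> str:
    """Emit a C-style array declaration.

    e.g. ('int', 'f', [3, 3]) -> 'int f[3][3];'
    """
    # Extents go after the name in C/C++, not after the type.
    dims = "".join(f"[{n}]" for n in extents)
    return f"{base_type} {name}{dims};"
```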
Invoke arguments don't resolve, as there is no appropriate "Name" when looked up. Need to consider how to resolve this. One option is to add a "functionReference" created specifically by invokes. The alternative might be to make some fake Reference to a missing symbol, but I'm not sure if that would have unintended side-effects or not...
Ok, Invokes were always intended to take literals. This is now fixed and works as originally designed.
Next up is implementing the main Cabana PIR backend object; once that's done we can worry about testing the PIR visitor with structures and testing the Cabana PIR backend.
One thing I'd like to be able to support, but can't currently work out how to do, is C++-style calls from namespaces, e.g. `Kokkos::parallel_for` or things like that. I don't think there is any immediate need to; the only thing that's a bit weird right now is `auto` as a type, but we just pretend it's a structure type and it's probably fine. I think otherwise the interpolate and gather calls required in FDTD can actually be implemented in Python code and just generated as C++, which would be much nicer.
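One possible workaround (purely a sketch, not the project's actual API) would be to treat a qualified name like `Kokkos::parallel_for` as an opaque call symbol and emit it verbatim, since `::` never needs to be parsed, only printed:

```python
def call_to_cpp(call_name: str, args) -> str:
    """Emit a C++ call, passing namespaced names through verbatim.

    'Kokkos::parallel_for' is not a valid Python identifier, but as an
    opaque string it can still be printed untouched into the C++ output.
    """
    return f"{call_name}({', '.join(args)});"
```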
As part of supporting MPI, we need to add a parallel HDF5 I/O module - for now we will do a spatial breakdown of the domain I suppose, and go from there.
One thing I need to work out is how to get field inputs from an HDF5 file to match the I/O used from the miniapp, but I think that can be code that belongs to the "grid coupler".
I would also like the grid coupler to use Python code that generates the real function, but it's not so obvious how to go about doing that right now.
Implemented enough of the backend to consider testing it out now.
Two immediate challenges:
- The `part` and `config` types, which are managed by the DSL frontend (instead of manually making the `Particle()` and `Config()` objects as previously implemented for non-PIR backends).
- The `config` type in particular: when passing the `config` type into the DSL, it just doesn't know what it is. It should be added/sorted similarly to the `part` type.

A basic implementation of those is now in the DSL main file. However, it's currently not finished, as not all the base components are included (need to check what some of them are by default, if they are defaults and missing).
The main function is now output hurray!
Unfortunately:
- [x] No kernel is output.
- [ ] Seemingly the module functions are not correctly called, as the output code contains `initialise();` and `cleanup();`, which are DSL calls, not language calls.
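Presumably the fix involves routing DSL-level call names through a backend mapping before codegen, so DSL calls are lowered to their language-level equivalents. A hedged sketch (the mapping and the backend names in it are assumptions for illustration):

```python
# Hypothetical mapping from DSL call names to backend (C++) call names.
# The real backend names are assumptions here.
DSL_TO_BACKEND = {
    "initialise": "cabana_initialise",
    "cleanup": "cabana_cleanup",
}

def lower_call(name: str) -> str:
    """Lower a DSL call name to its backend name; ordinary
    language-level calls fall through unchanged."""
    return DSL_TO_BACKEND.get(name, name)
```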
Ok, so it works for a simple example now...there's more to do for MPI/coupled systems probably, but those are next on the agenda.
Structures I've still not checked support for - this is TBD.
The codegen for members of `core_part` and `neighbour_part` is currently bad and fixed in place; this should be improved.
Remaining tests:
cabana_pir:
pir_to_cabana_visitor:
MPI additions have been started. The `gen_mpi_headers` section is mostly completed, but I don't immediately have a good way to port the optimisation from 1D to general dimensions right now.
I think for the general 3D case there is no good way to do this with Cabana at the moment, since we might move far from one boundary while still being too close to others to make a reasonable rule.
You could potentially check that all 3 dimensions are too far from a boundary, but I'm not sure how that then works for lower dimensionalities. I will make a corresponding issue to deal with this/move to Cabana::migrate.
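That per-dimension boundary check might be sketched like this (all names here, including `cutoff` for the communication distance, are assumptions, not the project's API):

```python
def needs_migration(pos, box_min, box_max, cutoff):
    """Return True if the particle is within `cutoff` of any boundary
    in any dimension, so it may need MPI communication.

    Works for any dimensionality: pos, box_min and box_max are
    equal-length sequences.
    """
    for x, lo, hi in zip(pos, box_min, box_max):
        if x - lo < cutoff or hi - x < cutoff:
            return True
    return False
```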
Ok, struggling to keep track of what needs implementing and what is implemented for MPI.
- `get`/`set_mpi` status in the DSL.
- `Migrator` code to do the MPI communication.
- Create the `Migrator` object, initialise it, and do the initial domain decomposition in `initialise` or as part of I/O.

I sat down a bit and tried to write down in a bit more detail what needs to be in the Cabana_PIR backend for MPI. The optimisation for MPI relies on sorting the particles into bins, which I don't think for now will be part of the default approach to particles. I will make a separate issue to add binning/sorting of particles.
Code cleanup according to pylint is done for the `Particle_IR` nodes now.
It looks like Cabana's Cabana::LinkedCellList
has a set of functions that should enable looping only over the domain edge, which means we can improve the optimisation used in the migrator, which would enable it to scale from 1D only to 3D without so much complexity.
I implemented the any-dimensional edge-walk strategy for the migrator, which means the optimisation is doable in any dimension, and the performance is pretty similar, which I'm happy with.
I'm currently trying to decide how to get the box size to be read in, and I think the answer has to simply be that it has to be provided by the IO system - which to be fair is how the miniapp now works anyway.
Ok, so the PHDF5 module (which is the only one I will support for the Cabana_PIR backend) now implements 3 functions in its `gen_code`.
The first is `get_box_size`, which just gets the box size for the box structure. The other two are the input and output options; the input requires loading of positions to function.
It currently inherits from `Cabana_IO_Mixin`, but I think `Cabana_PIR` needs its own, as it is dramatically different in structure from the default Cabana one.
Next job is to create the MPI-parallelised grid solver code. This needs to:
1) Do the grid solving (duh).
2) Have a preferred domain decomposition.
3) Do file IO, in this case through Parallel HDF5.
A lot of the C++ code for the parallel coupled system is now implemented. I still need to do the IO (including reading in, instead of static initialisation) with MPI for HDF5, but I think that's fairly sensible, with some complications to do with limiting the input and offsets etc.
Then we just need to make the Python module use that C++ code correctly, and we should be good to go.
Again this is currently 1D only, but I hope it won't be too difficult to add dimensionality later if the problems require it, though a physicist's help would be useful for that.
Ok, so I think the HDF5 code to initialise the fields is now written. I had to store yet more info in the field object that will go inside the config; however, I expect this shouldn't cause too many issues.
The one slight complication is that the MPI partitioning information must be got from `host_config.field.XXXX` somehow, so I will need to be careful with that.
Whatever corrections were done to the particle interpolation need also to be ported to the DSL next.
All is done clearly except for the preferred domain decomposition stuff, which needs to be in base coupled system and implemented if required by a coupled system.
The porting of those particle interpolation corrections to the DSL is also not done yet.
The consideration I need to make is where in the PIR the decomposition actually happens, so the order of operations needs to be something like:
Ok - I think initialise now does all these things in the expected order. The one thing not handled yet: if the coupled system doesn't do a decomposition, it just throws an error for now.
For now I'm assuming that coupled systems that don't have a preferred decomposition will just obtain it from the boundary structure.
Corrections TBD.
Also to be done: check that when particles are moved, MPI communication happens if needed. We need a rank computation call somewhere (which currently isn't done); it can potentially live inside the MPI communication code, but better if not.
Testing running a simple example now, and a current fault is that the boundary condition is added and converted to PIR before things are added to the config member, which is messing things up.
This might happen with a lot of things right now, so I need to think about altering/delaying certain checks to happen later, or converting to PIR as late as possible.
I'm not sure which approach I prefer: having the checks during AST->PIR is nice, but if registering kernels causes issues it's not great; equally, having all errors during codegen can result in doing some work before the errors appear.
I'll think.
We might need an opaque type in the PIR to support things like `field_type` that is created from a `using` statement (or even to support a type like `Kokkos::View<double*, MemorySpace>` or similar).
Right now the boundary condition is trying to access "boxdims" so need to fix that.
Ok, so I fixed the issue with the boundary access by adding that into the base config type.
The current challenge is that the coupled systems aren't adding variables now (because they can't at that stage - they're not called), but the AST does the check at PIR construction time. I think the only plan I can do here is to disable checks inside `Call` construction for now. The calls often have weird stuff anyway - but maybe they need to be strings now anyway? I'm not sure.
The coupled systems not adding variables is a major problem. I need to think about how to do this as its not as simple as the old one (which checked variable accesses much later).
Perhaps the key thing is to reorganise symbol lookups to occur at codegen time instead of IR generation time.
Edit: I guess the problem with that approach is what to do with a symbol that doesn't yet exist when generating IR.
I think probably the best option is that backends that support PIR have a "get_symbols" call, which takes a list of strings of function call names, and returns a list of symbols to be added to the current function.
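That `get_symbols` idea might be sketched as follows (all class, method and symbol names here are assumptions for illustration, not the project's real API):

```python
class Symbol:
    """Stand-in for a PIR symbol; the real class will carry a type etc."""
    def __init__(self, name):
        self.name = name

class CabanaPIRBackend:
    # Hypothetical: the call names this backend can provide symbols for.
    _known = {"initialise", "cleanup", "get_box_size"}

    def get_symbols(self, call_names):
        """Given the function call names found in a function body, return
        the symbols the backend provides, to be added to the current
        function's symbol table at codegen time."""
        return [Symbol(n) for n in call_names if n in self._known]
```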
Ok, I think that's implemented correctly. Next up is global variable searching, which I've skipped implementing so far.
The global variable search works, the remaining thing that is not found is access to declared structures inside Kernels.
This is a bit more complex, since they can't just be added to the symbol table (as they belong to the functor and are thus visible anyway).
The symbol table has a `get_symbols` function which returns a new dict of all of the symbols in the symbol table. Can we have a `structure_symbols` member as well, which contains all of the structures the backend makes available somehow? These could be in the symbol table but not returned by `get_symbols`, and thus not generated by the backend codegen inside the kernel itself.
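That idea might look something like this (a sketch with assumed names, not the real symbol table API): lookups see everything, but `get_symbols` skips the structure symbols so the backend never re-declares them inside kernels.

```python
class SymbolTable:
    """Sketch: structure symbols are visible to lookup() but excluded
    from get_symbols(), so kernel codegen never re-declares them."""
    def __init__(self):
        self._symbols = {}
        self.structure_symbols = set()

    def add(self, name, symbol, is_structure=False):
        self._symbols[name] = symbol
        if is_structure:
            self.structure_symbols.add(name)

    def lookup(self, name):
        return self._symbols[name]

    def get_symbols(self):
        # New dict, as described above, minus the structure symbols.
        return {n: s for n, s in self._symbols.items()
                if n not in self.structure_symbols}
```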
Codegen finally happened successfully.
I will note that there are some very clear issues that I can tell without even looking at the actual code:
- Things are referenced as `_name`, but the symbols created for them are just `name`, so there needs to be a decision on how to handle this (I think just use `name` and drop the `_` for them).
- `call_interpolate_to_particles` is definitely not being called correctly.
- `gather_forcers_to_grid` is also not being called correctly.
- `initialise` is not being called correctly, so whatever functionality is meant to deal with function calls is broken.

The periodic boundaries are correctly being done after the particle push call, though. More comments in the generated code to explain what it's doing and why would probably be good!
Documentation, testing etc. is all missing still, but first I need to get this codegen done as expected.
Not sure what's happening to `call_interpolate_to_particles` at the moment: it does look for it in a coupled system, but the output is clearly breaking, and not with an attribute error. In that case I was expecting the program to break, but instead it is soldiering on, so I need to check the call in the PIR visitor I guess.
Code is looking a lot better - there are still some indentation issues that occur with Call nodes from PIR when a call from the parent returns a string.
I think accessing the config now also needs to be its own special node type, as accessing the config is not generating correct code at the moment.
A few other clear errors in codegen at the moment:
- The `load_grid` call adds extra `"` around the filename.
- `store_domain_decomposition` is accessing the config wrongly for the Cabana_PIR backend.
- `get_box_size` is accessing the config wrongly for the Cabana_PIR backend.
- `write_output` should do the coupled system file output as well, not rely on the user to do this, I think.

A config reference class is implemented but not tested or anything yet. It also fixes the issue of how the config should be accessed in kernels (i.e. with a `_` in front of it). The rest of the previous comment is NYI, though this may fix some of it.
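The config-reference idea might look like this (a sketch; the node and visitor names, and the `_config` spelling, are assumptions for illustration):

```python
class ConfigReference:
    """Hypothetical PIR node: a reference to a member of the config."""
    def __init__(self, member: str):
        self.member = member

def visit_config_reference(node: ConfigReference) -> str:
    # Inside kernels the config is accessed with a leading underscore,
    # hence the '_' prefix in the generated access.
    return f"_config.{node.member}"
```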
Ok, testing the MPI implementation a bit. One issue right now is that array members don't work: e.g. `part_p` is a `double[3]` in the PIC example, and that fails at around line 594 because `double[3]` is not a supported type. I don't think in principle anything stops me doing arrays (and maybe they're already done for positions), so I need to think about how to do this.
Some of the code to generate arrays seems to be present, but not the bit that analyses the datatype to check. I think what needs to happen is to simply ignore anything in a datatype that occurs after a `[`, and hopefully it should work?
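The "ignore everything after the first `[`" idea can be sketched as a tiny parser that splits a type string like `double[3]` into its base type and extents (a hedged sketch, not the real datatype code):

```python
def split_array_type(type_str: str):
    """Split a datatype string into (base_type, extents).

    'double[3]'    -> ('double', [3])
    'double[3][2]' -> ('double', [3, 2])
    'double'       -> ('double', [])
    """
    base, sep, rest = type_str.partition("[")
    if not sep:
        # Plain scalar type, no array part.
        return base, []
    extents = [int(d.rstrip("]")) for d in rest.split("[")]
    return base, extents
```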
Currently no creation of a Particle IR tree but the node implementations are done so far.