Closed LonelyCat124 closed 1 year ago
I think the required implementation is now in the code.
The remaining things I need are:
pi()
or something else).source_count
from inside the function somehow.
For random numbers we need something in `get_extra_symbols` to add a new structure if `random_number` is found, and to return `generator`, which needs to be of type `auto` and set to an initial value somewhere/somehow. The setting up and removal of the generator is more difficult.
Created a new `AutoSymbol` for creating `auto`-style C++ types. This I can maybe use for the generator.
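A rough sketch of the pattern being described: the generated C++ declares the generator with `auto`, draws values from it, and releases it before the kernel body ends. The `RandomPool` class and `draw_uniform` function are hypothetical stand-ins built on `<random>` so the example compiles on its own. The `get_state`/`free_state` names mirror the Kokkos random pool interface, but the code this PR actually generates may look different.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for a Kokkos-style random pool (assumption: the real
// generated code would use a Kokkos pool rather than std::mt19937_64).
class RandomPool {
public:
    explicit RandomPool(std::uint64_t seed) : seed_(seed) {}

    // Hands out a generator; each call is seeded differently.
    std::mt19937_64 get_state() { return std::mt19937_64(seed_ + next_++); }

    // Nothing to release in this stand-in; a real pool would reclaim the state.
    void free_state(const std::mt19937_64&) {}

private:
    std::uint64_t seed_;
    std::uint64_t next_ = 0;
};

// What a kernel body using random numbers might look like after codegen:
// acquire the generator, draw values, then release it before returning.
std::vector<double> draw_uniform(RandomPool& pool, int n) {
    assert(n >= 0);
    auto generator = pool.get_state();  // type is deduced, as with AutoSymbol
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    std::vector<double> values;
    values.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        values.push_back(dist(generator));
    }
    pool.free_state(generator);
    return values;
}
```

Declaring the generator with `auto` means the code generator does not need to spell out the concrete generator type, which is the role `AutoSymbol` appears to play here.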
Support for random number is "done". Testing etc. still required.
`add_include` already exists and works for 2) anyway.
Ok, a basic implementation now works.
I think ideally we don't want to do MPI communication for deletion flags, but that is not yet implemented.
Patch coverage: 100.00% and no project coverage change.
Comparison is base (ea7012d) 100.00% compared to head (870c520) 100.00%.
For now, source boundaries are done only by rank 0 if MPI is enabled.
I created a very naive test for this. Current issues:
`core_part_slice` is being generated instead of `core_part_velocity_slice`, I think. The rest all seem related to that issue.
I'm going to add another "particle_core_reference" node to Particle_IR to handle elements which are part of the core particle structure, but are not the position. The way these are described in the code differs from how general particle accesses work, so they're not currently handled correctly by codegen.
Ideally need a way to write to a variables/config member from a kernel. This is a bit tricky, I think it again needs a special node and a bit of analysis. It might be we need to create a manual kernel for now for the UKAEA use case.
Alternatively an "add writeable array" is doable, and write to it with atomic add maybe works.
Added a writable arrays option. Still need to add atomic add functionality, perhaps, or some other way to handle it.
The remaining problem with that code for compilation (before worrying about atomic or reduction accesses, which may have to be manual for now, but we'll see) is that the array is now printed in the symbol table for some reason; I need to check why and remove it. Also, accesses to these arrays should use the `()` operator and not the `[]` operator.
Ok, it now uses the () operator for a view instead. Some of this code is not super nice but it works for now.
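To illustrate the two points above (element access through `()` and writing with an atomic add), here is a minimal sketch. `WritableArray` and `count_negative` are hypothetical names for this example only; the class is a stand-in that imitates view-style `operator()` access using `std::atomic`, and is not the code the backend generates.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 1D view: elements are reached through
// operator(), not operator[], as with Kokkos views.
class WritableArray {
public:
    explicit WritableArray(std::size_t n) : data_(n) {
        for (auto& slot : data_) slot.store(0);
    }

    std::atomic<int>& operator()(std::size_t i) {
        assert(i < data_.size());
        return data_[i];
    }

private:
    std::vector<std::atomic<int>> data_;
};

// Counts particles left of the origin by atomically adding into slot 0.
// The loop here is serial; in a parallel kernel the atomic add is what would
// keep concurrent updates to the same slot from being lost.
int count_negative(const std::vector<double>& positions) {
    WritableArray counts(1);
    for (double x : positions) {
        if (x < 0.0) {
            counts(0).fetch_add(1);
        }
    }
    return counts(0).load();
}
```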
The one remaining issue I see is that we need to specify where the lambdas that update the extra arrays are executed (they should execute on the host and not the device). TBC how to do this.
Probably need a global particle count accessible through the config?
Setting of nparts is wrong.
Can't reuse the generator so we need to do something better than that in codegen when using random numbers
Missing a fence after sink boundary call, and also after source boundary call. Sink boundary not flagging for deletion seemingly.
Small bug in the code that generated the deletion section, should check for == 0 instead of <= 0 I think. Also probably inserting 100k is enough to test for now as it reaches a steady state at around 1M parts - maybe 200k is better but not 1M.
Still not working with GPU though.
Velocities are still 0...
Not to do with randomness; even if I set the values to a constant it's not working correctly.
Kernels just don't work for GPU this time?!
Shit the GPU kernels just don't work :(
Edit: The old manual implementation kernels DO do some updates, so I'm a bit confused here; need to check what's going on.
Super stupid, super easy fix. Don't forget the `KOKKOS_INLINE_FUNCTION` before operators...else it breaks with CUDA :)
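For reference, a minimal sketch of what the fix looks like on a kernel functor. `PushKernel` and `apply` are hypothetical names for this example. The fallback `#define` exists only so the snippet compiles without Kokkos; in a Kokkos CUDA build the real macro is what makes the operator callable from device code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fallback so the sketch compiles without Kokkos (assumption for this example
// only). With Kokkos and CUDA the real macro marks the function as callable
// on both host and device.
#ifndef KOKKOS_INLINE_FUNCTION
#define KOKKOS_INLINE_FUNCTION inline
#endif

// Hypothetical kernel functor: advances a particle position by velocity * dt.
struct PushKernel {
    double dt;

    KOKKOS_INLINE_FUNCTION
    void operator()(double& position, double velocity) const {
        position += velocity * dt;
    }
};

// Host-side loop standing in for a parallel dispatch over all particles.
void apply(const PushKernel& kernel, std::vector<double>& pos,
           const std::vector<double>& vel) {
    assert(pos.size() == vel.size());
    for (std::size_t i = 0; i < pos.size(); ++i) {
        kernel(pos[i], vel[i]);
    }
}
```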
Still an issue in the magnetic field not loading for the PIC case. Not deep copying the data into the fields...oops.
That's now fixed, but still no output values? Other than Bz.
Ok, this is pretty much ready to be merged in its current state, I think. Tested fully, and runs on machines. Performance of the sink is bad, but that's hard to resolve at the moment.
This PR contains the draft implementation for source and sink boundaries, as needed for the miniapp proposed by UKAEA.
So far this contains just a kernel definition, and the PIR nodes to represent them in the tree. Next is to add the tree-to-PIR conversion, and then add support for them in the Cabana_PIR backend.
Miniapps showing example implementations with Cabana are done, so hopefully it shouldn't be too complex to do this.