ascot4fusion / ascot5

ASCOT5 is a high-performance orbit-following code for fusion plasma physics and engineering
https://ascot4fusion.github.io/ascot5/
GNU Lesser General Public License v3.0

Major cleanup of a5py (and some plans for ASCOT5's future) #113

Open miekkasarki opened 4 months ago

miekkasarki commented 4 months ago

I already cleaned up the a5py package once for the 5.4 release, but another round is needed. Fear not though, as this time the interface is not going to change much. Some functions might become deprecated, but I will try to put a deprecation warning on those and only remove them after some time has passed.
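A minimal sketch of how such a deprecation warning could be attached to an old function, using only the standard library. The names `deprecated`, `read_input`, and `get_input` are hypothetical and are not part of a5py.

```python
import functools
import warnings


def deprecated(replacement: str):
    """Mark a function as deprecated and point callers to `replacement`."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated; use {replacement} instead.",
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller's line
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


def read_input(name: str) -> str:
    # Hypothetical new function that replaces the old one.
    return f"input:{name}"


@deprecated("read_input")
def get_input(name: str) -> str:
    # Hypothetical old name, kept working during the deprecation period.
    return read_input(name)
```

The old name keeps returning the same result, so existing scripts continue to run while users see where to migrate.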

The major improvement in the first round was combining the HDF5 file interface with the interface to libascot. After the IMASification of ASCOT5, it has now become possible to run complete ASCOT5 simulations using just the a5py package, and even storing data in an HDF5 file is optional. Python is much more convenient for running simulations or accessing HDF5, so in the future this will be the only way that ASCOT5 is run (i.e. at some point ascot5_main will be removed and ASCOT5 will be used in a similar way as numpy, for example).

Now for the second round we need to further separate HDF5 operations from the rest of the code while tying in the C interface more closely. The aim is to abstract the IO interface so that it does not matter whether the data is accessed via the C interface, IDS, or HDF5. Also, the code has again experienced organic growth, and improving the class structure is necessary to keep it maintainable.
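A rough sketch of what such an IO abstraction could look like. This is an assumption about the general pattern, not the actual a5py design: `Backend`, `MemoryBackend`, `JsonFileBackend`, and `mean_value` are hypothetical names, and a JSON file stands in for HDF5 so that the example needs only the standard library.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class Backend(ABC):
    """Hypothetical storage backend; callers only ever see this interface."""

    @abstractmethod
    def read(self, name: str) -> list[float]:
        """Return the values stored under `name`."""

    @abstractmethod
    def write(self, name: str, values: list[float]) -> None:
        """Store `values` under `name`."""


class MemoryBackend(Backend):
    """Stand-in for data that lives in memory (e.g. behind a C interface)."""

    def __init__(self) -> None:
        self._store: dict[str, list[float]] = {}

    def read(self, name: str) -> list[float]:
        return list(self._store[name])

    def write(self, name: str, values: list[float]) -> None:
        self._store[name] = list(values)


class JsonFileBackend(Backend):
    """Stand-in for a file backend (JSON here instead of HDF5)."""

    def __init__(self, path: Path) -> None:
        self._path = path

    def _load(self) -> dict[str, list[float]]:
        if not self._path.exists():
            return {}
        return json.loads(self._path.read_text())

    def read(self, name: str) -> list[float]:
        return self._load()[name]

    def write(self, name: str, values: list[float]) -> None:
        data = self._load()
        data[name] = list(values)
        self._path.write_text(json.dumps(data))


def mean_value(backend: Backend, name: str) -> float:
    """Example consumer that does not care where the data is stored."""
    values = backend.read(name)
    return sum(values) / len(values)
```

Code such as `mean_value` works unchanged with either backend, which is the property the cleanup is aiming for.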

The objective of the cleanup is to achieve the following structure for the base classes:

Ascot

AscotIO

DataContainer

ResultNode

For this we need several mixin classes, as several methods are shared between Ascot and the classes that inherit from ResultNode. Furthermore, ascotpy requires a lot of cleaning, as many of its data access routines should be moved to ascot5io. Templates are a bit of an oddball for which I haven't yet figured out a nice implementation. The test classes also require a rewrite, and test coverage should be measured with the coverage package. The overall code should be analyzed with a linter.
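To illustrate the mixin idea, here is a minimal sketch in which one shared method is provided to two otherwise unrelated classes. `SummaryMixin`, `AscotLike`, `ResultNodeLike`, and `RunNode` are hypothetical stand-ins and do not reflect the real class contents.

```python
class SummaryMixin:
    """Hypothetical mixin; the host class must provide `_records()`."""

    def count(self) -> int:
        return len(self._records())

    def describe(self) -> str:
        return f"{type(self).__name__} with {self.count()} record(s)"


class AscotLike(SummaryMixin):
    """Simplified stand-in for the top-level Ascot object."""

    def __init__(self, records: list[str]) -> None:
        self._data = list(records)

    def _records(self) -> list[str]:
        return self._data


class ResultNodeLike:
    """Simplified stand-in for the ResultNode base class."""

    def __init__(self, records: list[str]) -> None:
        self._results = list(records)

    def _records(self) -> list[str]:
        return self._results


class RunNode(SummaryMixin, ResultNodeLike):
    """Hypothetical result class that reuses the shared methods."""
```

Both `AscotLike` and `RunNode` gain `count()` and `describe()` without duplicating the implementation or sharing a common base class.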

miekkasarki commented 3 months ago

Update on this one. The data access has now been completely abstracted from how the data is stored. The DataContainer objects are now called "variants" (because we have different categories of input and each category has different variants of input), and they define how the data is stored in HDF5 and in the offload array / C struct. The abstraction is done using Python properties.
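A small sketch of how a property can hide the storage location from the caller. It assumes a simplified model in which a plain dict stands in for the HDF5 file and an `array.array` stands in for the offload array; `B2DSLike` is a hypothetical class, not the real variant.

```python
from array import array
from typing import Optional


class B2DSLike:
    """Hypothetical variant whose data lives either "on disk" or in memory."""

    def __init__(self, psi: list[float], disk: Optional[dict] = None) -> None:
        if disk is not None:
            # "Stored in file": the dict plays the role of the HDF5 file.
            disk["psi"] = list(psi)
            self._disk: Optional[dict] = disk
            self._offload: Optional[array] = None
        else:
            # "Not stored": the array plays the role of the offload array.
            self._disk = None
            self._offload = array("d", psi)

    @property
    def psi(self) -> list[float]:
        """Return a copy of psi regardless of where it is stored."""
        if self._disk is not None:
            return list(self._disk["psi"])
        return self._offload.tolist()
```

The caller writes `obj.psi` in both cases and always receives a copy, so modifying the returned list does not change the stored data.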

Also, I finally found a way to generate inputs that I'm happy with. Instead of a single create_input method, the Ascot5IO object has several create_<input variant> methods. This way the user can easily access the documentation of a given input without scrolling through the online docs.
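One possible way to obtain per-variant create methods that carry their own documentation is to generate them from the variant classes. The sketch below is an assumption about the pattern, not the actual implementation: `Variant`, `DataLike`, and `_make_creator` are hypothetical, and `B2DS` and `E_TC` are reduced to toy classes with made-up parameters.

```python
import copy


class Variant:
    """Hypothetical base class for input variants."""

    def export(self) -> dict:
        """Return the parameters needed to recreate this input."""
        return copy.deepcopy(vars(self))


class B2DS(Variant):
    def __init__(self, psi: list[float], axis_r: float) -> None:
        """Create a toy 2D magnetic field input.

        psi: poloidal flux values (illustrative parameter).
        axis_r: major radius of the magnetic axis (illustrative parameter).
        """
        self.psi = psi
        self.axis_r = axis_r


class E_TC(Variant):
    def __init__(self, exyz: list[float]) -> None:
        """Create a toy constant electric field input.

        exyz: Cartesian field components (illustrative parameter).
        """
        self.exyz = exyz


class DataLike:
    """Hypothetical stand-in for the object that owns the inputs."""

    def __init__(self) -> None:
        self.inputs: list[Variant] = []


def _make_creator(variant: type):
    def create(self, **parameters):
        obj = variant(**parameters)
        self.inputs.append(obj)
        return obj

    create.__name__ = f"create_{variant.__name__}"
    # Reuse the variant's own docstring so help() shows the parameters.
    create.__doc__ = variant.__init__.__doc__
    return create


for _variant in (B2DS, E_TC):
    setattr(DataLike, f"create_{_variant.__name__}", _make_creator(_variant))
```

With this in place, `help(DataLike.create_B2DS)` shows the parameters of that variant only, and `data.create_B2DS(**b2d.export())` clones an existing input.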

So what we have now is:

a5 = Ascot("ascot.h5")

b2d = a5.data.create_B2DS(**parameters, store_hdf5=True)
b2d.psi # Reads psi from the disk

b2d = a5.data.create_B2DS(**parameters, store_hdf5=False)
b2d.psi # Copies the data from the offload array which is allocated on the C side

b2d.export() # Returns a dictionary that is equivalent to "parameters" which can be
             # used to clone the input or modify it

miekkasarki commented 3 months ago

Along the way I found it useful to further modernize the code base. From now on the unit tests will be implemented with pytest and we enforce type hints. We will also need to add mypy and pylint runs to the workflows.
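As an illustration of the style this implies, here is a small type-hinted function with pytest-style tests. The function `gyroradius` is a hypothetical example and is not taken from a5py; the tests are plain functions with bare asserts, which pytest collects by their `test_` prefix.

```python
import math


def gyroradius(mass: float, charge: float, vperp: float, bnorm: float) -> float:
    """Return the Larmor radius m * v_perp / (|q| * B) in SI units."""
    if bnorm <= 0.0:
        raise ValueError("bnorm must be positive")
    return mass * vperp / (abs(charge) * bnorm)


def test_gyroradius_scales_inversely_with_field() -> None:
    # Doubling the field strength should halve the radius.
    r1 = gyroradius(mass=1.0, charge=1.0, vperp=2.0, bnorm=1.0)
    r2 = gyroradius(mass=1.0, charge=1.0, vperp=2.0, bnorm=2.0)
    assert math.isclose(r1, 2.0 * r2)


def test_gyroradius_ignores_sign_of_charge() -> None:
    # Only the magnitude of the charge enters the radius.
    r_pos = gyroradius(mass=1.0, charge=1.0, vperp=1.0, bnorm=1.0)
    r_neg = gyroradius(mass=1.0, charge=-1.0, vperp=1.0, bnorm=1.0)
    assert math.isclose(r_pos, r_neg)
```

The annotations let mypy check callers statically, while pytest runs the two test functions without any test class boilerplate.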