Integrate more python tools

zhucaoxiang commented 4 years ago

I have made some changes to some python tools. Personally, I would like to integrate all functions into one class, such that it is convenient to use and portable.

In spec.py from the pythontools branch, I have implemented/moved some plotting functions for SPEC and put them into the SPEC class （Each function is documented).

Here is a way to use it.

from spec import SPEC
# you can also do something like '%run spec.py

test = SPEC('spec_output.sp.h5')
# plot pressure profile
plt.figure()
test.plot_pressure(normalize=False)
# plot KAM surfaces (using FourSurf)
plt.figure()
test.plot_kam_surface()
# poincare plot
test.plot_poincare()

This is just to motivate some discussions on working together with more SPEC python functions.

jonathanschilling commented 4 years ago

@zhucaoxiang Nice work. During my time at PPPL, @KseniaAleynikova , @smiet and I put together the proc_spec.py and some stand-alone plotting routines. I guess this is where some of the code comes from? I see that having a single Python class for all SPEC-related code is nice, but I would not like to miss the functionality of the stand-alone programs like plot_SPEC_poincare etc. for quick-and-dirty inspections of the results from the command line. What is your take on this?

smiet commented 4 years ago

I think we should indeed have some stand-alone plotting routines, that can directly be run on the output for the quick inspection. The code of @zhucaoxiang currently uses different reading routines than the py_spec reading routine, but to prevent code duplication I think it is best to adapt it to use the same read in routine. I will look into adapting these routines to use py_spec. If py_spec needs to be adapted, I will add to it.

I have asked the Gekyll developers how they have implemented their python postprocessing. They have multiple classes: one class object which reads in and contains all the raw data from the hdf5 file. Then a second class which takes as input the raw data class, and has as in-built functions the things you want to do. That would look like this:

data = py_spec(datafile.h5)
plot_obj = proc_spec.plot(data)
fig, ax = plot_obj.plot_kam_surface()

Separating the data class from the class that manipulates this data and makes the plots helps keep the code clear and separate. Since plotting of cylindrical, cartesian and toroidal simulations is different, we might need to make three different classes; proc_spec.plot_cyl, proc_spec.plot_tor, ..., as otherwise the functions will need all sorts of branching logic to correctly manipulate the raw data into something plottable. Other classes that act on the raw data can this way also be added (like an eval class, that has functions that return variables at requested locations).

Thoughts?

zhucaoxiang commented 4 years ago

Personally, I would prefer the objective-oriented style, which is to put SPEC related functions into one class. In this sense, one doesn't have to specify the output for each process. Of course, there must be some functions that should be stand-alone, like compare_spec_output.py.

As to the different classes proc_spec.plot_cyl, proc_spec.plot_tor, etc., I don't think this is a good idea. It will be clumsy and confusing. Internally, the SPEC class should handle all the cases automatically and we should have a uniform API (might be flexible with different values of some argument).

Another point I prefer is to use keyword arguments for plotting, which will gain the convenience a lot. https://github.com/PrincetonUniversity/SPEC/blob/pythontools/Utilities/pythontools/py_spec/spec.py#L190-L233

smiet commented 4 years ago

@zhucaoxiang I completely agree with using the object-oriented style, and with using keyworded arguments as much as possible!

I think that putting all the functions into the base class can make it large and unwieldy, but I am not completely against it. We could make subclasses like in numpy, where you have numpy.random.{all random functions} and numpy.linalg.{all linalg functions}. We could have SPEC.plot.{}. The difference with this is that numpy itself doesn't contain any data and the functions in it are not acting on something stored in numpy, but always given arguments, whereas the plotting functions are acting on data in the class they are part of.

My proposal is to separate the functions from the data they are acting on, but still do it in an object-oriented way, giving the user an object of a plot_spec class. I agree that the `proc_spec.plot_cyl, proc_spec.plot_tor, etc.' is not ideal, and we should just have one type of class.

Also @zhucaoxiang, I see your spec.py uses a surface library containing the function FourSurf. could you add that to the repository?

PrincetonUniversity / SPEC

Integrate more python tools #93