Open janssenhenning opened 3 years ago
Actually, after working with bokeh plots a bit more, I would be in favor of implementing similar behaviour for the matplotlib plots: you define a DataFrame (e.g. with pandas) and give the keys you want to plot. With pandas it is already possible to plot by calling the plotting methods directly on the DataFrame, but I think just passing the data and then indexing the right keys would be enough, since I think that interface is slightly different, which might be confusing.
Of course, we could construct a DataFrame ourselves if none is given, and in this way support all kinds of ways of passing the data.
This would probably also massively simplify exporting the plot data to files.
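To make the idea concrete, here is a minimal sketch of what such an interface could look like. The function name `plot_from_keys` and its arguments are hypothetical and not part of masci-tools; the only assumption is that whatever is passed as `data` can be turned into a pandas DataFrame.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import pandas as pd


def plot_from_keys(data, x, y, ax=None, **plot_kwargs):
    """Plot the column(s) ``y`` against the column ``x`` (hypothetical helper)."""
    # Use a DataFrame as-is; otherwise build one (dict of lists/arrays, records, ...).
    df = data if isinstance(data, pd.DataFrame) else pd.DataFrame(data)

    y_keys = [y] if isinstance(y, str) else list(y)
    missing = [key for key in [x, *y_keys] if key not in df.columns]
    if missing:
        raise KeyError(f"keys not found in data: {missing}")

    if ax is None:
        _, ax = plt.subplots()
    for key in y_keys:
        ax.plot(df[x], df[key], label=key, **plot_kwargs)
    ax.set_xlabel(x)
    ax.legend()

    # Returning exactly the plotted columns makes exporting straightforward.
    return ax, df[[x, *y_keys]]


# Usage: a plain dict works just as well as a DataFrame.
ax, plotted = plot_from_keys(
    {"energy": [0.0, 1.0, 2.0], "dos_up": [1.0, 2.0, 3.0], "dos_dn": [3.0, 2.0, 1.0]},
    x="energy",
    y=["dos_up", "dos_dn"],
)
csv_text = plotted.to_csv(index=False)  # the plotted data, ready to write to a file
```

Because all inputs end up in one DataFrame, the length and key checks live in a single place instead of being repeated per plotting method.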
Hi Henning, I'm not familiar with the masci-tools.vis modules, so this is just food for thought, mainly about the thematic overlap between this issue and that issue (integration of the branch studentproject18w into the main code, as far as is sensible). Disclaimer: I wrote that code.
The goal for that project/branch was to provide an interactive bandstructure+DOS plotter from fleur HDF5 output files with two user frontends (Tkinter desktop program, a Jupyter dashboard), using the same base code.
The outcome was
(The frontends worked, the jupyter dashboard can still be tried out via the binder badge in the README.)
Now a little more detail on how it works.
The preprocessor takes a JSON recipe, e.g. FleurBands, which specifies the datasets to extract from the HDF file, the transformations to apply to each, and the desired output type. The output type specifies functions for postprocessing data manipulation, e.g. for plotting. The reader then reads the datasets from the HDF file, transforms them (dependencies between datasets for transformations are resolved automatically), creates an instance of the specified output type, and adds the transformed datasets as attributes of that instance. The attributes remain h5py datasets (i.e., file-storage access), but can optionally be 'moved to memory' (converted into numpy arrays).
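As a rough illustration of that flow (not the actual branch code), the sketch below uses a made-up recipe layout and a plain dict standing in for the HDF file. The recipe keys (`path`, `transforms`, `needs`), the transform name `shift_by_fermi` and the `BandData` class are all assumptions for the example; the real FleurBands recipe may look different.

```python
import json
from dataclasses import dataclass

# Hypothetical recipe: "eigenvalues" is listed first but depends on "fermi_energy".
RECIPE_JSON = """
{
  "output_type": "BandData",
  "datasets": {
    "eigenvalues": {"path": "/eigenvalues",
                    "transforms": ["shift_by_fermi"],
                    "needs": ["fermi_energy"]},
    "fermi_energy": {"path": "/general/fermi_energy"}
  }
}
"""


@dataclass
class BandData:
    """Stand-in output type; the transformed datasets become its attributes."""
    eigenvalues: list
    fermi_energy: float


# Each transform receives the current value and the already-resolved dependencies.
TRANSFORMS = {
    "shift_by_fermi": lambda values, deps: [v - deps["fermi_energy"] for v in values],
}
OUTPUT_TYPES = {"BandData": BandData}


def run_recipe(recipe, source):
    """Read, transform and package the datasets named in ``recipe``."""
    resolved = {}

    def resolve(name, stack=()):
        if name in resolved:
            return resolved[name]
        if name in stack:
            raise ValueError("circular dependency: " + " -> ".join((*stack, name)))
        spec = recipe["datasets"][name]
        # Resolve dependencies first, regardless of their order in the recipe.
        deps = {dep: resolve(dep, (*stack, name)) for dep in spec.get("needs", [])}
        value = source[spec["path"]]
        for transform in spec.get("transforms", []):
            value = TRANSFORMS[transform](value, deps)
        resolved[name] = value
        return value

    for name in recipe["datasets"]:
        resolve(name)
    return OUTPUT_TYPES[recipe["output_type"]](**resolved)


# A dict plays the role of the HDF file here.
fake_file = {"/eigenvalues": [-0.5, 0.0, 0.25], "/general/fermi_energy": 0.25}
band_data = run_recipe(json.loads(RECIPE_JSON), fake_file)
```

With a real file, `source` would be an open `h5py.File`, which supports the same `source[path]` lookup, and the values would stay file-backed until explicitly copied into numpy arrays.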
The Plotters (plotting classes) derive from an abstract class with an abstract data attribute, of which the preprocessor's output types are subclasses. For example, the AbstractBandPlot's data attribute is of type FleurBandData. The Plotters' actual plotting methods then do not take data as arguments, but only data selection arguments which operate on the underlying data attribute. This addresses, at least partially, your 'providing data' concern above.
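A stripped-down sketch of that pattern, with stand-in names: `BandData` plays the role of FleurBandData, and `TextBandPlot` is a toy backend invented for this example so that no plotting library is needed. The method names and signatures are assumptions, not the branch's actual API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class BandData:
    """Stand-in for the preprocessor's output type (e.g. FleurBandData)."""
    kpath: list     # one k-path coordinate per k-point
    energies: list  # energies[band][kpoint]


class AbstractBandPlot(ABC):
    """Holds the data once; plotting methods only take selection arguments."""

    def __init__(self, data: BandData):
        self.data = data

    def _select(self, bands, e_min, e_max):
        """Return {band: [(k, energy), ...]} restricted to the energy window."""
        return {
            band: [
                (k, e)
                for k, e in zip(self.data.kpath, self.data.energies[band])
                if e_min <= e <= e_max
            ]
            for band in bands
        }

    @abstractmethod
    def plot_bands(self, bands, e_min=float("-inf"), e_max=float("inf")):
        """Draw the selected bands with a concrete plotting library."""


class TextBandPlot(AbstractBandPlot):
    """Toy backend that 'plots' to lines of text."""

    def plot_bands(self, bands, e_min=float("-inf"), e_max=float("inf")):
        selection = self._select(bands, e_min, e_max)
        return [f"band {band}: {points}" for band, points in selection.items()]


data = BandData(kpath=[0.0, 0.5, 1.0], energies=[[-1.0, -0.5, -1.0], [1.0, 2.0, 1.0]])
plotter = TextBandPlot(data)
lines = plotter.plot_bands(bands=[1], e_max=1.5)  # no data passed, only a selection
```

A matplotlib or bokeh subclass would reuse `_select` and differ only in how it draws the result.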
(Side note: the branch did not have pandas DataFrames in mind.)
(Side note: the problem with this approach is, of course, that it relies on the whole pipeline, i.e. data comes only from HDF. But I think this can be relaxed.)
(Side note about the hierarchical Plotter classes concept: this can lead to a combinatorial explosion of classes, because you need to define a class for every use case and every plotting library. I don't know which Pythonic design pattern could solve this problem more efficiently.)
The current way of providing data for plots in the plot_methods is through arrays or lists. The dimension checking is quite fragile and also varies from method to method (both on the develop and plot_methods_refactor branch).
We should have: