OpenDrift / trajan

Trajectory analysis package for simulated and observed trajectories
https://opendrift.github.io/trajan/
GNU General Public License v2.0

Add fastplot plot function #131

Open jerabaul29 opened 3 weeks ago

jerabaul29 commented 3 weeks ago

Using trajan, I regularly find myself waiting a while for plots, and I miss a "fastplot" function that does simple, basic, but fast plotting without too many options to tune. The default plotting functions are very nice, but they can be slow, especially with large datasets or when land is in the view.

The idea with "fastplot" is to provide a rudimentary plot that is drawn very fast and provides few options - for advanced plotting, the other already existing functions should be used.
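To make the idea concrete, here is a minimal sketch of what such a function could look like, assuming lon and lat are available as 2D arrays with dimensions (trajectory, obs). The name fastplot_sketch and its arguments are hypothetical and not part of trajan. It skips the map entirely and hands all trajectories to matplotlib as a single LineCollection, which is usually cheaper than one plot call per trajectory:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection


def fastplot_sketch(lon, lat, ax=None, cmap="viridis"):
    """Draw all trajectories as one LineCollection on plain axes (no map).

    lon, lat: 2D arrays of shape (trajectory, obs); NaN marks missing points.
    Hypothetical helper for illustration only, not part of trajan.
    """
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    if ax is None:
        _, ax = plt.subplots()

    # One (obs, 2) vertex array per trajectory
    segments = np.stack([lon, lat], axis=-1)

    # One color per trajectory, sampled evenly from the colormap
    n_traj = segments.shape[0]
    colors = plt.get_cmap(cmap)(np.linspace(0.0, 1.0, n_traj))

    lines = LineCollection(segments, colors=colors, linewidths=0.8)
    ax.add_collection(lines)

    # Set the limits explicitly, ignoring NaN padding
    ax.set_xlim(np.nanmin(lon), np.nanmax(lon))
    ax.set_ylim(np.nanmin(lat), np.nanmax(lat))
    ax.set_xlabel("lon")
    ax.set_ylabel("lat")
    return lines


# Two short synthetic trajectories; the second one has a missing last point
lon = np.array([[0.0, 1.0, 2.0], [0.0, 0.5, np.nan]])
lat = np.array([[60.0, 60.5, 61.0], [59.0, 59.5, np.nan]])
lines = fastplot_sketch(lon, lat)
```

With a 2D dataset this could presumably be called as fastplot_sketch(ds.lon.values, ds.lat.values); ragged datasets would need to be expanded to 2D first.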

A couple of points:

I agree that it has quite some overlap with other existing functions, but I think that for large datasets, it is convenient to have a "quick and efficient" plot function to try in just one call in early data exploration. If you do not want to merge this e.g. because it is "redundant ad hoc functionality for a specific use case", maybe we should think about how the existing functions could get a flag or similar to optionally get this kind of behavior? :)

jerabaul29 commented 3 weeks ago

(test coming soon)

gauteh commented 3 weeks ago

We have the land='mask' option to plot, and you can also plot in cartesian coordinates if you remove the CRS. I'm positive to this, but I am not sure what the difference is from scatter with no map?

gauteh commented 3 weeks ago

You can also plot without land

jerabaul29 commented 3 weeks ago

Maybe I just missed some of the options to turn stuff off and get scatter to go faster ^^ .

One other thing with scatter is that I was getting grey only plots I think, while it is a bit easier to visualize with different colors maybe? I can have an extra look at the different options in the days to come and summarize it here :) .

gauteh commented 3 weeks ago

Scatter is not so well tested, so it has less functionality than lines. But the idea is to not use different colors when there are many trajectories (> 100). Maybe we can improve scatter a bit, or split things up so that you get this functionality.

jerabaul29 commented 3 weeks ago

I have had a look at changing the scatter method directly by adding a color_by_trajectory_rank arg defaulting to False and some changes into the method body that look like:

        if color_by_trajectory_rank:
            n_traj = self.ds.sizes['trajectory']
            colormap_mapper = ColormapMapper(plt.get_cmap("viridis"), 0, n_traj)
            # One RGBA row per trajectory, shape (n_traj, 4)
            colors = np.transpose(np.vectorize(colormap_mapper.get_rgb)(np.arange(0, n_traj, 1.0)))
            # Repeat each row once per obs; axis=0 keeps the (N, 4) shape
            colors = np.repeat(colors, len(self.ds.obs), axis=0)
            kwargs['c'] = colors
            # scatter does not accept both 'c' and 'color'
            kwargs.pop('color', None)

where


import matplotlib as mpl
from matplotlib import cm


class ColormapMapper:
    """A mapper from values to RGBA colors using built-in colormaps
    and scaling these."""

    def __init__(self, cmap, vmin, vmax, warn_saturated=False):
        """cmap: the matplotlib colormap to use, vmin: the min value to be plotted,
        vmax: the max value to be plotted."""
        self.vmin = vmin
        self.vmax = vmax
        self.warn_saturated = warn_saturated
        norm = mpl.colors.Normalize(vmin=vmin, vmax=vmax)
        self.normalized_colormap = cm.ScalarMappable(norm=norm, cmap=cmap)

    def get_rgb(self, val):
        """Get the RGBA value associated with val given the normalized colormap
        settings."""
        if self.warn_saturated:
            if val < self.vmin:
                print("ColormapMapper warning: saturated low value")
            if val > self.vmax:
                print("ColormapMapper warning: saturated high value")

        return self.normalized_colormap.to_rgba(val)

If you want us to look further in this direction, the attempt (for now not fully working and partially broken) is available at:

https://github.com/jerabaul29/trajan/tree/feat/fastplot_on_scatter

However 1) this seems a bit slow compared to the fastplot method, 2) I hit some issues with different kinds of datasets and dimension sizes, for example when using ragged arrays (I am not sure that the auto-expansion is performed).
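Part of the slowness may come from np.vectorize, which calls get_rgb once per value in a Python loop. Matplotlib colormaps and Normalize objects both accept whole arrays, so the per-trajectory colors can be computed in one call. A minimal sketch, where the function name colors_by_trajectory_rank and its arguments are hypothetical, and where I normalize to n_traj - 1 (rather than n_traj as in the snippet above) so that the last trajectory gets the top of the colormap:

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import Normalize


def colors_by_trajectory_rank(n_traj, n_obs, cmap_name="viridis"):
    """Return an (n_traj * n_obs, 4) RGBA array with one color per trajectory.

    Hypothetical helper for illustration only, not part of trajan.
    """
    cmap = plt.get_cmap(cmap_name)
    # max(..., 1) avoids a zero-width range when there is a single trajectory
    norm = Normalize(vmin=0, vmax=max(n_traj - 1, 1))
    # Colormaps are callable on arrays: this gives shape (n_traj, 4)
    per_trajectory = cmap(norm(np.arange(n_traj)))
    # Repeat each row n_obs times so the rows line up with the points of a
    # (trajectory, obs) array flattened in C order
    return np.repeat(per_trajectory, n_obs, axis=0)


colors = colors_by_trajectory_rank(n_traj=3, n_obs=2)
```

The resulting array can be passed as c to scatter, provided the coordinates are flattened in the same (trajectory, obs) order. This does not address the ragged-array case.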

I wonder if it may be simplest to have a fastplot method as above that we can tune for speed without covering every corner case, rather than integrating complex logic into the other pre-existing plotting utilities?