kedro-org / kedro-plugins

First-party plugins maintained by the Kedro team.
Apache License 2.0
93 stars 89 forks source link

Should MatplotilbWriter multiple plot functionality be removed in favour of PartitionedDataSet? #529

Open antonymilne opened 2 years ago

antonymilne commented 2 years ago

MatplotlibWriter currently supports 3 different save modes:

There's a recently-added overwrite option associated with the latter two modes (https://github.com/kedro-org/kedro/issues/868). This also exists for PartitionedDataSet.

The current behaviour has some problems:

On the other hand, the ability to save multiple plots rather than define one dataset per plot is essential. I have used it myself many times and seen it used a lot.

So, my question is: should we replace the matplotlib save modes that do multiple plots with instead wrapping MatplotlibWriter in PartionedDataSet? Leaving aside how we do this technically for the moment, would this be a good change to make? i.e. will this be a user-friendly solution here? Will it allow everything we need to allow in terms of functionality?

My suspicion is that the only reason we don't already use PartionedDataSet for this is historical (MatplotlibWriter was added to contrib at the same time PartionedDataSet was added to core).

Tagging @Galileo-Galilei who I suspect will just have the answers here 😀

deepyaman commented 2 years ago

One other (likely unnecessary) discrepancy is that PartitionedDataSet doesn't currently support versioning--either of the overarching or underlying dataset. MatplotlibWriter does for the overarching dataset. Perhaps relevant discussions, although more focused on the underlying dataset: https://github.com/kedro-org/kedro/pull/521.

antonymilne commented 2 years ago

@deepyaman thanks, that is a very pertinent point given that experiment tracking is one of the main motivations here, and that directly relies on versioned datasets to work... So if we were to move to PartitionedDataSet for MatplotlibWriter then we should try and get kedro-org/kedro#521 done.