kedro-org / kedro-plugins

First-party plugins maintained by the Kedro team.
Apache License 2.0
87 stars 76 forks source link

kedro-datasets: Rename MatplotlibWriter to MatplotlibDataset #353

Open Galileo-Galilei opened 10 months ago

Galileo-Galilei commented 10 months ago

Description

MatplotlibWriter has been the only dataset with inconsistent naming for a while. There are a couple of more subtle issues with it still to be fixed (see https://github.com/kedro-org/kedro-plugins/issues/529), but I think we should use the dataset renaming (which is a breaking change whatever) as an opportunity to make it consistent with the rest of the codebase

Context

Consistency is better :)

Possible Implementation

Rename MatplotlibWriter to MatplotlibDataset.

Possible Alternatives

Don't do it ;)

This is likely a great ticket for Hacktoberfest if the project is eligible?

noklam commented 10 months ago

Nice idea, I asked in the team and start tagging issues.

astrojuanlu commented 10 months ago
  1. This affects Kedro-Viz, since they use MatplotlibWriter in various places. To retain backwards compatibility, they would probably have to do some try ... except with imports. cc @rashidakanchwala @tynandebold
  2. I think this goes to some of our findings in https://github.com/kedro-org/kedro/issues/1936: that we don't really intend to implement _load for certain datasets (including this one) because they're meant to be artifacts and not really full I/O components.

Regardless, for consistency I agree we should rename it 👍🏽

noklam commented 10 months ago

https://linen-slack.kedro.org/t/15875966/i-have-a-plot-saved-as-an-image-in-my-catalog-yml-and-the-fi#6d98204f-5d9f-497b-88df-80a23980dcfa

An user ask why load is not supported? Maybe this is the reason why it is called Writer but not Dataset. I don't use this dataset myself so I can't remember is it always like this.

astrojuanlu commented 10 months ago

Maybe we could have a MatplotlibFigureDataset (that saves and loads figures objects as pickles) (although maybe we shouldn't and we should tell users to use PickleDataset for this) and MatplotlibImageDataset (that only saves images and has no _load)

Galileo-Galilei commented 8 months ago

Actually it totally makes sense that some dataset don't have _save method (API Dataset didn't for a while, SQLQueryDataset still doesn't...). Can I start a PR for this or the impact is too high on kedro-viz?

astrojuanlu commented 8 months ago

Yeah I don't think it's a problem that some datasets don't have _save. What's problematic is the monkeypatching that kedro-viz does on some of them https://github.com/kedro-org/kedro-viz/issues/1352

About the impact on Kedro Viz, cc @rashidakanchwala @tynandebold

rashidakanchwala commented 8 months ago

Yes, it will. We will be looking at making sure Kedro-viz is not so coupled with Kedro-datasets. And we are going to prioritise this work in the second half of November most likely. Until then, we could pin kedro-viz to kedro-datasets to the version it would work with.

astrojuanlu commented 5 months ago

I don't think this is a good first issue, there's some uncertainty still https://github.com/kedro-org/kedro-plugins/issues/353#issuecomment-1740974418 removing the label, when we have more clarity we'll update this issue