kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0
9.47k stars 875 forks

View `dataset` public properties #3936

Open ElenaKhaustova opened 3 weeks ago

ElenaKhaustova commented 3 weeks ago

Description

Users have expressed the need to easily access datasets and inspect their public properties and methods.

We propose developing a method or functionality within the framework that allows users to retrieve datasets and view their public properties and methods in a structured and accessible manner.

This goes beyond redesigning the DataCatalog API, as it also requires modifications on the Dataset side.
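To make the request concrete, here is a rough sketch of what "viewing public properties and methods" could look like using only Python's standard `inspect` module. Nothing here is Kedro API: `ExampleDataset` is a hypothetical stand-in for a dataset class, and `describe_public_api` is a hypothetical helper name chosen for illustration. It assumes "public" means "name does not start with an underscore".

```python
import inspect


class ExampleDataset:
    """Hypothetical stand-in for a dataset; not a real Kedro class."""

    def __init__(self, filepath, load_args=None):
        self._filepath = filepath
        self._load_args = load_args or {}

    @property
    def filepath(self):
        return self._filepath

    def load(self):
        ...

    def save(self, data):
        ...


def describe_public_api(obj):
    """Hypothetical helper: collect public properties and methods of obj."""
    properties = {}
    methods = []

    # Inspect the class so that properties show up as `property` objects
    # instead of being evaluated to their values.
    for name, member in inspect.getmembers(type(obj)):
        if name.startswith("_"):
            continue
        if isinstance(member, property):
            properties[name] = getattr(obj, name)
        elif callable(member):
            methods.append(name)

    # Also include public attributes set directly on the instance.
    for name, value in vars(obj).items():
        if not name.startswith("_"):
            properties[name] = value

    return {"properties": properties, "methods": sorted(methods)}


dataset = ExampleDataset("data/01_raw/cars.csv")
print(describe_public_api(dataset))
```

A helper like this only works well if datasets expose their configuration through public attributes or properties rather than private fields, which is why changes on the Dataset side would be needed.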

Context

"I think once you've got the data set, it is useful to be able to look at any public property that's already there."

ElenaKhaustova commented 3 weeks ago

@astrojuanlu: "The most important philosophical question here is "opening up" the DataCatalog abstraction and make datasets first-class citizens, and not an implementation detail. This was mentioned as far back as 2022 https://github.com/kedro-org/kedro/issues/1778#issuecomment-1292054256"