Open delvillarjavier opened 1 year ago
This makes me wonder, do you want to also see the code for other datasets listed in catalog.yml
too?
Hello! I was thinking about that, I just want some idea of the data I would get or where is it, in the case there is no "File Path" attribute. :)
Update: Leaving an update here on progress, and potential next steps! As useful context for anyone re-visiting.
We decided to split this issue into two parts:
Reason for this is that we'd need some more time on finalising the design for those other 7 datasets - the SQL-dataset is still technically 'Code', and hence would fall in-line with a 'Show Code' / Code preview window. The other 7 in contrast may only have a URL (e.g. API) or source database, and our plan is to re-think where exactly they should go in the metadata window instead of adding it now and needing to refactor it later.
Current PR implements the 1st task fully in the meantime! https://github.com/kedro-org/kedro-viz/pull/1564
One question (maybe a bit too late, apologies): wouldn't it be easier to display the whole YAML definition of the dataset? So, instead of just the SQL query, the whole thing:
shuttle_id_dataset:
type: pandas.SQLQueryDataset
sql: "select shuttle, shuttle_id from spaceflights.shuttles;"
credentials: db_credentials
that way (1) this works for every dataset, not only SQL ones, and (2) possibly addresses the coupling @merelcht flagged in https://github.com/kedro-org/kedro-viz/pull/1564#issuecomment-1771260062
Update, had a discussion on feature:
Former approach of picking specific metadata's to display, like SQL, increase our dependency on kedro-framework; if the dataset name or any of its attributes changes, then our code would break as well.
There's two workarounds to this: 1) Display everything; we're already recieving a object that contains all of the metadata attributes - we'd barely increase coupling / dependency if we just display everything instead of picking and choosing subsets of it (changes to attribute names or dataset type wouldn't matter - we just care about displaying what we have present)
2) Add a _describe
method on the kedro-datasets end, which could be called by kedro-viz. This method would handle the metadata value display, and would be the most optimal solution (e.g. framework changes would cause that method to fail, letting them know to change it; if the logic lived in kedro-viz, framework could push breaking changes because they'd have little visibility on how it'd affect viz / how viz uses that knowledge).
Ultimate decision was to implement decision 2, since its the most optimal and would have to be done anyway in the future.
Current status of issue is pending / TODO, as we need confirmation when this implementation can be done / if it fits with the next kedro-framework sprint/s. Its PR will be closed for the time being.
Description
Currently Kedro-Viz only previews data from CSVs and Excel files, I would like to view the SQL code that produces that table, not the resulting table.
Context
I use Kedro-Viz very often to review projects status, it would be nice if I can see all the datasources and the SQL code that produces them in one place.
Possible Implementation
Add a toggle in SQL based datasets