Open gitgud5000 opened 3 weeks ago
Here, I assigned the 99_orphans layer to one of the previously orphaned datasets. Interestingly, this did not trigger the circular dependency error, and neither did assigning the 09_model_output layer; the error appears only when assigning the 08_model_input layer, which should be allowed. The image shows how all these leaf datasets are now grouped into the newly defined 99_orphans layer, confirming that these datasets are consistently moved to the bottom-most layer in the stack.
Thank you @gitgud5000 for raising this issue. We will look into it.
Hey, @gitgud5000, @rashidakanchwala and I are trying to reproduce the issue that you encountered, but we couldn't trigger this circular dependency error that you found. Would it be possible for you to share some more information about your project setup?
I've tried to reproduce it on the demo project in the kedro-viz repo, assigning some of the previously unassigned nodes to both the reporting and tracking layers, and the warning did not appear.
I will try to produce an example in a Kedro project and share it with you soon. @lrcouto
Description
In Kedro Viz, datasets defined in the catalog that are "leaf datasets"—meaning they aren't used as inputs to any other nodes—are forced to be placed in the last layer when no layer is explicitly assigned to them.
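To make the term concrete, here is a minimal sketch of how leaf datasets can be identified from a pipeline's node definitions. The `nodes` mapping, the dataset names, and the `find_leaf_datasets` helper are all hypothetical and use plain Python structures rather than the Kedro API.

```python
# Hypothetical pipeline: node name -> (inputs, outputs). All names are illustrative.
nodes = {
    "build_features": (["raw_table"], ["model_input_table", "feature_report"]),
    "train_model": (["model_input_table"], ["trained_model"]),
}


def find_leaf_datasets(nodes):
    """Return datasets that some node produces but no node consumes."""
    produced = {name for _, outputs in nodes.values() for name in outputs}
    consumed = {name for inputs, _ in nodes.values() for name in inputs}
    return sorted(produced - consumed)


leaf_datasets = find_leaf_datasets(nodes)
print(leaf_datasets)  # ['feature_report', 'trained_model']
```

In this toy pipeline, `feature_report` and `model_input_table` come from the same node, but only `feature_report` is a leaf, which mirrors the situation described in this issue.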
In the attached image, datasets in the 08_model_input layer are positioned correctly because they have explicitly assigned layers. However, the datasets in the bottom red square are moved to the last layer (11_calibration) by default because they have no explicitly defined layer.
When I try to assign a specific layer (say 08_model_input) to these leaf datasets in the catalog, I encounter a circular dependency error. This only happens for leaf datasets; datasets used as inputs to other nodes behave as expected. The error occurs even though datasets output by the same node (and used in subsequent nodes) can be assigned a layer without triggering it.
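As a possible aid for narrowing this down, the sketch below shows one way a circular dependency between layers could arise when a leaf dataset is given a layer. It is a simplified model, not Kedro-Viz's actual implementation: it assumes layer order is derived by requiring each upstream dataset's layer to precede the layer of its direct downstream dataset. The `order_layers` helper and the dataset names are hypothetical, and whether this scenario matches the project in this issue is not known. It uses the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import CycleError, TopologicalSorter


def order_layers(edges, layers):
    """Order layers so each upstream dataset's layer precedes its downstream one.

    edges:  (upstream_dataset, downstream_dataset) pairs
    layers: dataset name -> layer name (datasets without a layer are omitted)
    Returns the ordered layer list, or None if the constraints form a cycle.
    """
    predecessors = {layer: set() for layer in layers.values()}
    for upstream, downstream in edges:
        up, down = layers.get(upstream), layers.get(downstream)
        # Only edges that cross between two different assigned layers add a constraint.
        if up and down and up != down:
            predecessors[down].add(up)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None


# Hypothetical chain: model_input_table -> predictions -> diagnostics (a leaf).
edges = [("model_input_table", "predictions"), ("predictions", "diagnostics")]
base_layers = {
    "model_input_table": "08_model_input",
    "predictions": "09_model_output",
}

with_08 = order_layers(edges, {**base_layers, "diagnostics": "08_model_input"})
with_09 = order_layers(edges, {**base_layers, "diagnostics": "09_model_output"})
with_99 = order_layers(edges, {**base_layers, "diagnostics": "99_orphans"})

print(with_08)  # None: 08 must precede 09 and 09 must precede 08
print(with_09)  # ['08_model_input', '09_model_output']
print(with_99)  # ['08_model_input', '09_model_output', '99_orphans']
```

Under these assumptions, the leaf only causes a cycle when its assigned layer sits earlier than the layer of a dataset upstream of it. Checking which layers the inputs of the leaf's generating node belong to may therefore help when putting together a reproducible example.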
Context
This issue limits the flexibility to organize datasets logically across layers. Leaf datasets, if not assigned layers, get forced into the last layer by default. When assigning layers to these datasets, a circular dependency error occurs, making proper layer management difficult.
Steps to Reproduce
Expected Result
Leaf datasets should respect their assigned layers without triggering circular dependency errors. They shouldn’t default to the last layer (unless intended). I think they should remain in the same layer as their generating node.
Actual Result
Environment