datahub-project / datahub

The Metadata Platform for your Data and AI Stack
https://datahubproject.io
Apache License 2.0

fix(ingest/browsePathsV2): Emit Container aspect first, to avoid BrowsePathsV2 generation race condition #11813

Closed: asikowitz closed this 2 weeks ago

asikowitz commented 2 weeks ago

I think it's clear that auto_browse_path_v2 is not the most robust way to produce browse paths, since it depends on two invariants: order and batch. The batch invariant in particular is violated now that thread pool execution has been introduced. There are several ways to solve this fully, but I couldn't think of one that is both quick and clean.

Instead of a full solution, I put in an interim one that should fix the existing issue: emit the Container aspect first, so that auto_browse_path_v2 knows immediately whether a container is a root container or not. The test fails before the change and passes after.
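To illustrate why the order matters, here is a toy sketch in Python. It is not DataHub's implementation: `Aspect`, `container_first`, and `browse_paths` are hypothetical names, and the model assumes the path for a urn is decided from the first aspect seen for that urn.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Aspect:
    """Hypothetical stand-in for one emitted aspect of an entity."""
    urn: str
    name: str                      # e.g. "container" or "containerProperties"
    parent: Optional[str] = None   # only meaningful when name == "container"


def container_first(aspects: Iterable[Aspect]) -> List[Aspect]:
    """Reorder one entity's aspects so the "container" aspect comes first.

    sorted() is stable, so the relative order of the other aspects is kept.
    """
    return sorted(aspects, key=lambda a: a.name != "container")


def browse_paths(stream: Iterable[Aspect]) -> Dict[str, List[str]]:
    """Toy streaming generator: decide each urn's path on first sight.

    If the first aspect seen for a urn is not its "container" aspect, the urn
    is assumed to be a root container and gets an empty path.
    """
    paths: Dict[str, List[str]] = {}
    for aspect in stream:
        if aspect.urn in paths:
            continue  # path was already decided for this urn
        if aspect.name == "container" and aspect.parent is not None:
            paths[aspect.urn] = paths.get(aspect.parent, []) + [aspect.parent]
        else:
            paths[aspect.urn] = []
    return paths


# A root container "db" and a child container "schema" whose properties
# aspect happens to be emitted before its container aspect.
db = [Aspect("db", "containerProperties")]
schema = [
    Aspect("schema", "containerProperties"),
    Aspect("schema", "container", parent="db"),
]

before = browse_paths(db + schema)
after = browse_paths(container_first(db) + container_first(schema))
```

In this sketch, `before` treats `schema` as a root container because its properties aspect arrives first, while `after` places it under `db`. The reordering is applied per entity, so the order between containers is left alone.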

Checklist