DataJunction / dj

A metrics platform.
http://datajunction.io
MIT License

Fix an issue where removing dimension links was invalidating cubes incorrectly and taking a long time #1020

Closed by shangyian 4 months ago

shangyian commented 4 months ago

Summary

This PR fixes an issue where removing a dimension link both invalidated cubes incorrectly and took a long time to finish. The root cause was that the logic called get_nodes_with_dimension to find every cube that used the referenced dimension node.

However, a cube should only be affected if it is downstream of the original node (the node the link was removed from), not merely because it is linked to the dimension node somewhere in the dimensions DAG. With the previous logic, removing a dimension link invalidated a large number of cubes that were still valid. It also took a long time to complete, since get_nodes_with_dimension is an expensive call.

This change switches the logic to call get_downstreams on the original node, and checks that each downstream cube actually references the dimension node before invalidating it.
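The new behavior can be illustrated with a small, self-contained sketch. This is not the DJ implementation: the Node dataclass, the in-memory graph, and invalidate_cubes_after_link_removal are hypothetical stand-ins, and the get_downstreams below is a simplified breadth-first walk that only borrows the name of the real function. It assumes Python 3.9 or newer.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical, simplified node record (not DJ's actual model)."""
    name: str
    type: str  # e.g. "source", "metric", "cube"
    children: list[str] = field(default_factory=list)  # direct downstream node names
    cube_dimensions: set[str] = field(default_factory=set)  # dimension nodes a cube references
    valid: bool = True


def get_downstreams(graph: dict[str, Node], start: str) -> list[Node]:
    """Breadth-first walk over child edges; a stand-in for the real get_downstreams."""
    seen: set[str] = set()
    queue = deque(graph[start].children)
    while queue:
        name = queue.popleft()
        if name in seen:
            continue
        seen.add(name)
        queue.extend(graph[name].children)
    return [graph[name] for name in sorted(seen)]


def invalidate_cubes_after_link_removal(
    graph: dict[str, Node], node_name: str, dimension_name: str
) -> list[str]:
    """Invalidate only cubes that are downstream of node_name AND reference the dimension."""
    invalidated = []
    for downstream in get_downstreams(graph, node_name):
        if downstream.type == "cube" and dimension_name in downstream.cube_dimensions:
            downstream.valid = False
            invalidated.append(downstream.name)
    return invalidated


# Toy graph: two independent lineages whose cubes both use a "customer" dimension.
graph = {
    "orders": Node("orders", "source", children=["num_orders"]),
    "num_orders": Node(
        "num_orders", "metric", children=["orders_cube", "orders_by_date_cube"]
    ),
    "orders_cube": Node("orders_cube", "cube", cube_dimensions={"customer"}),
    "orders_by_date_cube": Node("orders_by_date_cube", "cube", cube_dimensions={"date"}),
    "payments": Node("payments", "source", children=["total_payments"]),
    "total_payments": Node("total_payments", "metric", children=["payments_cube"]),
    "payments_cube": Node("payments_cube", "cube", cube_dimensions={"customer"}),
}

# Remove the (hypothetical) link from "orders" to the "customer" dimension.
invalidated = invalidate_cubes_after_link_removal(graph, "orders", "customer")
print(invalidated)
```

In this toy graph, payments_cube also references the customer dimension but is not downstream of orders, so it stays valid. orders_by_date_cube is downstream of orders but does not reference customer, so it stays valid too. A lookup keyed only on the dimension node would have flagged payments_cube as well, which is the over-invalidation described above.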

Test Plan

Deployment Plan

netlify[bot] commented 4 months ago

Deploy Preview for thriving-cassata-78ae72 canceled.

Latest commit: 9765f0a3a50a189fc3f57595033a27aecea8eebf
Latest deploy log: https://app.netlify.com/sites/thriving-cassata-78ae72/deploys/6654b76f65f3c400086dab82