Closed syou6162 closed 8 months ago
@z3z1ma Could you review this pull request?
Will review this today, sorry for delay. On brief vacation. The premise makes total sense to me.
@z3z1ma Please feel free to comment if you are unclear about the intent of any of the changes.
Thanks for patience. This looks good. The test case too! 🙏 🏆
Very nice addition. Generally I think that a column of the same name should be defined as high up in the tree as possible and it should not change without a change of column name as well, which is therefore essentially a new node. Following this pattern makes dbt osmosis work at its DRYest and most effective. But in the long tail of ongoing usage and maintenance I see what you are saying for sure.
Using force inheritance is too aggressive if a downstream child has a proper description. So this is a happy middle ground I think. We could consider a flag that accepts. comma separated list of column_names to exclude from propagation which gives you intra-model control but I think specifying it in the yaml as you generate the description is nicer.
Using force inheritance is too aggressive if a downstream child has a proper description. So this is a happy middle ground I think. We could consider a flag that accepts. comma separated list of column_names to exclude from propagation which gives you intra-model control but I think specifying it in the yaml as you generate the description is nicer.
@z3z1ma Thank you very much. It would have been nice to discuss the issue beforehand, and then discuss the background and specifications. I apologize for sending the pull request suddenly and without consulting you.
Thanks also for the merge, and I look forward to the next release!
Background
The
--force-inheritance
option is useful for updating columns in the propagator when the description of a column on the data lake side is updated. In the long-term operation of dbt-osmosis, you will often use the--force-inheritance
option, but sometimes you may not want the column description to be updated with this option.Example
Suppose you have a column named
shop_id
in your data lake with a column description of "ID of shop". Suppose you create a data mart about a user by referencingshop_id
in this data lake. Suppose this data mart contains a column calledshop_id
, but you want to include a description for the column "ID of the latest shop where the user made a purchase". In this case, the--force-inheritance
option will overwrite "ID of the latest shop where the user made a purchase" with just "ID of shop" 😭 .Of course, you should always try to name the columns better in the data mart (e.g.,
latest_shop_id
), but due to historical reasons, it may sometimes be difficult to rename columns.What I Changed
Don't update the column description if the column's
meta.osmosis_keep_description
is True, even if the--force-inheritance
option is used. See testing for more detailed specification.I've modified the behavior so that the column description won't be updated if the column's
meta.osmosis_keep_description
is set to True, even when using the--force-inheritance option
. Please refer to the tests for more detailed specifications.