Closed Nickster258 closed 1 year ago
Hello,
It is correct that SlowlyChangingDimension
performs type 1 updates to all versions. This is intended behavior, but we agree that it could make sense to have an option that could prevent that. It should then be possible to specify this for each of the type 1 attributes. We will look into how we can add this functionality. Can you share more information about your use case for this?
Thanks.
Great to hear!
An example I have is as follows:
Suppose you have a data source of "all active employees". People can change departments, which would incur a type 2 change. It also has a variable that keeps track of an employee's "number of days worked". This variable changes on an almost daily basis, depending on if that employee is on vacation or other variables, so it isn't necessarily computable. In this case, I would hope that when the employee changes from X department to Y, their number of days worked remains unchanged for its record of X.
This isn't a perfect example, but I hope it relays the behavior I was thinking of.
It would be amazing if there was an option to "freeze" older versions in place, as a representation of what that record exactly looked like at that point in time.
Thanks!
@Nickster258 PR #66 adds support for only updating the latest version for some of the attributes when performing type 1 updates to SlowlyChangingDimension
, can you see if this works for your use case?
It seems to work, except that if the row being updated with a type 2 change also has a type 1 change, it will update the previous version and the latest version with the type 1 change. Would it be possible to only apply the type 1 change to the latest version, if latest version also has a type 2 change? This is when using one type 1 attribute set to False
.
Good catch, this should now be fixed by commit a2edd60. Specifically, if a type1 update of an attribute for which only the latest version should be updated (False
) and a type2 update happens at the same time, the type1 update is skipped. Type1 updates that change all version (True
) should not be affected. Can you check that it now works for your use case?
Perfect! My testing sees type1 updates (False
) not changing previous versions if a type2 change also happens.
Thanks to both of you for adding this in! I am hoping to use this for a data warehousing project, and this change makes that more feasible given our data and constraints.
Good to hear, I have closed the issue.
Hello,
I am working with a
SlowlyChangingDimension
table and have a field where type one changes are being tracked. This field is added totype1atts
astype1atts=['timestamp']
. Unfortunately, this seems to update previous versions also.Is there a way to prevent type one changes from modifying older versions? I was expecting old versions of data to be "frozen" in place.
If this is intended behavior, could there be an option to prevent type one changes from effecting older versions?