chrthomsen / pygrametl

Official repository for pygrametl - ETL programming in Python
http://pygrametl.org
BSD 2-Clause "Simplified" License

Prevent contamination of input dictionary #42

Closed iiLaurens closed 2 years ago

iiLaurens commented 2 years ago

I noticed that scdensure on an SCD table modifies the input row object. For example, the key attribute is added to the dictionary. This can lead to UNIQUE constraint conflicts when scdensure is then called with the same row for another table whose key attribute has the same name, because the existing key value is preserved even though a new one should be assigned. If the user really wants the key inside the dict, they can assign it themselves from the return value of scdensure: row[keyatt] = dimTable.scdensure(row)
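To make the problem concrete, here is a minimal sketch that reproduces the effect with a toy class. ToyDimension and its ensure method are hypothetical stand-ins written for this illustration, not pygrametl code; they only mimic the behavior described above (a key already present in the row is reused, and the key is written back into the caller's dict).

```python
import itertools


class ToyDimension:
    """Hypothetical stand-in for a dimension table (not pygrametl's code)."""

    def __init__(self, key):
        self.key = key
        self.rows = {}
        self._ids = itertools.count(1)

    def ensure(self, row):
        # Reuse a key already present in the row, otherwise assign a new one.
        keyval = row.get(self.key)
        if keyval is None:
            keyval = next(self._ids)
        if keyval in self.rows:
            raise ValueError("UNIQUE constraint failed: %s=%r" % (self.key, keyval))
        row[self.key] = keyval  # side effect on the caller's dict
        self.rows[keyval] = dict(row)
        return keyval


dim_a = ToyDimension("id")
dim_b = ToyDimension("id")
dim_b.ensure({"name": "widget"})  # dim_b already holds a row with id 1

row = {"name": "alice"}
dim_a.ensure(row)  # row now also carries the key assigned by dim_a
try:
    dim_b.ensure(row)  # the leftover key collides with dim_b's existing row
    conflict = False
except ValueError:
    conflict = True

# Passing a copy keeps the caller's row clean, so each table assigns its own key.
row2 = {"name": "bob"}
key_a = dim_a.ensure(dict(row2))
key_b = dim_b.ensure(dict(row2))
```

With the copies, row2 is left untouched and the keys are only available through the return values.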

chrthomsen commented 2 years ago

Thank you for this suggestion. It would, however, break backwards compatibility for anyone who relies on getting the version number or the from/to dates back in the row (and the documentation also notes that there are side effects on the passed row).

On the other hand, it would be cleaner if scdensure behaved like most other methods and did not modify the passed row.

So as a transition, we could make __init__ take a new optional argument allowsideeffectsonrows with the default value True. When it is False, scdensure should make a copy of the passed row as you suggest.
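Roughly, the idea could look like the sketch below. ToySCD is a hypothetical toy class used only to show the copy-on-entry pattern; it is not the actual pygrametl implementation, and the key and version handling are placeholders. Only the argument name allowsideeffectsonrows and its default are taken from the proposal above.

```python
class ToySCD:
    """Hypothetical stand-in for illustration (not pygrametl's code)."""

    def __init__(self, key, allowsideeffectsonrows=True):
        self.key = key
        self.allowsideeffectsonrows = allowsideeffectsonrows
        self._nextid = 1

    def scdensure(self, row):
        if not self.allowsideeffectsonrows:
            # Work on a shallow copy so the caller's dict stays unchanged.
            row = dict(row)
        # Placeholder bookkeeping: assign a key and a version number.
        row[self.key] = self._nextid
        row["version"] = 1
        self._nextid += 1
        return row[self.key]


# Default: the old behavior, the passed row is updated in place.
legacy = ToySCD("id")
row1 = {"name": "alice"}
key1 = legacy.scdensure(row1)

# Opt-in: the passed row is left alone, only the key is returned.
clean = ToySCD("id", allowsideeffectsonrows=False)
row2 = {"name": "alice"}
key2 = clean.scdensure(row2)
```

Existing code that reads the version or dates from the row keeps working with the default, while new code can opt out of the side effects.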

chrthomsen commented 2 years ago

Closed, as #43 implements a backwards-compatible solution as described above.