lvgig / tubular

Python package implementing transformers for pre-processing steps for machine learning.
https://tubular.readthedocs.io/en/latest/index.html
BSD 3-Clause "New" or "Revised" License

Consolidate DateDiffLeapYearTransformer and DateDifferenceTransformer #244

Open davidhopkinson26 opened 2 months ago

davidhopkinson26 commented 2 months ago

What

Consolidate the DateDiffLeapYearTransformer and DateDifferenceTransformer into a single transformer with options to cover the functionality of both.

Why

It would be more user-friendly if these two transformers, which appear to offer similar functionality, were consolidated.

How

These two transformers are semantically very similar but work in different ways, which for now necessitates separate logic.

DateDiffLeapYearTransformer returns date differences in whole years as integers (i.e. age), factoring in leap years.
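For illustration, a leap-year-aware whole-year difference can be sketched with the standard library alone. This is not the transformer's actual code; `whole_years_between` is a hypothetical helper name, and the Feb 29 convention shown (the anniversary in a non-leap year only counts from 1 March) is an assumption.

```python
from datetime import date


def whole_years_between(lower: date, upper: date) -> int:
    """Hypothetical helper: completed calendar years from lower to upper."""
    years = upper.year - lower.year
    # If the (month, day) anniversary has not been reached yet in upper's
    # year, the last year is not complete. Comparing calendar fields rather
    # than dividing days by 365.25 keeps leap years exact.
    if (upper.month, upper.day) < (lower.month, lower.day):
        years -= 1
    return years


# Someone born on a leap day turns 24 on 2024-02-29, not the day before.
print(whole_years_between(date(2000, 2, 29), date(2024, 2, 28)))
print(whole_years_between(date(2000, 2, 29), date(2024, 2, 29)))
```

When `upper` is earlier than `lower`, this sketch floors the result (for example, an 11-month negative gap gives -1), which may or may not be the behaviour wanted in a consolidated transformer.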

DateDifferenceTransformer calculates the difference between datetimes in days/hours/minutes/seconds. M and Y were removed as options in #123 because M and Y values are not supported by np.timedelta64.
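The fixed-length-unit case can be sketched as dividing one timedelta by another. This uses only the standard library rather than the transformer's own implementation; `FIXED_UNITS` and `fixed_unit_difference` are hypothetical names, and the unit codes are assumptions chosen for the example.

```python
from datetime import datetime, timedelta

# Hypothetical unit table: every entry has one fixed length. Months and
# years have no fixed length, so they cannot be added to a table like this.
FIXED_UNITS = {
    "D": timedelta(days=1),
    "h": timedelta(hours=1),
    "m": timedelta(minutes=1),
    "s": timedelta(seconds=1),
}


def fixed_unit_difference(lower: datetime, upper: datetime, units: str) -> float:
    """Hypothetical helper: elapsed time from lower to upper in the given unit."""
    # Dividing a timedelta by a timedelta returns a float.
    return (upper - lower) / FIXED_UNITS[units]


start = datetime(2024, 1, 1)
end = datetime(2024, 1, 2, 12)
print(fixed_unit_difference(start, end, "D"))
print(fixed_unit_difference(start, end, "h"))
```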

To resolve this issue, we need to consider ways of bringing the two sets of logic together so that a single transformer can support years/months/days/hours/minutes/seconds. We need to be able to account for calendar months and leap years. Ideally we would not have forking logic for this and would use a single logical framework for all units.
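As a starting point for discussion, one possible shape is a single entry point that counts completed calendar months for Y and M and falls back to timedelta division for the fixed-length units. This is only a sketch under assumptions: all names (`CALENDAR_UNITS`, `FIXED_UNITS`, `whole_months_between`, `date_difference`) are hypothetical, it still forks internally, and it returns integers for calendar units but floats for fixed units.

```python
from datetime import datetime, timedelta

# Hypothetical unit tables. Calendar units are expressed in months.
CALENDAR_UNITS = {"Y": 12, "M": 1}
FIXED_UNITS = {
    "D": timedelta(days=1),
    "h": timedelta(hours=1),
    "m": timedelta(minutes=1),
    "s": timedelta(seconds=1),
}


def whole_months_between(lower: datetime, upper: datetime) -> int:
    """Completed calendar months from lower to upper."""
    months = (upper.year - lower.year) * 12 + (upper.month - lower.month)
    # The final month only counts once upper reaches lower's day and time
    # of month. Assumed convention: 31 Jan to 28 Feb is 0 completed months.
    if (upper.day, upper.time()) < (lower.day, lower.time()):
        months -= 1
    return months


def date_difference(lower: datetime, upper: datetime, units: str):
    """Single entry point covering Y/M/D/h/m/s."""
    if units in CALENDAR_UNITS:
        # Floor division gives whole years (or months) as an integer.
        return whole_months_between(lower, upper) // CALENDAR_UNITS[units]
    return (upper - lower) / FIXED_UNITS[units]


print(date_difference(datetime(2000, 2, 29), datetime(2024, 2, 28), "Y"))
print(date_difference(datetime(2024, 1, 31), datetime(2024, 3, 1), "M"))
print(date_difference(datetime(2024, 2, 28), datetime(2024, 3, 1), "D"))
```

Basing years on months means leap years need no special case, but end-of-month handling and the mixed return types are open design questions.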

It is currently unclear how best to approach this, so contributions and ideas are welcome!