h2oai / h2o-tutorials

Tutorials and training material for the H2O Machine Learning Platform
http://h2o.ai

How to calculate the time differences of two date fields? #163

Open Realvincentyuan opened 2 years ago


Overview

I have a few date fields (say day1, day2, and so on) and I want to calculate the year/month/day differences between them. Subtracting the fields directly gives a raw number, apparently in ns:

test_date_df = df['END_DT'] - df['START_DT']
test_date_df.dtype

dtype is dtype('<U')

Example value from one row:

1.97096e+12

I did not find a proper way to convert this 1.97096e+12 into year/month/day. Ideally the operation could be done with a native H2O function, or at least with one that runs on multiple cores, so that out-of-memory errors are not a concern given my huge volume of data.

With pandas this is easy to do using multiprocessing, but I wanted to check whether someone has a solution with H2O.
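
To make the conversion concrete, here is a minimal sketch of the unit arithmetic in plain Python. It assumes the difference is a count of milliseconds (H2O time columns are, as far as I know, stored as milliseconds since the epoch); if the value really is in nanoseconds, the per-day constant would need to be a million times larger. The names `MS_PER_DAY`, `DAYS_PER_YEAR`, `DAYS_PER_MONTH`, and `ms_to_units` are made up for illustration, and the month and year figures are averages rather than calendar-exact differences.

```python
# Assumption: the difference is in milliseconds, not nanoseconds.
MS_PER_DAY = 24 * 60 * 60 * 1000      # milliseconds in one day
DAYS_PER_YEAR = 365.25                # average year length (approximation)
DAYS_PER_MONTH = DAYS_PER_YEAR / 12   # average month length (approximation)


def ms_to_units(diff_ms):
    """Convert a millisecond duration to approximate days, months, and years."""
    days = diff_ms / MS_PER_DAY
    return {
        "days": days,
        "months": days / DAYS_PER_MONTH,
        "years": days / DAYS_PER_YEAR,
    }


# The single-row value quoted in the issue.
result = ms_to_units(1.97096e12)
print(result)
```

Under the millisecond assumption, the quoted value works out to roughly 22,812 days, or about 62 years. Since H2OFrame columns support arithmetic with scalars, dividing the difference column by the same constant, for example `(df['END_DT'] - df['START_DT']) / MS_PER_DAY`, should in principle keep the computation on the H2O cluster, though that is not verified here.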