usc-isi-i2 / dsbox-ta2

The DSBox TA2 component
MIT License

Date featurizer does not detect dates in LL1_336_MS_Geolife_transport_mode_prediction dataset #135

Closed proska closed 6 years ago

proska commented 6 years ago

The mapping learned by the cleaning featurizer is printed below. The dataset has date information in column 0, but the featurizer does not detect it: the mapping lists column 0 only under 'punctuation_columns'.

(Pdb) self._mapping
{'punctuation_columns': {'columns_to_perform': [0, 3], 'split_to': [6, 2]}}
proska commented 6 years ago

Here is a sample of the dataset showing the date format in the date_time column:

                  date_time  user_id        track_id  \
1623951  2011-05-07 08:28:04       67  20110506233648   
3261125  2008-06-10 01:48:42      112  20080610014116   
3032404  2009-05-07 09:42:41       85  20090507091901   
273703   2008-09-29 01:59:59       10  20080928160000   
1203318  2011-08-28 13:53:50       65  20110828011121 
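
For reference, a minimal sketch of one way to check whether a column with this format is date-like. This is not the cleaning featurizer's actual detection logic; the helper name `looks_like_datetime`, the `fmt` default, and the `threshold` parameter are hypothetical and only illustrate the idea with `pandas.to_datetime`.

```python
import pandas as pd

# Sample values copied from the date_time column shown above.
sample = pd.Series([
    "2011-05-07 08:28:04",
    "2008-06-10 01:48:42",
    "2009-05-07 09:42:41",
    "2008-09-29 01:59:59",
    "2011-08-28 13:53:50",
])


def looks_like_datetime(column, fmt="%Y-%m-%d %H:%M:%S", threshold=0.9):
    """Hypothetical helper: True if most values parse with the given format."""
    # Cast to str so numeric columns are matched against the format as text.
    parsed = pd.to_datetime(column.astype(str), format=fmt, errors="coerce")
    # Values that do not match the format become NaT; count the rest.
    return bool(parsed.notna().mean() >= threshold)


print(looks_like_datetime(sample))
```

A strict format is used here so that numeric columns such as user_id are not mistaken for dates. A real detector would likely need to try several candidate formats.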
szeke commented 6 years ago

We are working on a fix and will test it soon.

RqS commented 6 years ago

Fixed in latest cleaning