MLblog / jads_kaggle

Contains our group's work in various kaggle competitions
MIT License

Split data #110

Closed: san89 closed this 5 years ago

san89 commented 5 years ago

Summary:

  1. Split the data into train (x, y) and test (x) by date.
  2. Categorical features are reduced based on 'transactionRevenue'.
  3. Dynamic features are transformed into tables (based on Manos's work).
  4. I am trying to guarantee that: [a] train(x) and test(x) have the same features in the same order, and [b] train(x) and train(y) have the same users.
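
To make points 1 and 4 concrete, here is a minimal sketch in plain Python. It is not the code from this PR. The record layout, the `user`, `date` and `pageviews` fields, the cutoff date, and all helper names are hypothetical; only 'transactionRevenue' comes from the summary. It also assumes, as a simplification, that the target is summed over the same window as the training features.

```python
from datetime import date

# Hypothetical visit records; only 'transactionRevenue' is named in the PR.
visits = [
    {"user": "u1", "date": date(2017, 5, 1), "pageviews": 3, "transactionRevenue": 0.0},
    {"user": "u2", "date": date(2017, 6, 1), "pageviews": 7, "transactionRevenue": 12.5},
    {"user": "u1", "date": date(2017, 9, 1), "pageviews": 2, "transactionRevenue": 0.0},
    {"user": "u3", "date": date(2017, 9, 15), "pageviews": 5, "transactionRevenue": 4.0},
]

FEATURES = ["pageviews"]  # assumed fixed feature order for train and test


def split_by_date(rows, cutoff):
    """Rows dated before `cutoff` go to train, the rest go to test."""
    train = [r for r in rows if r["date"] < cutoff]
    test = [r for r in rows if r["date"] >= cutoff]
    return train, test


def build_x(rows, features):
    """One feature row per user (summed over visits), keys in `features` order."""
    table = {}
    for r in rows:
        agg = table.setdefault(r["user"], {f: 0 for f in features})
        for f in features:
            agg[f] += r[f]
    return table


def build_y(rows):
    """Total transactionRevenue per user."""
    totals = {}
    for r in rows:
        totals[r["user"]] = totals.get(r["user"], 0.0) + r["transactionRevenue"]
    return totals


def same_features(train_x, test_x, features):
    """Check [a]: every row in both tables has exactly `features`, in order."""
    rows = list(train_x.values()) + list(test_x.values())
    return all(list(row) == features for row in rows)


def same_users(train_x, train_y):
    """Check [b]: train(x) and train(y) cover the same set of users."""
    return set(train_x) == set(train_y)


train_rows, test_rows = split_by_date(visits, date(2017, 8, 1))
train_x = build_x(train_rows, FEATURES)
train_y = build_y(train_rows)
test_x = build_x(test_rows, FEATURES)

assert same_features(train_x, test_x, FEATURES)
assert same_users(train_x, train_y)
```

Running both checks right after the split means a mismatch in columns or users fails early, before any model is fit.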