thekingofkings / chicago-crime

Crime correlation analysis
MIT License
11 stars 3 forks

Self-flow in the taxi flow is removed when generating embeddings #29

Closed thekingofkings closed 7 years ago

thekingofkings commented 7 years ago

The taxi-flow function sets self-flow f_{i,i} to 0. This is required in the regression-based method.

However, when learning graph embeddings (#27), self-flow is an important feature that cannot be omitted.
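A minimal sketch of one way to serve both uses, assuming trips arrive as `(origin, destination)` region-index pairs. The function name `build_flow_matrix` and the `keep_self_flow` flag are hypothetical and are not taken from this repository's code.

```python
def build_flow_matrix(trips, n_regions, keep_self_flow=False):
    """Count trips between regions (hypothetical helper, not the repo's function).

    trips: iterable of (origin, destination) region indices.
    keep_self_flow: if False, f_{i,i} stays 0 (regression use);
                    if True, trips within a region are counted (embedding use).
    """
    flow = [[0] * n_regions for _ in range(n_regions)]
    for src, dst in trips:
        if src == dst and not keep_self_flow:
            continue  # drop the self-flow f_{i,i}
        flow[src][dst] += 1
    return flow


# Toy data: three regions, six trips, three of them within a single region.
trips = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 1), (2, 0)]
regression_flow = build_flow_matrix(trips, 3)                      # diagonal is 0
embedding_flow = build_flow_matrix(trips, 3, keep_self_flow=True)  # diagonal kept
```

Making the behaviour an explicit flag leaves the regression-based method unchanged while letting the embedding step opt in to the diagonal.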

thekingofkings commented 7 years ago

With self-flow, the best setting is:

  1. Consider 8 neighbors
  2. Vector size is 10 (5 from 1st-order and 5 from 2nd-order neighbors)
  3. Train on 20 million graph samples
  4. Number of negative samples is 5
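The setting above could be recorded as a small config object. This is only a sketch: the class `EmbeddingConfig`, its field names, and the helper `concat_embedding` are hypothetical, and it assumes the 10-dimensional vector is the concatenation of the two 5-dimensional parts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingConfig:
    """Hypothetical container for the best setting reported in this issue."""
    n_neighbors: int = 8          # neighbors considered
    first_order_dim: int = 5      # dimensions from 1st-order neighbors
    second_order_dim: int = 5     # dimensions from 2nd-order neighbors
    n_samples: int = 20_000_000   # graph samples used for training
    n_negative: int = 5           # negative samples

    @property
    def vector_size(self):
        # Assumption: the final vector concatenates both parts.
        return self.first_order_dim + self.second_order_dim


def concat_embedding(first_order, second_order):
    """Join the 1st-order and 2nd-order vectors of one region."""
    return list(first_order) + list(second_order)


config = EmbeddingConfig()
# Placeholder zeros stand in for learned values.
region_vector = concat_embedding(
    [0.0] * config.first_order_dim,
    [0.0] * config.second_order_dim,
)
```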

thekingofkings commented 7 years ago

The leave-one-out error is much lower. On the 2010 data:

MAE: 465.89, MRE: 0.3477
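For reference, a sketch of how leave-one-out MAE and MRE can be computed. It assumes MRE is the MAE divided by the mean of the true values, which may differ from the definition used in this repository. The names `leave_one_out_errors` and `mean_baseline` are hypothetical, and the mean predictor only stands in for the real model. The toy numbers are unrelated to the results above.

```python
def leave_one_out_errors(features, targets, fit_predict):
    """Hold out each sample once, predict it from the rest, and score the errors.

    fit_predict(train_x, train_y, test_x) must return one prediction.
    Returns (MAE, MRE), with MRE assumed to be MAE / mean(targets).
    """
    abs_errors = []
    for i in range(len(targets)):
        train_x = features[:i] + features[i + 1:]
        train_y = targets[:i] + targets[i + 1:]
        prediction = fit_predict(train_x, train_y, features[i])
        abs_errors.append(abs(prediction - targets[i]))
    mae = sum(abs_errors) / len(abs_errors)
    mre = mae / (sum(targets) / len(targets))
    return mae, mre


def mean_baseline(train_x, train_y, test_x):
    """Placeholder model: predict the mean of the training targets."""
    return sum(train_y) / len(train_y)


# Toy data only.
features = [[1.0], [2.0], [3.0], [4.0]]
targets = [10.0, 20.0, 30.0, 40.0]
mae, mre = leave_one_out_errors(features, targets, mean_baseline)
```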

[Image: embedding-c3]