Closed thekingofkings closed 8 years ago
In this commit, I fixed the transpose issue in Python.
In this older commit, I applied the transpose to the social lag as well.
Two transposes cancel each other out, which turned the transpose problem into a bug.
Removing that line in file R\pvalue-evaluation.R fixes the problem.
The social flow (LEHD) matrix transpose issue
Background
All of the confusion is rooted in the lag variable calculation:
How is the social flow matrix calculated?
For historical reasons, when I first wrote my Python code to process the LEHD flow matrix, I used a nested dictionary to track the flow. The first-level keys are the source CA ID and the second-level keys are the destination CA ID.
How are the lag variables calculated?
There are three kinds of normalization. Take normalization by destination as an example. The flow matrix M is multiplied by the column vector y, i.e. each row of M should hold the percentage of traffic from all sources to dst, and should sum to 1.
Pitfall
According to the top formula, we have
Thus a transpose is needed to make everything consistent.
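To make the pitfall concrete, here is a minimal sketch of the steps described above. It is not the code from the repository. The toy flow counts, the vector y, and the helper names flow_dict_to_matrix and normalize_by_destination are all hypothetical, and it assumes the nested dictionary is laid out as flow[src][dst]. Filling a matrix straight from that dictionary puts sources on the rows, so normalizing by destination makes the columns sum to 1. One transpose is then needed before multiplying by y.

```python
import numpy as np

# Hypothetical toy data: flow[src][dst] = flow count from src CA to dst CA.
flow = {
    1: {2: 30, 3: 10},
    2: {1: 20, 3: 20},
    3: {1: 60, 2: 10},
}


def flow_dict_to_matrix(flow, ids):
    """Fill F so that F[i, j] is the flow from ids[i] (source) to ids[j] (destination)."""
    index = {ca: k for k, ca in enumerate(ids)}
    F = np.zeros((len(ids), len(ids)))
    for src, dsts in flow.items():
        for dst, count in dsts.items():
            F[index[src], index[dst]] = count
    return F


def normalize_by_destination(F):
    """Return M where row d holds the share of d's inflow coming from each source."""
    inflow = F.sum(axis=0, keepdims=True)  # total inflow per destination (column sums)
    inflow[inflow == 0] = 1.0              # avoid dividing by zero for isolated areas
    shares = F / inflow                    # each COLUMN now sums to 1
    return shares.T                        # single transpose: each ROW now sums to 1


ids = sorted(flow)
F = flow_dict_to_matrix(flow, ids)
M = normalize_by_destination(F)

y = np.array([10.0, 20.0, 30.0])  # hypothetical per-area variable
social_lag = M @ y                # lag for each destination area

print(M.sum(axis=1))  # every row sums to 1
print(social_lag)
```

Applying a second transpose anywhere downstream would undo the first one and silently switch the lag back to the source-indexed orientation, which matches the double-transpose bug described in the comments.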