hazdzz / STGCN

The PyTorch implementation of STGCN.
GNU Lesser General Public License v2.1

Questions about zscore #10

Closed ystephenai closed 3 years ago

ystephenai commented 3 years ago

Hi, I have to admit that your code inspires me a lot, and I have definitely starred it. However, could you explain the piece of code below from main_road_traffic.py about zscore? As I understand it, z-score normalizes each feature across the data set, so if the input has dimensions [batch, features], each feature is normalized to mean = 0, std = 1. But in your code here, if I am right, the variable "train" has dimensions [sequence length, number of nodes]. If you apply z-score here, each node will be normalized to (0, 1)? That actually seems weird to me, because if we normalize each node across the time steps, won't we lose the spatial information?

train, val, test = dataloader.load_data(data_path, len_train, len_val)
zscore = preprocessing.StandardScaler()
train = zscore.fit_transform(train)
val = zscore.transform(val)
test = zscore.transform(test)
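
For concreteness, here is a minimal sketch (with made-up numbers, not from the repo) of what I understand this preprocessing to be doing: StandardScaler fits one mean and one standard deviation per column of train, i.e. per node when the array is [sequence length, number of nodes], and the transform calls on val and test reuse the statistics fitted on train.

import numpy as np
from sklearn import preprocessing

# Toy data, made up for illustration: rows are time steps, columns are nodes.
train = np.array([[10., 0.],
                  [12., 1.],
                  [14., 2.]])
val = np.array([[11., 3.]])

zscore = preprocessing.StandardScaler()
train_n = zscore.fit_transform(train)  # fits one mean/std per column (per node)
val_n = zscore.transform(val)          # reuses the statistics fitted on train

print(zscore.mean_)          # per-node means over the time axis: [12.  1.]
print(zscore.scale_)         # per-node standard deviations
print(train_n.mean(axis=0))  # ~0 for every node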
hazdzz commented 3 years ago

Hi, I have to admit that your code inspires me a lot, and I have definitely starred it. However, could you explain the piece of code below from main_road_traffic.py about zscore? As I understand it, z-score normalizes each feature across the data set, so if the input has dimensions [batch, features], each feature is normalized to mean = 0, std = 1. But in your code here, if I am right, the variable "train" has dimensions [sequence length, number of nodes]. If you apply z-score here, each node will be normalized to (0, 1)? That actually seems weird to me, because if we normalize each node across the time steps, won't we lose the spatial information?

train, val, test = dataloader.load_data(data_path, len_train, len_val)
zscore = preprocessing.StandardScaler()
train = zscore.fit_transform(train)
val = zscore.transform(val)
test = zscore.transform(test)

I don't understand what you mean. Why would using z-score lose the spatial information? Could you explain your opinion more specifically? You have really confused me.

ystephenai commented 3 years ago

OK, I will give you an example of what I am thinking.

For example, if your training matrix is

[[1, 2, 3],
 [2, 5, 3],
 [0, 1, 5],
 [2, 5, 4]]

then after applying zscore.fit_transform(train) you get

array([[-0.30151134, -0.70014004, -0.90453403],
       [ 0.90453403,  0.98019606, -0.90453403],
       [-1.50755672, -1.26025208,  1.50755672],
       [ 0.90453403,  0.98019606,  0.30151134]])

and you can see that each column sums to 0, which in your case means that the mean of each position across time steps equals 0.
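
This is easy to reproduce; a quick check, for example:

import numpy as np
from sklearn import preprocessing

# The training matrix from above: rows = time steps, columns = positions/nodes.
train = np.array([[1, 2, 3],
                  [2, 5, 3],
                  [0, 1, 5],
                  [2, 5, 4]], dtype=float)

train_norm = preprocessing.StandardScaler().fit_transform(train)
print(train_norm)
print(train_norm.sum(axis=0))  # every column sums to ~0, i.e. each node has mean 0 over time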

ystephenai commented 3 years ago

And I don't really think that is correct. For example, if one position is always (across time T) higher than another position, how do you capture that spatial information when you normalize each position's mean to 0? Do you get what I mean?

hazdzz commented 3 years ago

For the z-score-normalized matrix you offered,

[[-0.30151134, -0.70014004, -0.90453403],
 [ 0.90453403,  0.98019606, -0.90453403],
 [-1.50755672, -1.26025208,  1.50755672],
 [ 0.90453403,  0.98019606,  0.30151134]]

the spatial information is not lost through time. I think you were actually asking whether the temporal information is lost or not. Using z-score normalization is not wrong; it is a common method in time-series prediction models.

ystephenai commented 3 years ago

OK, but my example didn't explain my thought well. If we consider a more extreme case, [[0, 1, 2], [0, 1, 2], [0, 1, 2]], then after z-score (or any other column-wise normalization) we always get [[0, 0, 0], [0, 0, 0], [0, 0, 0]], and then we totally lose the spatial information in this case.
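
As a quick check with the scaler actually used in the code above: as far as I know, sklearn's StandardScaler replaces a zero standard deviation with 1 instead of dividing by zero, so the constant columns really do come out as all zeros.

import numpy as np
from sklearn import preprocessing

# The extreme case above: every time step is identical, so each node (column) is constant.
a = np.array([[0, 1, 2],
              [0, 1, 2],
              [0, 1, 2]], dtype=float)

# A constant column has zero variance; StandardScaler treats the scale as 1,
# so the centered values are exactly 0.
print(preprocessing.StandardScaler().fit_transform(a))
# [[0. 0. 0.]
#  [0. 0. 0.]
#  [0. 0. 0.]]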

hazdzz commented 3 years ago

OK, but my example didn't explain my thought well. If we consider a more extreme case, [[0, 1, 2], [0, 1, 2], [0, 1, 2]], then after z-score (or any other column-wise normalization) we always get [[0, 0, 0], [0, 0, 0], [0, 0, 0]], and then we totally lose the spatial information in this case.

This extreme case you mentioned would not occur in reality.

hazdzz commented 3 years ago

Finally, I see what you mean.

import numpy as np
from scipy import stats

a = np.array([
    [0,1,2],
    [0,1,2],
    [0,1,2]
])

b = stats.zscore(a, axis=0)
print(b)

[[nan nan nan]
 [nan nan nan]
 [nan nan nan]]

In this case, you should use the parameter axis=1, as below:

b = stats.zscore(a, axis=1)
print(b)

[[-1.22474487  0.          1.22474487]
 [-1.22474487  0.          1.22474487]
 [-1.22474487  0.          1.22474487]]

ystephenai commented 3 years ago

Finally, I see what you mean.

import numpy as np
from scipy import stats

a = np.array([
    [0,1,2],
    [0,1,2],
    [0,1,2]
])

b = stats.zscore(a, axis=0)
print(b)

[[nan nan nan]
 [nan nan nan]
 [nan nan nan]]

In this case, you should use the parameter axis=1, as below:

b = stats.zscore(a, axis=1)
print(b)

[[-1.22474487  0.          1.22474487]
 [-1.22474487  0.          1.22474487]
 [-1.22474487  0.          1.22474487]]

Yes, but don't you think you should then use the same stats.zscore in your implementation, instead of zscore.fit_transform? They actually give two different results.
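
To make the "two different results" point concrete, here is a quick side-by-side on the earlier example matrix; the comparison itself is just my own illustration of the axis difference.

import numpy as np
from scipy import stats
from sklearn import preprocessing

# The example matrix from earlier in this thread: rows = time steps, columns = nodes.
train = np.array([[1, 2, 3],
                  [2, 5, 3],
                  [0, 1, 5],
                  [2, 5, 4]], dtype=float)

# Column-wise z-score (per node over time), as the repo's preprocessing does:
print(preprocessing.StandardScaler().fit_transform(train))

# Row-wise z-score (per time step across nodes), as in the stats.zscore(axis=1) snippet above:
print(stats.zscore(train, axis=1))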

hazdzz commented 3 years ago

No, as I mentioned, this extreme case would not occur in reality. So I don't need to change my code, and you don't need to worry about it.

ystephenai commented 3 years ago

ok