nnzhan / MTGNN

MIT License

About "normalization" #39

Open luoxia-wen opened 1 year ago

luoxia-wen commented 1 year ago

The part of the code that handles normalization is as follows:


    def _normalized(self, normalize):
        # normalized by the maximum value of entire matrix.

        if (normalize == 0):
            self.dat = self.rawdat

        if (normalize == 1):
            self.dat = self.rawdat / np.max(self.rawdat)

        # normalized by the maximum absolute value of each column (sensor).
        if (normalize == 2):
            for i in range(self.m):
                self.scale[i] = np.max(np.abs(self.rawdat[:, i]))
                self.dat[:, i] = self.rawdat[:, i] / np.max(np.abs(self.rawdat[:, i]))

The default is normalize = 2, and I have a question about that case:
does normalizing with all of the data leak test-set information?

yasinuygun commented 1 year ago

This looks like look-ahead bias. When I used the code, I modified this part to take the max value from the training portion only, e.g. self.rawdat[:int(train_ratio * self.rawdat.shape[0])] in the np.max calls.
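
A minimal sketch of that change, written as a standalone function rather than a patch to the repository. The name normalize_by_train_max and the toy array are hypothetical and not from MTGNN, and the sketch assumes the training split is the first train_ratio fraction of the rows (time steps), with one column per sensor:

```python
import numpy as np

def normalize_by_train_max(rawdat, train_ratio):
    """Scale each column (sensor) by its max absolute value over the training rows only.

    Hypothetical helper: assumes rows are time steps and the training
    split is the leading `train_ratio` fraction of them.
    """
    rawdat = np.asarray(rawdat, dtype=float)
    n_train = int(train_ratio * rawdat.shape[0])
    if n_train < 1:
        raise ValueError("train_ratio leaves no training rows")
    # Per-column scale computed from the training rows only.
    scale = np.max(np.abs(rawdat[:n_train]), axis=0)
    # Guard against all-zero training columns to avoid division by zero.
    scale[scale == 0] = 1.0
    return rawdat / scale, scale

# Toy data: 4 time steps, 2 sensors; the first 2 rows act as the training split.
rawdat = np.array([[1.0, 10.0],
                   [2.0, 20.0],
                   [4.0, 80.0],
                   [8.0, 40.0]])
dat, scale = normalize_by_train_max(rawdat, train_ratio=0.5)
```

With the scale taken from the training rows only, values in the validation and test portions can exceed 1 in absolute value. That is expected: no statistics from those portions are used when scaling.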