Question about AutoFi data preprocessing

asyu17 commented 1 year ago

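Hello Prof. Yang, thank you very much for your series of work on WiFi sensing. It has been a great help to our follow-up research.

I am currently trying to add the AutoFi model to my set of comparison methods, but I have run into two problems.

Note: the parameters and model structure in the AutoFi code seem to differ from the paper, so following the AutoFi paper I scaled the kde loss by a factor of 10 and also used the model structure described in the paper. The model code is as follows: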

import numpy as np
import torch
import torch.nn as nn


class AutoFi_CNN_encoder(nn.Module):
    def __init__(self, hidden_states=128):
        super().__init__()
        self.encoder = nn.Sequential(
            # Expects (N, 1, 114, 114): the first Conv2d takes 1 input channel
            # (the original comment said (3, 114, 114), which does not match).
            nn.Conv2d(1, 32, (15, 15), stride=9),
            nn.ReLU(True),
            nn.Conv2d(32, 32, (3, 3), stride=1),
            nn.ReLU(True),
            # Kernel 1 / stride 1 pooling leaves the feature map unchanged.
            nn.MaxPool2d(kernel_size=(1, 1), stride=(1, 1)),

            nn.Conv2d(32, 64, (3, 3), stride=(1, 1)),
            nn.ReLU(True),
            nn.Conv2d(64, 96, (3, 3), stride=(1, 1)),
            nn.ReLU(True),
            nn.MaxPool2d(kernel_size=(1, 1), stride=(1, 1)),
        )

        # mapping and bn are defined here but not used in forward().
        self.mapping = nn.Linear(96 * 6 * 6, hidden_states)
        self.bn = nn.BatchNorm1d(hidden_states)
        self.conv_channel = 96
        self.conv_feat_num = 36  # 6 * 6 spatial positions

    def forward(self, x, flag='unsupervised'):
        # flag is accepted for interface compatibility but not used here.
        if isinstance(x, np.ndarray):
            # NumPy input is assumed to be NHWC; convert to NCHW.
            x = np.transpose(x, (0, 3, 1, 2))
            x = torch.as_tensor(x)
        x = x.to(torch.float32)
        x = self.encoder(x)
        # Flatten to (N, 96 * 6 * 6).
        x = x.reshape(-1, self.conv_channel * self.conv_feat_num)
        return x

class AutoFi_model(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.encoder = AutoFi_CNN_encoder()
        self.classifier = nn.Sequential(
            nn.Linear(self.encoder.conv_channel * self.encoder.conv_feat_num, 128),
            nn.Linear(128, num_classes),
            # dim is given explicitly; the implicit-dim form is deprecated.
            nn.Softmax(dim=1)
        )

    def forward(self, x1, x2, flag='unsupervised'):
        # Supervised mode: return class probabilities for both views.
        if flag == 'supervised':
            x1 = self.encoder(x1, flag=flag)
            x2 = self.encoder(x2, flag=flag)
            y1 = self.classifier(x1)
            y2 = self.classifier(x2)
            return y1, y2
        # Unsupervised mode: return the flattened encoder features.
        x1 = self.encoder(x1)
        x2 = self.encoder(x2)
        return x1, x2
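
As a sanity check on the `96 * 6 * 6` flattened size used in the encoder, the spatial size after each layer can be worked out with the standard output-size formula for convolution and pooling layers. This is only a minimal sketch, assuming a 114 x 114 input and the default padding 0 and dilation 1; `conv_out`, `LAYERS` and `feature_map_size` are illustrative names, not part of AutoFi.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length along one spatial axis (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


# (kernel, stride) of each conv / pool layer in the encoder, in order.
LAYERS = [(15, 9), (3, 1), (1, 1), (3, 1), (3, 1), (1, 1)]


def feature_map_size(input_size, layers=LAYERS):
    """Spatial side length after passing through all layers."""
    size = input_size
    for kernel, stride in layers:
        size = conv_out(size, kernel, stride)
    return size


side = feature_map_size(114)
print(side, 96 * side * side)
```

If the input is not 114 x 114, the side length changes and the `reshape` to `96 * 36` features would no longer match the batch size.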

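1. AutoFi appears to read in non-normalized data. When normalized data is processed the same way, the noise may drown out the sample data. Below is part of the code I took from self_supervised.py.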

            x, y = data
            x, y = x.to(device), y.to(device)
            x1 = gaussian_noise(x, random.uniform(0, 2.0))
            x2 = gaussian_noise(x, random.uniform(0.1, 2.0))
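
To put a rough number on the concern that the noise may drown out normalized data, one can compare the signal power with the noise power. This is a minimal sketch, assuming the second argument of `gaussian_noise` is the standard deviation of zero-mean noise (not confirmed from the AutoFi code) and using made-up sample values in [0, 1]; `expected_snr_db` is a hypothetical helper.

```python
import math


def expected_snr_db(samples, noise_std):
    """Expected SNR in dB for zero-mean Gaussian noise with the given std."""
    if noise_std <= 0:
        raise ValueError("noise_std must be positive")
    signal_power = sum(v * v for v in samples) / len(samples)
    if signal_power == 0:
        raise ValueError("signal power is zero")
    # Noise power of zero-mean Gaussian noise equals its variance.
    return 10 * math.log10(signal_power / noise_std ** 2)


# Illustrative values standing in for data normalized to [0, 1].
normalized = [0.0, 0.5, 1.0]
for std in (2.0, 0.2):
    print(std, round(expected_snr_db(normalized, std), 2))
```

A negative value means the noise carries more power than the signal. Under these assumptions that is the case at the upper end of the `random.uniform(0, 2.0)` range for data in [0, 1].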

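2. Even after scaling the noise down, the model still fails to train on the ntuhid dataset.

Figure: an ntuhid sample before noise is added.

Figure: an ntuhid sample after adding noise scaled to [0, 0.2].

Figure: the loss during pre-training. So far the pre-training loss barely changes.

Figure: test results during supervised training. They barely change over the course of training. (On some other datasets the accuracy reaches 70+%.)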