coreylammie / MemTorch

A Simulation Framework for Memristive Deep Learning Systems
GNU General Public License v3.0

The accuracy of my MDNN is too low, is there something wrong? #156

Closed zzzqqw closed 3 days ago

zzzqqw commented 11 months ago

Hello, I trained a simple network to recognize the MNIST dataset, and it reached an accuracy of 0.97 before I converted it to an MDNN. However, the accuracy of the MDNN is only around 0.10. What could be the reason? The code is as follows:

from torchvision.datasets import MNIST
from torch.utils.data import DataLoader
import torch
import torch.nn as nn
import numpy as np
from torchvision import datasets, transforms
import memtorch
import pandas as pd
import copy
from memtorch.mn.Module import patch_model
from memtorch.map.Parameter import naive_map
import os

class Model(nn.Module):
    def __init__(self):
        super(Model,self).__init__()
        self.linear1 = nn.Linear(784,256)
        self.linear2 = nn.Linear(256,64)
        self.linear3 = nn.Linear(64,10)

    def forward(self,x):
        x = x.view(-1,784)
        x = torch.relu(self.linear1(x))
        x = torch.relu(self.linear2(x))
        # Note: a final ReLU before CrossEntropyLoss is unusual; returning raw logits is more typical.
        x = torch.relu(self.linear3(x))
        return x

model = Model()
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
model = model.to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.8)

if not os.path.exists('saved_model'):
    os.mkdir('saved_model')

def train():
    for index,data in enumerate(train_loader):
        input,target = data
        input, target = input.to(device), target.to(device)
        optimizer.zero_grad()
        y_predict = model(input)
        loss = criterion(y_predict,target)
        loss.backward()
        optimizer.step()
        if index % 100 == 0:
            torch.save(model.state_dict(),"saved_model/simpleCNN_model_MNIST.pth")
            print("LOSS:%.2f" % loss.item())

def test(model1):
    correct = 0
    total = 0
    acc = 0
    model1.eval()
    with torch.no_grad():
        for data in test_loader:
            input,target = data
            input, target = input.to(device), target.to(device)
            output=model1(input)
            output = output.to(device)
            probability,predict=torch.max(output.data,dim=1)
            total += target.size(0)
            correct += (predict == target).sum().item()
        print("Accuracy:%.2f" % (correct / total))
        acc = correct / total
    return acc

# Compose takes a list (the original used a set literal), and the composed
# transform must be passed to the datasets to take effect.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])
train_data = MNIST(root='./data', train=True, download=True, transform=transform)
train_loader = DataLoader(train_data, shuffle=True, batch_size=64)
test_data = MNIST(root='./data', train=False, download=True, transform=transform)
test_loader = DataLoader(test_data, shuffle=False, batch_size=64)

"""
    Training
"""
for epoch in range(5):
    train()
    test(model)

"""
    DNN TO MDNN
"""
r_on = 1.4e4
r_off = 5e7

model.load_state_dict(torch.load("saved_model/simpleCNN_model_MNIST.pth"), strict=True)

_ = test(model) # print the baseline (pre-conversion) accuracy

def trial(r_on, r_off, tile_shape, ADC_resolution, sigma):
    model_ = copy.deepcopy(model)
    reference_memristor = memtorch.bh.memristor.VTEAM
    if sigma == 0.:
        reference_memristor_params = {'time_series_resolution': 1e-10, 'r_off': r_off, 'r_on': r_on}
    else:
        reference_memristor_params = {'time_series_resolution': 1e-10,
                                      'r_off': memtorch.bh.StochasticParameter(loc=r_off, scale=sigma * 2, min=1),
                                      'r_on': memtorch.bh.StochasticParameter(loc=r_on, scale=sigma, min=1)}

    patched_model = patch_model(copy.deepcopy(model_),
                                memristor_model=reference_memristor,
                                memristor_model_params=reference_memristor_params,
                                module_parameters_to_patch=[torch.nn.Linear],
                                mapping_routine=naive_map,
                                transistor=True,
                                programming_routine=memtorch.bh.crossbar.Program.naive_program,
                                scheme=memtorch.bh.Scheme.DoubleColumn,
                                tile_shape=tile_shape,
                                max_input_voltage=0.3,
                                ADC_resolution=int(ADC_resolution),
                                ADC_overflow_rate=0,
                                quant_method='linear')

    patched_model.tune_()
    return test(patched_model)

df = pd.DataFrame(columns=['tile_shape', 'ADC_resolution', 'sigma', 'test_set_accuracy'])
tile_shape = [(256, 64)]
ADC_resolution = np.linspace(2, 10, num=5, endpoint=True, dtype=int)
# Note: this sweeps sigma over 1e6 to 1e7, far larger than r_on (1.4e4);
# device-to-device variation this large can by itself collapse accuracy.
sigma = np.logspace(6, 7, endpoint=True, num=5)
for tile_shape_ in tile_shape:
    for ADC_resolution_ in ADC_resolution:
        for sigma_ in sigma:
            print('tile_shape: %s; ADC_resolution: %d; sigma: %g' % (tile_shape_, ADC_resolution_, sigma_))
            # DataFrame.append was removed in pandas 2.0; use pd.concat instead.
            row = {'tile_shape': tile_shape_,
                   'ADC_resolution': ADC_resolution_,
                   'sigma': sigma_,
                   'test_set_accuracy': trial(r_on, r_off, tile_shape_, ADC_resolution_, sigma_)}
            df = pd.concat([df, pd.DataFrame([row])], ignore_index=True)
            df.to_csv('simpleCNN_MDNN.csv', index=False)
RTCartist commented 10 months ago

@zzzqqw Have you solved it? I think this problem is caused by the linear-layer transformation from DNN to MDNN. I am working on a CNN regression network, and whenever I patch the linear layers the result is either wrong or has bad accuracy.
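One way to test this hypothesis (a sketch, reusing the patch_model arguments from this thread; model, reference_memristor, reference_memristor_params, and the test() helper are assumed to be defined as in the original post) is to leave torch.nn.Linear out of module_parameters_to_patch and check whether accuracy recovers:

patched_conv_only = patch_model(copy.deepcopy(model),
                                memristor_model=reference_memristor,
                                memristor_model_params=reference_memristor_params,
                                module_parameters_to_patch=[torch.nn.Conv2d],  # Linear omitted
                                mapping_routine=naive_map,
                                transistor=True,
                                programming_routine=None,
                                ADC_resolution=8,
                                ADC_overflow_rate=0.,
                                quant_method='linear')
patched_conv_only.tune_()
test(patched_conv_only)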

RTCartist commented 10 months ago

@zzzqqw You can try removing tile_shape=(256, 64) and max_input_voltage=0.3 when patching the model; the results may be much better.
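For reference, a minimal sketch of the suggested change, identical to the trial() call above except that tile_shape and max_input_voltage are omitted so patch_model falls back to its defaults:

patched_model = patch_model(copy.deepcopy(model_),
                            memristor_model=reference_memristor,
                            memristor_model_params=reference_memristor_params,
                            module_parameters_to_patch=[torch.nn.Linear],
                            mapping_routine=naive_map,
                            transistor=True,
                            programming_routine=memtorch.bh.crossbar.Program.naive_program,
                            scheme=memtorch.bh.Scheme.DoubleColumn,
                            ADC_resolution=int(ADC_resolution),
                            ADC_overflow_rate=0,
                            quant_method='linear')
patched_model.tune_()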

charanecer12 commented 9 months ago

Hi, I have tried installing the memtorch package by all of the given methods, but it throws an error during the process. I have attached a screenshot. Please help me out, or walk me through the process you followed to install it and proceed further.

(screenshot attached: err_mem)
spectacles9468 commented 7 months ago

In the setup file, one of the requirements is sklearn, but its PyPI name has changed to scikit-learn; change it.
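If you are installing from source, the fix is a one-line edit to the requirements list (a sketch; the surrounding contents of setup.py are elided):

# setup.py
install_requires = [
    "scikit-learn",  # was "sklearn", a name PyPI now rejects at install time
    # ... remaining requirements unchanged
]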

stale[bot] commented 5 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

24367452 commented 4 months ago

Hello, I also encountered the problem of low accuracy. Is it because I am using MemTorch incorrectly? If you could help me, I would be very grateful.

# Imports assumed by this snippet:
import copy
import torch
import numpy as np
import memtorch
from memtorch.mn.Module import patch_model
from memtorch.map.Parameter import naive_map
from sklearn.metrics import mean_squared_error

memristor_model = SimpleCNN()
# strict=False silently ignores mismatched keys, leaving those weights at random init.
memristor_model.load_state_dict(torch.load(r"D:\MemTorch_CNN\memristor_cnn\model.ckpt"), strict=False)
# Create new reference memristor
reference_memristor = memtorch.bh.memristor.VTEAM
reference_memristor_params = {"time_series_resolution": 1e-10}
memristor = reference_memristor(**reference_memristor_params)
memristor.plot_hysteresis_loop()

# Patch the Conv1d and Linear layers with memristive equivalents
patched_model = patch_model(copy.deepcopy(memristor_model),
                            memristor_model=reference_memristor,
                            memristor_model_params=reference_memristor_params,
                            module_parameters_to_patch=[torch.nn.Conv1d,torch.nn.Linear],
                            mapping_routine=naive_map,
                            transistor=True,
                            programming_routine=None,
                            tile_shape=(128, 128),
                            max_input_voltage=1.0,
                            ADC_resolution=8,
                            ADC_overflow_rate=0.,
                            quant_method='linear')

patched_model.eval()
with torch.no_grad():
    Y_train_pre = []
    all_Y_train = []
    for X_batch, Y_batch in train_loader:
        X_batch = X_batch.view(X_batch.size(0), 1, 4)
        outputs = patched_model(X_batch)
        Y_train_pre.append(outputs)
        all_Y_train.extend(Y_batch.numpy())
    Y_train_pre = torch.cat(Y_train_pre, dim=0).numpy()

    Y_test_pre = []
    all_Y_test = []
    for X_batch, Y_batch in test_loader:
        X_batch = X_batch.view(X_batch.size(0), 1, 4)
        outputs = patched_model(X_batch)
        Y_test_pre.append(outputs)
        all_Y_test.extend(Y_batch.numpy())
    Y_test_pre = torch.cat(Y_test_pre, dim=0).numpy()

Y_train_pre = Y_train_pre * label_std + label_mean
Y_test_pre = Y_test_pre * label_std + label_mean
Y_train = np.array(all_Y_train)
Y_test = np.array(all_Y_test)

Y_train = Y_train * label_std + label_mean
Y_test = Y_test * label_std + label_mean

Y_train_error = np.sqrt(np.sum((Y_train - Y_train_pre) ** 2, axis=1))
Y_test_error = np.sqrt(np.sum((Y_test - Y_test_pre) ** 2, axis=1))

print('Y_train_error_max=', np.max(Y_train_error))
print('Y_train_error_min=', np.min(Y_train_error))
print('Y_train_error_mean=', np.mean(Y_train_error))

print('Y_test_error_max=', np.max(Y_test_error))
print('Y_test_error_min=', np.min(Y_test_error))
print('Y_test_error_mean=', np.mean(Y_test_error))

RMSE = np.sqrt(mean_squared_error(Y_test, Y_test_pre))
print('RMSE=', RMSE)

The output of the above code is as follows; every prediction appears to be the same fixed value:

[-0.1142537  -0.07591234]
 [-0.1142537  -0.07591234]
 [-0.1142537  -0.07591234]
 [-0.1142537  -0.07591234]
 [-0.1142537  -0.07591234]
 [-0.1142537  -0.07591234]
 [-0.1142537  -0.07591234]
 [-0.1142537  -0.07591234]
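Constant outputs can also be produced by a checkpoint that silently failed to load; since the snippet uses strict=False, one quick check (a sketch, assuming the same SimpleCNN and checkpoint path as above) is to inspect the key lists that load_state_dict returns:

# With strict=False, load_state_dict returns an _IncompatibleKeys named tuple;
# non-empty lists mean part of the checkpoint never reached the model.
result = memristor_model.load_state_dict(
    torch.load(r"D:\MemTorch_CNN\memristor_cnn\model.ckpt"), strict=False)
print("missing keys:", result.missing_keys)
print("unexpected keys:", result.unexpected_keys)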
24367452 commented 4 months ago

Hello, can you help me check where the problem lies? @RTCartist

RTCartist commented 4 months ago

Please try adjusting the tile_shape and max_input_voltage parameters when patching the model. As I understand it, MemTorch's mechanism is to add extra matrix computation, governed by the patching parameters, on top of the original ideal software network. These parameters influence the results a lot, and under some settings they seem to cause problems or faults. So please try modifying the parameters; I hope that works.
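A small sweep makes this concrete (a sketch, reusing the patch_model call from the snippet above; memristor_model, reference_memristor, reference_memristor_params, and a test() accuracy helper are assumed):

# Sketch: grid-search tile_shape and max_input_voltage around the original values.
for tile_shape in [None, (64, 64), (128, 128)]:
    for max_input_voltage in [None, 0.3, 1.0]:
        pm = patch_model(copy.deepcopy(memristor_model),
                         memristor_model=reference_memristor,
                         memristor_model_params=reference_memristor_params,
                         module_parameters_to_patch=[torch.nn.Conv1d, torch.nn.Linear],
                         mapping_routine=naive_map,
                         transistor=True,
                         programming_routine=None,
                         tile_shape=tile_shape,
                         max_input_voltage=max_input_voltage,
                         ADC_resolution=8,
                         ADC_overflow_rate=0.,
                         quant_method='linear')
        pm.tune_()
        print(tile_shape, max_input_voltage, test(pm))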

24367452 commented 4 months ago


Thank you for your answer. As you mentioned, the parameters have a significant impact.

24367452 commented 4 months ago

Hello, I'm sorry to bother you again, but I really can't find a solution to this problem. When the finite-conductance-states non-ideality is changed, the accuracy of the model does not change; even when the number of conductance states is set to 0 there is no change. I think it is because the actual quantization step, memtorch_bindings.quantize(tensor, n_quant_levels=quant, min=min, max=max), is never executed, so it has no effect. Do you have any solution? @RTCartist
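A direct call to the binding can confirm whether quantization has any effect at all (a sketch; the module and argument names are assumed from the call quoted above, and the binding may quantize in place):

import torch
import memtorch_bindings  # name assumed from the call quoted above

t = torch.linspace(0, 1, steps=10)
q = t.clone()
memtorch_bindings.quantize(q, n_quant_levels=4, min=0, max=1)
print(t)
print(q)  # if q still equals t, the quantization path is indeed a no-op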

24367452 commented 4 months ago

@RTCartist Hello, could you take a look at why this happens?

RTCartist commented 4 months ago

@RTCartist Hello, could you take a look at why this happens?

I am not familiar with this problem; sorry, I can't help you.

24367452 commented 4 months ago

@RTCartist Hello, could you take a look at why this happens?

I am not familiar with this problem; sorry, I can't help you.

Thank you for your attention and your answer! This question is indeed difficult to solve.

RTCartist commented 4 months ago

This repo really needs maintenance. Why not use another framework to simulate memristor arrays for AI applications, such as MNSIM?


24367452 commented 4 months ago


I did try another framework a few days ago, IBM AIHWKIT, but after installation there was also a problem I couldn't solve at runtime, so I came back to MemTorch.

stale[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.