tarotez / sleepstages


size mismatch errors #4

Closed mitometa closed 1 year ago

mitometa commented 1 year ago

Thank you for sharing the source code!

I tried to run the demo, but it failed with size-mismatch errors from torch on both Ubuntu and macOS.


$ python app.py m

output:

in ParameterSetup, paramFilePath = ../data/params/params.json
W98DEW , UTSN-L , 128 , 10
6AJRX4 , UTSN , 128 , 10
demo mode: reading inputFileID= sample
in ParameterSetup, paramFilePath = ../data/params/params.json
generating extractor: 
in ParameterSetup, paramFilePath = ../data/params/params.json
in ParameterSetup, paramFilePath = ../data/finalclassifier/params.W98DEW.json
model_path =  ../data/finalclassifier/weights.W98DEW.pkl
loading weights in deepClassifier.py from ../data/finalclassifier/weights.W98DEW.pkl
in deepClassifier.generateModel, params.networkType = cnn_lstm
using CNN-LSTM for raw data
filter_nums = [64, 64, 64, 64, 64, 64, 64, 64]
kernel_sizes = [9, 9, 9, 9, 7, 7, 7, 7]
strides = [1, 2, 2, 2, 2, 2, 2, 2]
compiling the model
Exception in self.client = ...
Traceback (most recent call last):
  File "app.py", line 512, in <module>
    mainapp = RemApplication(host, port, args)
  File "app.py", line 80, in __init__
    self.initUI()
  File "app.py", line 502, in initUI
    raise e
  File "app.py", line 486, in initUI
    samplingFreq=self.model_samplingFreq, epochTime=self.model_epochTime)
  File "/sleepstages-main/code/classifierClient.py", line 68, in __init__
    self.setStagePredictor(classifierID)
  File "/sleepstages-main/code/classifierClient.py", line 164, in setStagePredictor
    classifier.load_weights(model_path)
  File "/sleepstages-main/code/deepClassifier.py", line 637, in load_weights
    self.model.load_state_dict(torch.load(weight_path, map_location='cpu'), False)
  File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1605, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for cnn_lstm:
    size mismatch for batns_for_stft.1.weight: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([16]).
    size mismatch for batns_for_stft.1.bias: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([16]).
    size mismatch for batns_for_stft.1.running_mean: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([16]).
    size mismatch for batns_for_stft.1.running_var: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([16]).
    size mismatch for batns_for_stft.2.weight: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([16]).
    size mismatch for batns_for_stft.2.bias: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([16]).
    size mismatch for batns_for_stft.2.running_mean: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([16]).
    size mismatch for batns_for_stft.2.running_var: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([16]).
    size mismatch for batns_for_stft.3.weight: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([16]).
    size mismatch for batns_for_stft.3.bias: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([16]).
    size mismatch for batns_for_stft.3.running_mean: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([16]).
    size mismatch for batns_for_stft.3.running_var: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([16]).
    size mismatch for convs_for_stft.0.weight: copying a param with shape torch.Size([8, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 1, 3, 3]).
    size mismatch for convs_for_stft.1.weight: copying a param with shape torch.Size([8, 8, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
    size mismatch for convs_for_stft.2.weight: copying a param with shape torch.Size([8, 8, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
    size mismatch for convs_for_stft.3.weight: copying a param with shape torch.Size([8, 8, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
    size mismatch for batn_combined.weight: copying a param with shape torch.Size([688]) from checkpoint, the shape in current model is torch.Size([736]).
    size mismatch for batn_combined.bias: copying a param with shape torch.Size([688]) from checkpoint, the shape in current model is torch.Size([736]).
    size mismatch for batn_combined.running_mean: copying a param with shape torch.Size([688]) from checkpoint, the shape in current model is torch.Size([736]).
    size mismatch for batn_combined.running_var: copying a param with shape torch.Size([688]) from checkpoint, the shape in current model is torch.Size([736]).
    size mismatch for final_fc_no_lstm.weight: copying a param with shape torch.Size([3, 688]) from checkpoint, the shape in current model is torch.Size([3, 736]).
    size mismatch for fulc_combined_lstm.weight: copying a param with shape torch.Size([32, 688]) from checkpoint, the shape in current model is torch.Size([32, 736]).
mitometa commented 1 year ago

Tested on Ubuntu with:

pytorch-ignite 0.4.10, torch 1.13.0, torchsummary 1.5.1, scikit-learn 1.1.3

Tested on macOS with:

pytorch-ignite 0.4.10, torch 1.12.1, torchsummary 1.5.1, scikit-learn 1.0.2

The errors are reproducible on both systems.

mitometa commented 1 year ago

Temporary workaround: drop the mismatched checkpoint entries before loading the state dict.

$ nano +617 deepClassifier.py

            # drop every checkpoint entry whose shape no longer matches the model
            new_weights = torch.load(weight_path, map_location='cpu')
            mismatched_keys = [
                "batns_for_stft.1.bias", "batns_for_stft.1.weight",
                "batns_for_stft.1.running_mean", "batns_for_stft.1.running_var",
                "batns_for_stft.2.bias", "batns_for_stft.2.weight",
                "batns_for_stft.2.running_mean", "batns_for_stft.2.running_var",
                "batns_for_stft.3.bias", "batns_for_stft.3.weight",
                "batns_for_stft.3.running_mean", "batns_for_stft.3.running_var",
                "convs_for_stft.0.weight", "convs_for_stft.1.weight",
                "convs_for_stft.2.weight", "convs_for_stft.3.weight",
                "batn_combined.weight", "batn_combined.bias",
                "batn_combined.running_mean", "batn_combined.running_var",
                "final_fc_no_lstm.weight", "fulc_combined_lstm.weight",
            ]
            for key in mismatched_keys:
                new_weights.pop(key)
            self.model.load_state_dict(new_weights, strict=False)
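Instead of hard-coding the mismatched keys, the same workaround can be written generically by comparing each checkpoint entry's shape against the current model's state dict (in real use, the two dicts would come from `torch.load(weight_path, map_location='cpu')` and `self.model.state_dict()`). A minimal sketch; `filter_matching_params` and the stand-in `FakeTensor` are hypothetical names used only for illustration:

```python
from collections import namedtuple

def filter_matching_params(checkpoint, model_state):
    """Keep only checkpoint entries whose name and shape match the model."""
    kept, dropped = {}, []
    for name, tensor in checkpoint.items():
        if name in model_state and tuple(tensor.shape) == tuple(model_state[name].shape):
            kept[name] = tensor
        else:
            dropped.append(name)
    return kept, dropped

# stand-in for a torch tensor, so the sketch runs without torch installed
FakeTensor = namedtuple("FakeTensor", "shape")

ckpt = {"conv.weight": FakeTensor((8, 1, 3, 3)),   # trained with 8 filters
        "fc.weight": FakeTensor((3, 688))}
model = {"conv.weight": FakeTensor((16, 1, 3, 3)),  # current model expects 16
         "fc.weight": FakeTensor((3, 688))}

kept, dropped = filter_matching_params(ckpt, model)
# dropped -> ['conv.weight']
```

With real tensors, the result would be passed to `self.model.load_state_dict(kept, strict=False)`; note that `strict=False` only tolerates missing/unexpected keys, not shape mismatches, which is why the mismatched entries must be removed first. Keep in mind that any dropped layers keep their randomly initialized weights, so this only restores the demo to a runnable state rather than reproducing the trained model.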