ubicomplab / rPPG-Toolbox

rPPG-Toolbox: Deep Remote PPG Toolbox (NeurIPS 2023)
https://arxiv.org/abs/2210.00716

TSCAN: size mismatch between model parameters and the pretrained model parameters #156

Closed Changezi001 closed 1 year ago

Changezi001 commented 1 year ago

Hi,

First of all, thank you for this amazing work and for sharing it.

I want to use the pretrained TSCAN model to extract the rPPG signal from my own videos, but when I load the pretrained weights from PURE_TSCAN.pth or UBFC_TSCAN.pth, I get the following error:

size mismatch for final_dense_1.weight: copying a param with shape torch.Size([128, 16384]) from checkpoint, the shape in current model is torch.Size([128, 57600]).

My code and its output are given below. Could you please tell me why this mismatch between the parameters exists and how to solve it?

```python
import torch

from neural_methods.model.TS_CAN import TSCAN
from config import get_config


def predict_vitals():
    # Configurations.
    config = get_config()
    print('Configuration:')
    print(config, end='\n\n')

    frame_depth = config.MODEL.TSCAN.FRAME_DEPTH

    # if torch.cuda.is_available():
    kam_device = torch.device("cuda")  # Use CUDA device
    # img_size is taken from the config (RESIZE.H = 128 here)
    TSCAN_model = TSCAN(frame_depth=frame_depth, img_size=config.TEST.DATA.PREPROCESS.RESIZE.H).to(kam_device)

    # Get the state dictionary of the model
    TSCAN_model_state_dict = TSCAN_model.state_dict()

    print('************ keys and size of kam_model_state_dict ***************')
    # Print keys and sizes of parameters
    for key, value in TSCAN_model_state_dict.items():
        print(f'Key: {key}\tSize: {value.size()}')
    print('***************************')

    # Load the saved state_dict
    state_dict = torch.load('final_model_release/UBFC_TSCAN.pth')

    print('************ keys and size of old state dict ***************')
    # Print keys and sizes of parameters
    for key, value in state_dict.items():
        print(f'Key: {key}\tSize: {value.size()}')
    print('***************************')

    # Create a new state_dict with the 'module.' prefix removed from keys
    new_state_dict = {}
    for key, value in state_dict.items():
        if key.startswith('module.'):
            new_key = key[len('module.'):]
        else:
            new_key = key
        new_state_dict[new_key] = value

    # Print keys and sizes of parameters
    print('************ keys and size of new state dict ***************')
    for key, value in new_state_dict.items():
        print(f'Key: {key}\tSize: {value.size()}')
    print('***************************')

    # Use the new_state_dict in the model
    TSCAN_model.load_state_dict(new_state_dict)


if __name__ == "__main__":
    predict_vitals()
```
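As a quick sanity check (not part of the original script), the input resolution a checkpoint expects can be read off the checkpoint itself before the model is built. A minimal sketch; the key name follows the printout further down and the path is the same one used above:

```python
import torch

# Inspect the checkpoint to see which input size final_dense_1 was trained for.
ckpt = torch.load('final_model_release/UBFC_TSCAN.pth', map_location='cpu')
fc1_weight = ckpt['module.final_dense_1.weight']          # checkpoint keys carry a 'module.' prefix
print('final_dense_1 in_features:', fc1_weight.shape[1])  # 16384 here; per the discussion below, this corresponds to 72x72 inputs
```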

**Outputs**

```
Configuration:
BASE: ['']
DEVICE: cuda:0
INFERENCE:
  BATCH_SIZE: 4
  EVALUATION_METHOD: FFT
  EVALUATION_WINDOW: USE_SMALLER_WINDOW: False WINDOW_SIZE: 10
  MODEL_PATH:
LOG:
  PATH: runs/exp
MODEL:
  DROP_RATE: 0.0
  EFFICIENTPHYS: FRAME_DEPTH: 10
  MODEL_DIR: PreTrainedModels
  NAME:
  PHYSNET: FRAME_NUM: 64
  RESUME:
  TSCAN: FRAME_DEPTH: 10
NUM_OF_GPU_TRAIN: 1
TEST:
  DATA: BEGIN: 0.0 CACHED_PATH: PreprocessedData DATASET: DATA_FORMAT: NDCHW DATA_PATH: DO_PREPROCESS: False END: 1.0 EXP_DATA_NAME: FILE_LIST_PATH: PreprocessedData/DataFileLists FS: 0 INFO: EXERCISE: [True] GENDER: [''] GLASSER: [True] HAIR_COVER: [True] LIGHT: [''] MAKEUP: [True] MOTION: [''] SKIN_COLOR: [1] PREPROCESS: CHUNK_LENGTH: 180 CROP_FACE: DETECTION: DO_DYNAMIC_DETECTION: False DYNAMIC_DETECTION_FREQUENCY: 30 USE_MEDIAN_FACE_BOX: False DO_CROP_FACE: True LARGE_BOX_COEF: 1.5 USE_LARGE_FACE_BOX: True DATA_AUG: ['None'] DATA_TYPE: [''] DO_CHUNK: True LABEL_TYPE: RESIZE: H: 128 W: 128
  METRICS: []
  USE_LAST_EPOCH: True
TOOLBOX_MODE:
TRAIN:
  BATCH_SIZE: 4
  DATA: BEGIN: 0.0 CACHED_PATH: PreprocessedData DATASET: DATA_FORMAT: NDCHW DATA_PATH: DO_PREPROCESS: False END: 1.0 EXP_DATA_NAME: FILE_LIST_PATH: PreprocessedData/DataFileLists FS: 0 INFO: EXERCISE: [True] GENDER: [''] GLASSER: [True] HAIR_COVER: [True] LIGHT: [''] MAKEUP: [True] MOTION: [''] SKIN_COLOR: [1] PREPROCESS: CHUNK_LENGTH: 180 CROP_FACE: DETECTION: DO_DYNAMIC_DETECTION: False DYNAMIC_DETECTION_FREQUENCY: 30 USE_MEDIAN_FACE_BOX: False DO_CROP_FACE: True LARGE_BOX_COEF: 1.5 USE_LARGE_FACE_BOX: True DATA_AUG: ['None'] DATA_TYPE: [''] DO_CHUNK: True LABEL_TYPE: RESIZE: H: 128 W: 128 USE_PSUEDO_PPG_LABEL: False
  EPOCHS: 50
  LR: 0.0001
  MODEL_FILE_NAME:
  OPTIMIZER: BETAS: (0.9, 0.999) EPS: 0.0001 MOMENTUM: 0.9
```

```
/home/kamranali/anaconda3/envs/rppg-toolbox/lib/python3.8/site-packages/torch/cuda/__init__.py:146: UserWarning: NVIDIA GeForce RTX 3090 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37.
If you want to use the NVIDIA GeForce RTX 3090 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
  warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name)

************ keys and size of kam_model_state_dict ***************
Key: motion_conv1.weight           Size: torch.Size([32, 3, 3, 3])
Key: motion_conv1.bias             Size: torch.Size([32])
Key: motion_conv2.weight           Size: torch.Size([32, 32, 3, 3])
Key: motion_conv2.bias             Size: torch.Size([32])
Key: motion_conv3.weight           Size: torch.Size([64, 32, 3, 3])
Key: motion_conv3.bias             Size: torch.Size([64])
Key: motion_conv4.weight           Size: torch.Size([64, 64, 3, 3])
Key: motion_conv4.bias             Size: torch.Size([64])
Key: apperance_conv1.weight        Size: torch.Size([32, 3, 3, 3])
Key: apperance_conv1.bias          Size: torch.Size([32])
Key: apperance_conv2.weight        Size: torch.Size([32, 32, 3, 3])
Key: apperance_conv2.bias          Size: torch.Size([32])
Key: apperance_conv3.weight        Size: torch.Size([64, 32, 3, 3])
Key: apperance_conv3.bias          Size: torch.Size([64])
Key: apperance_conv4.weight        Size: torch.Size([64, 64, 3, 3])
Key: apperance_conv4.bias          Size: torch.Size([64])
Key: apperance_att_conv1.weight    Size: torch.Size([1, 32, 1, 1])
Key: apperance_att_conv1.bias      Size: torch.Size([1])
Key: apperance_att_conv2.weight    Size: torch.Size([1, 64, 1, 1])
Key: apperance_att_conv2.bias      Size: torch.Size([1])
Key: final_dense_1.weight          Size: torch.Size([128, 57600])
Key: final_dense_1.bias            Size: torch.Size([128])
Key: final_dense_2.weight          Size: torch.Size([1, 128])
Key: final_dense_2.bias            Size: torch.Size([1])
```


```
===Testing===

************ keys and size of old state dict ***************
Key: module.motion_conv1.weight           Size: torch.Size([32, 3, 3, 3])
Key: module.motion_conv1.bias             Size: torch.Size([32])
Key: module.motion_conv2.weight           Size: torch.Size([32, 32, 3, 3])
Key: module.motion_conv2.bias             Size: torch.Size([32])
Key: module.motion_conv3.weight           Size: torch.Size([64, 32, 3, 3])
Key: module.motion_conv3.bias             Size: torch.Size([64])
Key: module.motion_conv4.weight           Size: torch.Size([64, 64, 3, 3])
Key: module.motion_conv4.bias             Size: torch.Size([64])
Key: module.apperance_conv1.weight        Size: torch.Size([32, 3, 3, 3])
Key: module.apperance_conv1.bias          Size: torch.Size([32])
Key: module.apperance_conv2.weight        Size: torch.Size([32, 32, 3, 3])
Key: module.apperance_conv2.bias          Size: torch.Size([32])
Key: module.apperance_conv3.weight        Size: torch.Size([64, 32, 3, 3])
Key: module.apperance_conv3.bias          Size: torch.Size([64])
Key: module.apperance_conv4.weight        Size: torch.Size([64, 64, 3, 3])
Key: module.apperance_conv4.bias          Size: torch.Size([64])
Key: module.apperance_att_conv1.weight    Size: torch.Size([1, 32, 1, 1])
Key: module.apperance_att_conv1.bias      Size: torch.Size([1])
Key: module.apperance_att_conv2.weight    Size: torch.Size([1, 64, 1, 1])
Key: module.apperance_att_conv2.bias      Size: torch.Size([1])
Key: module.final_dense_1.weight          Size: torch.Size([128, 16384])
Key: module.final_dense_1.bias            Size: torch.Size([128])
Key: module.final_dense_2.weight          Size: torch.Size([1, 128])
Key: module.final_dense_2.bias            Size: torch.Size([1])
```


```
************ keys and size of new state dict ***************
Key: motion_conv1.weight           Size: torch.Size([32, 3, 3, 3])
Key: motion_conv1.bias             Size: torch.Size([32])
Key: motion_conv2.weight           Size: torch.Size([32, 32, 3, 3])
Key: motion_conv2.bias             Size: torch.Size([32])
Key: motion_conv3.weight           Size: torch.Size([64, 32, 3, 3])
Key: motion_conv3.bias             Size: torch.Size([64])
Key: motion_conv4.weight           Size: torch.Size([64, 64, 3, 3])
Key: motion_conv4.bias             Size: torch.Size([64])
Key: apperance_conv1.weight        Size: torch.Size([32, 3, 3, 3])
Key: apperance_conv1.bias          Size: torch.Size([32])
Key: apperance_conv2.weight        Size: torch.Size([32, 32, 3, 3])
Key: apperance_conv2.bias          Size: torch.Size([32])
Key: apperance_conv3.weight        Size: torch.Size([64, 32, 3, 3])
Key: apperance_conv3.bias          Size: torch.Size([64])
Key: apperance_conv4.weight        Size: torch.Size([64, 64, 3, 3])
Key: apperance_conv4.bias          Size: torch.Size([64])
Key: apperance_att_conv1.weight    Size: torch.Size([1, 32, 1, 1])
Key: apperance_att_conv1.bias      Size: torch.Size([1])
Key: apperance_att_conv2.weight    Size: torch.Size([1, 64, 1, 1])
Key: apperance_att_conv2.bias      Size: torch.Size([1])
Key: final_dense_1.weight          Size: torch.Size([128, 16384])
Key: final_dense_1.bias            Size: torch.Size([128])
Key: final_dense_2.weight          Size: torch.Size([1, 128])
Key: final_dense_2.bias            Size: torch.Size([1])
```


```
Traceback (most recent call last):
  File "/home/kamranali/Kamran_data/Pycharm_projects/kamran_rPPG-Toolbox-main/my_test.py", line 271, in <module>
    predict_vitals()
  File "/home/kamranali/Kamran_data/Pycharm_projects/kamran_rPPG-Toolbox-main/my_test.py", line 222, in predict_vitals
    kam_model.load_state_dict(new_state_dict)
  File "/home/kamranali/anaconda3/envs/rppg-toolbox/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1604, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for TSCAN:
        size mismatch for final_dense_1.weight: copying a param with shape torch.Size([128, 16384]) from checkpoint, the shape in current model is torch.Size([128, 57600]).
```

Process finished with exit code 1

Changezi001 commented 1 year ago

Oh okay, I found the answer: I have to set img_size = 72 so that the model is built with self.final_dense_1 = nn.Linear(16384, self.nb_dense, bias=True), which matches the pretrained checkpoint.
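For anyone who hits the same error, a minimal sketch of the corrected loading step (the path and frame_depth=10 are taken from the script above, and the 'module.' prefix is stripped the same way):

```python
import torch
from neural_methods.model.TS_CAN import TSCAN

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Build the model at the resolution the released checkpoints were trained with (72x72),
# so final_dense_1 is created with in_features=16384 and matches the saved weights.
model = TSCAN(frame_depth=10, img_size=72).to(device)

state_dict = torch.load('final_model_release/UBFC_TSCAN.pth', map_location=device)
# The checkpoint keys carry a 'module.' prefix (saved from a wrapped model), so strip it.
state_dict = {k.replace('module.', '', 1): v for k, v in state_dict.items()}
model.load_state_dict(state_dict)
model.eval()
```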

girishvn commented 1 year ago

Hi @Changezi001,

Glad you found the issue. Just as a note, all of the pretrained models are trained using 72x72 inputs. Another note: rPPG research has shown some benefit from using lower-resolution inputs, as this acts as a form of filtering for high-frequency spatial noise and reduces model compute.
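In practice this means frames from your own videos also need to be resized to 72x72 before being fed to the pretrained models. The toolbox's preprocessing handles this when run with a matching config; the snippet below is only an illustrative sketch using OpenCV:

```python
import cv2
import numpy as np

def resize_frames(frames, size=72):
    """Resize a stack of frames (N, H, W, C) to the 72x72 resolution the pretrained TS-CAN expects."""
    return np.stack(
        [cv2.resize(frame, (size, size), interpolation=cv2.INTER_AREA) for frame in frames]
    )
```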

Best, Girish