ubicomplab / rPPG-Toolbox

rPPG-Toolbox: Deep Remote PPG Toolbox (NeurIPS 2023)
https://arxiv.org/abs/2210.00716

The issue of channel mismatch. #214

Closed TendernessLk closed 11 months ago

TendernessLk commented 11 months ago

Hi, I used one of the models and loaded the weight parameters, but why does the number of channels become 0 when the input reaches the model? Code and error below.

model = DeepPhys()
weights_name = 'PURE_DeepPhys.pth'

frame_tensor = torch.from_numpy(np.transpose(frame, (2, 0, 1))).float() / 255.0
frame_tensor = frame_tensor.unsqueeze(0)
frame_tensor = frame_tensor.to(torch.device("cuda:0"))
print(frame_tensor.shape)

with torch.no_grad():
    output = model(frame_tensor)

Output:

torch.Size([1, 3, 72, 72])
RuntimeError: Given groups=1, weight of size [32, 3, 3, 3], expected input[1, 0, 72, 72] to have 3 channels, but got 0 channels instead

yahskapar commented 11 months ago

Hi @TendernessLk,

I'm a bit confused by your code snippet without more context - can you provide the full code so that I can have some idea as to where frame is coming from exactly and how you're even loading the model? My guess is frame is not preprocessed correctly if it's meant to be video data from a dataset like UBFC-rPPG or PURE.

Note what DeepPhys expects as inputs based on the first few lines of the forward-pass itself:

https://github.com/ubicomplab/rPPG-Toolbox/blob/ff597bf8eb55c9db961709ec8b32fb142663fd78/neural_methods/model/DeepPhys.py#L88-L89

That's effectively a sequence of video frames (composed from a batch in the toolbox), so [frames, channels, height_dimension, width_dimension]. In the above lines, one would expect 6 channels as input if the input was pre-processed correctly.
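To illustrate with a minimal numpy sketch (not toolbox code): the linked forward pass effectively splits the input channels into a motion half and an appearance half, so a 3-channel input leaves the appearance slice empty, which is exactly the "0 channels" error above.

```python
import numpy as np

# DeepPhys expects [N, 6, H, W]: the first three channels feed the
# motion branch, the last three feed the appearance branch. With an
# RGB-only input, the second slice comes out empty.
x = np.zeros((1, 3, 72, 72))   # the 3-channel input from the snippet above
diff_input = x[:, :3, :, :]    # motion branch input: 3 channels, fine
raw_input = x[:, 3:, :, :]     # appearance branch input: empty!
print(raw_input.shape)         # (1, 0, 72, 72)
```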

As general advice, I recommend taking a look at an example config file for DeepPhys and the below lines of code and their corresponding functions that relate to how inputs are preprocessed for supervised neural methods (such as DeepPhys):

https://github.com/ubicomplab/rPPG-Toolbox/blob/ff597bf8eb55c9db961709ec8b32fb142663fd78/dataset/data_loader/BaseLoader.py#L234-L254
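For illustration, here's a rough numpy sketch of what that preprocessing produces (simplified from the linked `BaseLoader.py` functions; the real `diff_normalize_data` works frame-by-frame and also guards against NaNs):

```python
import numpy as np

def diff_normalize_data(frames):
    """Difference frames (f[t+1]-f[t]) / (f[t+1]+f[t]), scaled by their
    std, with a zero frame appended so the length matches the raw video."""
    diffs = (frames[1:] - frames[:-1]) / (frames[1:] + frames[:-1] + 1e-7)
    diffs = diffs / np.std(diffs)
    return np.concatenate([diffs, np.zeros_like(frames[:1])], axis=0)

def standardized_data(frames):
    """Zero-mean, unit-variance standardization of the raw frames."""
    return (frames - np.mean(frames)) / np.std(frames)

# frames: [T, H, W, 3] float video -> [T, 6, H, W] DeepPhys input
frames = np.random.rand(8, 72, 72, 3).astype(np.float32)
six_ch = np.concatenate(
    [diff_normalize_data(frames), standardized_data(frames)], axis=-1)
model_input = np.transpose(six_ch, (0, 3, 1, 2))
print(model_input.shape)  # (8, 6, 72, 72)
```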

If you're using the DeepPhys model in some external pipeline of yours, I strongly recommend understanding the above code first so you know what kind of input the DeepPhys model would ultimately take in, for example in the trainer file here.

EDIT: Also, please share the DeepPhys model file you're using if you happened to make any changes to it. If you did make changes, keep in mind that it's not reasonable to re-use a DeepPhys pre-trained model generated from the unaugmented model in the toolbox, especially if the layers were modified.

TendernessLk commented 11 months ago

Hi,

I understand my issue now. I used an RGB (three-channel) video as my input, primarily to see what the model would output in the end. However, that only gives three channels. How can I get six channels? What does the 'diff_input' correspond to? Thank you for your assistance!


yahskapar commented 11 months ago

@TendernessLk,

It'll be difficult to help you properly without more context (e.g., a full code example versus the snippet you provided earlier). As I mentioned in my previous reply, based on the information you provided it sounds like the video you're trying to use hasn't been properly preprocessed in what I'm guessing is your custom code built on this toolbox.

Take a careful look at the highlighted code in BaseLoader.py here and let me know if you have any specific questions. You will have a properly pre-processed video, with six channels, if you replicate those preprocessing steps. At that point, you will also be able to properly use the DeepPhys model, which requires those six channels since three of them (diff-normalized frames) are fed into a motion model and the other three (raw frames) are fed into an appearance model that is a part of the DeepPhys architecture. You may also wish to chunk your input video, using a function such as this.
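For example, a simplified sketch of such chunking (the toolbox's chunking function also chunks the corresponding labels; `chunk_length` below is just illustrative):

```python
import numpy as np

def chunk_frames(frames, chunk_length=180):
    """Split [T, ...] frames into consecutive clips of chunk_length
    frames each, dropping any incomplete trailing chunk."""
    n_clips = frames.shape[0] // chunk_length
    return np.stack([frames[i * chunk_length:(i + 1) * chunk_length]
                     for i in range(n_clips)])

clips = chunk_frames(np.zeros((400, 6, 72, 72)), chunk_length=180)
print(clips.shape)  # (2, 180, 6, 72, 72)
```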

TendernessLk commented 11 months ago

Thanks for your prompt reply. I understand the issue now. Thank you again! Thank you for your work!

Sincerely, K


yahskapar commented 11 months ago

No problem! I'll go ahead and close this issue for the time being. Feel free to make a new one if you have any other questions or concerns with the toolbox.