ahmetgunduz / Real-time-GesRec

Real-time Hand Gesture Recognition with PyTorch on EgoGesture, NvGesture, Jester, Kinetics and UCF101
https://arxiv.org/abs/1901.10323
MIT License

How to use the depth model to train a color model on the EgoGesture dataset? #66

Closed 11WUYU closed 4 years ago

11WUYU commented 4 years ago

Hello dear author. I want to test RGB images on the EgoGesture dataset, so I need an RGB model. Since I could not find a pretrained RGB model, I tried to train the RGB model starting from the depth model, but I get an error that the number of channels does not match. How can I solve this?

Also, can the detector be trained with depth data as well? If not, how do I train an RGB model for the detector?

Here are the parameters I used for training. In opt.py I changed no_train and no_val to true; I am not sure whether modifying it like this is correct.

#!/bin/bash

python3 main.py \
--root_path ~/ \
--video_path /dataset/EGO \
--annotation_path /Real-time-GesRec/annotation_EgoGestur/egogestureall_but_None.json \
--result_path Real-time-GesRec/results \
--pretrain_path /Real-time-GesRec/models/egogesture_resnext_101_Depth_32.pth \
--dataset egogesture \
--sample_duration 32 \
--learning_rate 0.01 \
--model resnext \
--model_depth 101 \
--resnet_shortcut B \
--batch_size 1 \
--n_classes 83 \
--n_finetune_classes 83 \
--n_threads 16 \
--checkpoint 1 \
--modality Color \
--train_crop random \
--n_val_samples 1 \
--test_subset test \
--n_epochs 100 \
--no_train \
--no_val \
--test

11WUYU commented 4 years ago

The error is:

File "main.py", line 65
    model, parameters = generate_model(opt)
File "/home/wuyu/hand_gesture/paper/Real-time-GesRec-master/model.py", line 67, in generate_model
    model.load_state_dict(pretrain['state_dict'])
File "/home/wuyu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DataParallel:
    size mismatch for module.conv1.weight: copying a param with shape torch.Size([64, 1, 7, 7, 7]) from checkpoint, the shape in current model is torch.Size([64, 3, 7, 7, 7]).

okankop commented 4 years ago

You need to inflate the number of depth channels at the initial convolutional layer kernels.
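
A minimal sketch of what that inflation could look like, assuming the checkpoint stores the first layer under module.conv1.weight as in the traceback above (file names are placeholders; adjust to your own paths):

```python
import torch

# Hypothetical file names; adjust to your own setup.
ckpt = torch.load("egogesture_resnext_101_Depth_32.pth", map_location="cpu")
state_dict = ckpt["state_dict"]

# Depth checkpoint: conv1 weight is [64, 1, 7, 7, 7]; the RGB model expects [64, 3, 7, 7, 7].
w = state_dict["module.conv1.weight"]
if w.size(1) == 1:
    # Repeat the single depth channel across the 3 RGB channels and rescale
    # so the summed response keeps roughly the same magnitude.
    state_dict["module.conv1.weight"] = w.repeat(1, 3, 1, 1, 1) / 3.0

torch.save(ckpt, "egogesture_resnext_101_Depth_32_inflated.pth")
```

The inflated checkpoint can then be passed to --pretrain_path for an RGB run.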

11WUYU commented 4 years ago

> You need to inflate the number of depth channels at the initial convolutional layer kernels.

During training, I changed "if not opt.no_train" in the main function to "if opt.no_train", and changed the value of no_train in opt.py to true. I don't know whether these changes are correct.
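
For reference, a simplified sketch of how these flags are usually wired (not the exact repo code):

```python
import argparse

# Simplified sketch: store_true flags that default to False.
parser = argparse.ArgumentParser()
parser.add_argument('--no_train', action='store_true', help='if set, skip the training loop')
parser.add_argument('--no_val', action='store_true', help='if set, skip the validation loop')
opt = parser.parse_args([])  # nothing passed -> both flags stay False

if not opt.no_train:
    print('training would run here')    # train_epoch(...) in main.py
if not opt.no_val:
    print('validation would run here')  # val_epoch(...) in main.py
```

With flags wired like this, omitting --no_train and --no_val from the launch script is the non-invasive alternative to editing main.py.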

ahmetgunduz commented 4 years ago

@11WUYU, you can specify the pretrained model's modality with the --pretrain_modality parameter:

#!/bin/bash
python3 main.py \
--root_path ~/ \
--video_path /dataset/EGO \
--annotation_path /Real-time-GesRec/annotation_EgoGestur/egogestureall_but_None.json \
--result_path Real-time-GesRec/results \
--pretrain_path /Real-time-GesRec/models/egogesture_resnext_101_Depth_32.pth \
--dataset egogesture \
--sample_duration 32 \
--learning_rate 0.01 \
--model resnext \
--model_depth 101 \
--resnet_shortcut B \
--batch_size 1 \
--n_classes 83 \
--n_finetune_classes 83 \
--n_threads 16 \
--checkpoint 1 \
--modality RGB \
--pretrain_modality Depth \
--train_crop random \
--n_val_samples 1 \
--test_subset test \
--n_epochs 100 \
--no_train \
--no_val \
--test

11WUYU commented 4 years ago

@ahmetgunduz But it said: unrecognized arguments: --pretrain_modality Depth

11WUYU commented 4 years ago

@ahmetgunduz Sorry, I was using your previous code, which has no pretrain_modality argument. I downloaded your current code and found that there is still a problem:

size mismatch for module.layer1.0.conv1.weight: copying a param with shape torch.Size([128, 64, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 1, 1, 1]).
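
To see whether only one layer or the whole network disagrees, a small diagnostic sketch (assuming the checkpoint stores its weights under 'state_dict', as in the earlier traceback; generate_model and opt are the repo's, referenced only in the comment):

```python
import torch

def report_shape_mismatches(model, ckpt_path):
    """Print every parameter whose shape differs between a checkpoint and a model."""
    pretrained = torch.load(ckpt_path, map_location="cpu")["state_dict"]
    current = model.state_dict()
    for name, tensor in pretrained.items():
        if name not in current:
            print(f"not in model: {name}")
        elif current[name].shape != tensor.shape:
            print(f"{name}: checkpoint {tuple(tensor.shape)} vs model {tuple(current[name].shape)}")

# Hypothetical usage, after the model is built:
#   model, _ = generate_model(opt)
#   report_shape_mismatches(model, opt.pretrain_path)
```

If many layers mismatch, it usually means the architecture flags (--model, --model_depth, --resnet_shortcut, --sample_duration) differ from the ones the checkpoint was trained with.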

ahmetgunduz commented 4 years ago

Can you share your bash script? It is probably an issue with the sample duration.

11WUYU commented 4 years ago

@ahmetgunduz
I am now using Jester's RGB model to train EgoGesture's RGB model. Do you think this is correct?

#!/bin/bash

python3 main.py \
--root_path ~/ \
--video_path /dataset/EGO \
--annotation_path Real-time-GesRec-master/annotation_EgoGesture/egogestureall_but_None.json \
--result_path Real-time-GesRec-master/results \
--pretrain_path /Real-time-GesRec-master/models/jester_resnext_101_RGB_32.pth \
--dataset egogesture \
--sample_duration 32 \
--learning_rate 0.01 \
--model resnext \
--model_depth 101 \
--resnet_shortcut B \
--batch_size 1 \
--n_classes 27 \
--n_finetune_classes 83 \
--n_threads 16 \
--checkpoint 1 \
--modality RGB \
--train_crop random \
--n_val_samples 1 \
--test_subset test \
--n_epochs 100 \
--no_train \
--no_val \
--test

  1. The main thing is that I changed "if not opt.no_train" to "if opt.no_train" in main.py, and changed the value of no_train from false to true in opt.py. I'm not sure this is the right thing to do, but it gets the training running.

  2. I don't understand the difference between using Jester's RGB model to train the RGB model of EgoGesture and using EgoGesture's depth model to train it. Which is better?

  3. I downloaded all the models you provided. There is an egogestrure resnext 1.0x RGB 32 checkpoint.pth file; is this the RGB classification model for EgoGesture?

ahmetgunduz commented 4 years ago
  1. That should be OK.
  2. You can use both, but the EgoGesture-pretrained one will converge a little bit faster.
  3. Yes.