zhoubenjia / MotionRGBD-PAMI

MIT License

Apply the repo on new dataset #3

Closed · SadeghRahmaniB closed 7 months ago

SadeghRahmaniB commented 8 months ago

Hi,

Thank you for sharing this amazing repo. I would like to reuse it on a new action recognition dataset, and I was wondering how I can apply it to the new dataset. I have regular RGB frames as well as depth frames for the videos.

Could you please guide me through this process, and also explain what the splits directory (`--splits`) argument is?

Best,

zhoubenjia commented 8 months ago

Hello, thanks for your attention to our work. If you want to train on your own dataset, you only need to modify the following parts:

1. **Data preprocessing:** Put all the data under a folder named `my_dataset`. Under `my_dataset`, there should be three subfolders: `dataset_splits/`, `rgb/`, and `depth/`. `dataset_splits/` contains the `train.txt` and `valid.txt` files, each with N rows and 3 columns (video name, total number of frames, label); for reference, see https://github.com/zhoubenjia/MotionRGBD-PAMI/tree/main/data/dataset_splits. `rgb/` and `depth/` hold the frames extracted from each video, starting with `000000.jpg`. (See the sketch after this list for one way to generate the split files.)
2. Create a new file named `my_dataset.yaml` in the `config/` folder, which contains some basic hyperparameter settings.
3. *(Optional)* If your dataset needs to be processed in a special way, you can create a `lib/datasets/my_dataset.py` to handle it; this part can refer to `lib/datasets/NTU.py`.
4. An example of training the model with your own dataset:

```bash
python -m torch.distributed.launch --nproc_per_node=1 --master_port=1234 --use_env train.py \
    --config config/my_dataset.yaml \
    --data /mnt/data/bjzhou/codes/MotionRGBD-PAMI/my_dataset \
    --splits /mnt/data/bjzhou/codes/MotionRGBD-PAMI/my_dataset/dataset_splits \
    --batch-size 2 --sample-duration 32 \
    --opt sgd --lr 0.01 --sched cosine --num-classes 2
```
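For illustration, here is a minimal sketch of a helper that generates the split files. It assumes the frames are already extracted under `my_dataset/rgb/<video_name>/` and that you have your own mapping from video name to integer label; the space-separated three-column format is an assumption here, so check an existing file under `data/dataset_splits` for the exact separator.

```python
# Hypothetical helper for generating dataset_splits/train.txt and valid.txt.
# Assumes frames live under my_dataset/rgb/<video_name>/000000.jpg, 000001.jpg, ...
# and that `labels` maps each video folder name to its integer class label.
import os

def write_split(split_file: str, rgb_root: str, labels: dict) -> None:
    """Write one 'video_name frame_count label' row per video."""
    with open(split_file, "w") as f:
        for video_name in sorted(os.listdir(rgb_root)):
            frame_dir = os.path.join(rgb_root, video_name)
            n_frames = len([p for p in os.listdir(frame_dir) if p.endswith(".jpg")])
            f.write(f"{video_name} {n_frames} {labels[video_name]}\n")

# Example (hypothetical labels):
# labels = {"clip_0001": 0, "clip_0002": 1}
# write_split("my_dataset/dataset_splits/train.txt", "my_dataset/rgb", labels)
```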

Hope it helps

SadeghRahmaniB commented 7 months ago

Thank you for your thorough answer. I managed to apply my dataset to your work, and everything seems alright.

However, I am confused about a couple of the scripts, and I would appreciate it if you could provide more information about them. My questions are as follows:

1. What is the difference between fusion.py and train_fusion.py?
2. Please correct me if I am wrong: the fusion model should be the model that fuses both the RGB and depth pre-trained models. So why do those fusion scripts still take the data_type parameter (M or K)?

Thank you.

zhoubenjia commented 7 months ago

I apologize for any confusion that may have arisen. To clarify, fusion.py implements the simple score-fusion method, while train_fusion.py offers the trainable fusion approach, leveraging the CFCer module. When fusing RGB-D features with train_fusion.py, you only need to set the data_type parameter to 'M'; the depth data will then automatically be synchronized with the RGB sample paths. Hope this helps!
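For intuition, here is a minimal sketch of what score-level fusion means: each pre-trained single-modality model produces class scores for the same clip, and the two probability vectors are averaged before taking the argmax. This is a simple illustration of the idea, not the exact code in fusion.py.

```python
# Minimal sketch of score-level fusion (the general idea, not fusion.py verbatim).
import torch
import torch.nn.functional as F

def score_fusion(rgb_logits: torch.Tensor, depth_logits: torch.Tensor,
                 rgb_weight: float = 0.5) -> torch.Tensor:
    """Average the class probabilities of the RGB and depth streams, then predict."""
    rgb_probs = F.softmax(rgb_logits, dim=1)
    depth_probs = F.softmax(depth_logits, dim=1)
    fused = rgb_weight * rgb_probs + (1.0 - rgb_weight) * depth_probs
    return fused.argmax(dim=1)

# Example: a batch of 4 clips with 2 classes (matching --num-classes 2 above).
rgb_logits = torch.randn(4, 2)
depth_logits = torch.randn(4, 2)
print(score_fusion(rgb_logits, depth_logits))
```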

SadeghRahmaniB commented 7 months ago

Thanks again. Now it all makes sense.

I am going to close this issue as complete and open another one for another question of mine.