userLx888 opened this issue 1 year ago
Which pkl files?
datasets/stage_two/swin_frame_pred_output/statistic_train_data.pkl in train_net.py
This is not a standard dataset used for the benchmark; if you need it, I will update it later.
Thanks a lot.
@liaorongfan @userLx888 Can you please describe how to use the pkl files? How do we load the annotations? Thank you.
OK. The pkl data will be updated within about a month; paper-revision experiments are currently underway.
When testing, there is a label (5 traits) for each image. The predictions of all images (frames) in a video, together with their corresponding labels, yield the ACC and MSE figures.
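A minimal sketch of that evaluation, assuming a hypothetical pkl layout of {video_id: {"preds": ..., "labels": ...}} and taking ACC as 1 - mean absolute error (the usual ChaLearn convention); this is not the repo's exact code:

```python
import pickle
import numpy as np

TRAITS = ["O", "C", "E", "A", "N"]  # the five OCEAN traits

# Hypothetical layout: {video_id: {"preds": (n_frames, 5), "labels": (5,)}}
with open("statistic_train_data.pkl", "rb") as f:
    data = pickle.load(f)

video_preds, video_labels = [], []
for video_id, item in data.items():
    frame_preds = np.asarray(item["preds"])       # one 5-trait prediction per frame
    video_preds.append(frame_preds.mean(axis=0))  # average frames -> video-level score
    video_labels.append(np.asarray(item["labels"]))

preds = np.stack(video_preds)    # (n_videos, 5)
labels = np.stack(video_labels)  # (n_videos, 5)

mse = ((preds - labels) ** 2).mean(axis=0)       # per-trait MSE
acc = 1.0 - np.abs(preds - labels).mean(axis=0)  # per-trait "accuracy" = 1 - MAE

for trait, m, a in zip(TRAITS, mse, acc):
    print(f"{trait}: MSE={m:.4f}  ACC={a:.4f}")
```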
On 2023-03-22, oayoub.dev wrote:
@liaorongfan Hi, can you please clarify this for me?
features: images; labels: [0.6333333 0.5145631 0.6168224 0.51648355 0.4375] # OCEAN
You are giving that to a CNN model (right?), and after training and validation you get the final score [MSE, ACC] (right?)
Can you please tell me how you calculate the MSE for each trait (OCEAN)? Thank you
May I ask about the animal, ghost, lego, and talk_session categories in true_personality? What are the differences and connections between the four session categories?
For personality detection, I think there is indeed no difference.
I have just started in this field and may not be familiar with many things. This code is very good and comprehensive, and I think it can be of great help to me. However, I still have many questions; could I get your WeChat so we can communicate directly? Thank you very much!
I am very glad to know that you are interested in this repo. It is currently under active development and will be upgraded in the near future, so I think it's OK for us to discuss things here.
OK, thanks a lot. Due to the large amount of code, I have had difficulty extracting what I need. May I ask which parts of the code I would need to extract as a basis for adding my own things and completing a full project?
How about running an example experiment and using debug mode to follow the procedure, and then extracting the related modules? The repo is roughly organized as build-from-config, and the default config file may help you understand the code's running process.
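To illustrate, here is a toy sketch of the build-from-config pattern, with hypothetical registry and config names rather than the actual DeepPersonality API:

```python
import yaml

MODEL_REGISTRY = {}

def register_model(name):
    # Decorator that maps a config string onto a model class.
    def wrap(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return wrap

@register_model("crnet")
class CRNet:
    def __init__(self, num_traits=5):
        self.num_traits = num_traits

def build_model(cfg):
    # Look up the class named in the config and instantiate it with its args.
    model_cfg = cfg["MODEL"]
    return MODEL_REGISTRY[model_cfg["NAME"]](**model_cfg.get("ARGS", {}))

cfg = yaml.safe_load("""
MODEL:
  NAME: crnet
  ARGS:
    num_traits: 5
""")
model = build_model(cfg)  # a CRNet instance built purely from the config
```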
OK! I get it! Thanks a lot.
Hello, I am reproducing 15_multi_modal_pred.yaml. May I ask what the input for this method is? Is the pkl input file the one that comes with the ChaLearn2017 dataset, or do we need to generate it ourselves? Thank you!
Hello @userLx888,
> do we need to generate it ourselves
Yes, we need to generate it ourselves; the code for that can be found in that script.
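For reference, a generic sketch of dumping per-video predictions and labels into such a pkl, with an entirely hypothetical record structure (the format expected by the repo's script may differ):

```python
import pickle
import numpy as np

# Hypothetical record structure: frame-level predictions plus the 5-trait label.
records = {
    "video_0001": {
        "preds": np.zeros((32, 5), dtype=np.float32),  # placeholder frame predictions
        "labels": np.array([0.63, 0.51, 0.62, 0.52, 0.44], dtype=np.float32),
    },
}

with open("statistic_train_data.pkl", "wb") as f:
    pickle.dump(records, f)
```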
Hello, I am replicating 04_crnet.yaml in config. May I ask if the order warnings of the optimizer and scheduler can be ignored? Also, is the DeepPersonality-main/dpcv/exps_first_stage/04_cr_audiovisual_network.py file unused in this project? It seems that I didn't use it when I was debugging.
Hi,
Yes, the order warnings of the optimizer and scheduler can simply be ignored.
As for DeepPersonality-main/dpcv/exps_first_stage/04_cr_audiovisual_network.py: I use a config file to set up the training, so you may have a try at "python tools/run_exp.py -c config/xxx/xxx_crnet.yaml".
I am trying to understand the text-transcription modality used in the CR-Net network. Can you provide me with the source code of the CR-Net paper? Thank you!
Hi, I am sorry to tell you that this benchmark doesn't involve the text modality for personality recognition, since it focuses on audio-visual cues.
May I also ask whether the network used for face extraction in the experiments is MTCNN or something else?
Hello, I would like to ask how the extraction of 32 frames from each input video is implemented in the CR-Net reproduction code. I am sorry, but I cannot see where 32 frames are extracted as input.
Hi @userLx888, glad to know that you are using the code. From my understanding, the visual model processes one image at a time instead of processing all 32 frames at once. As you can see from the code, one video is downsampled into 32 frames and then a single frame is selected as model input. However, if you want to take 32 frames as input at one time, you can use the batch dimension to organize the input in the shape (32, 3, 224, 224). In my view, though, the 32 images would then be computed in parallel, so the temporal information among them would not be captured.
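A minimal sketch of that segment-based sampling, with hypothetical helper names rather than the repo's exact code: the frame indices are split into 32 segments, one random index is drawn from each, and a single frame is then picked for the model.

```python
import random

def sample_frames(num_frames: int, num_segments: int = 32):
    # Split the frame indices into num_segments chunks and draw one
    # random index from each chunk (a sketch, not the repo's exact code).
    seg_len = num_frames / num_segments
    return [
        random.randrange(int(i * seg_len),
                         max(int((i + 1) * seg_len), int(i * seg_len) + 1))
        for i in range(num_segments)
    ]

indices = sample_frames(num_frames=300)  # 32 frame indices across the video
frame_idx = random.choice(indices)       # the single frame fed to the visual model
```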
Thank you! Is the downsampling of one video into 32 frames controlled through the sample_size variable? The sample_size in the source code is set to 100; should I change it to 32 here?
yes, I think so.
Hello, I always run into overfitting when reproducing the CR-Net program. Do you have any good methods to solve it?
Hello, is the subsequent ETR regression stage not replicated in the CR-Net network?
Thank you!
@userLx888 Hi, for overfitting, a drop mechanism can generally be used. Here is one for your reference: https://arxiv.org/pdf/2004.04725.pdf
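As a simple illustration (standard dropout, which may differ from the specific drop mechanism in the referenced paper), a PyTorch regression head could include a dropout layer like this:

```python
import torch.nn as nn

# A toy regression head with dropout for regularization; not CR-Net itself.
head = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # randomly zeroes activations during training
    nn.Linear(256, 5),   # five OCEAN trait scores
    nn.Sigmoid(),        # ChaLearn trait labels are normalized to [0, 1]
)
```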
OK, thank you.
Hello, I would like to delve deeper into some of the experimental details of CR-Net. Do you have the source code for that article? I cannot contact the author. If you can provide assistance, thank you very much.
@userLx888 Sorry, I don't have access to the source code from the author.
When I train CR-Net, the loss never decreases. Have you ever encountered such problems, or have you found any good solutions?
I used the face cropping and alignment script provided by the code to extract face frames from the videos, but the performance is poor and many frames cannot be extracted. I would like to ask if there is any face data that has already been extracted so that I can download it directly. The dataset is ChaLearn2016/2017. Thank you.
This is the problematic result.
Can you tell me which video the images belong to? It seems not to be the case for me, but please let me check. I don't know where to put the data; they are about 80 GB, I guess.
I manually deleted a lot of them earlier; many were problematic, for example 4lIbWq27O84.005 in the validation set. Could the dataset be uploaded via Google Drive or Baidu Netdisk? One more question: my training loss never decreases, but the accuracy is indeed improving little by little. Have you ever encountered this problem?
This code from CR-Net's data selection reflects how 32 frames are extracted from a video. Currently, it seems that a video is divided into 32 segments and only one random frame is extracted and sent to the network, without achieving the goal of extracting 32 frames.
@userLx888 As for the dataset, I'm considering uploading it to Google Drive for the convenience of researchers.
As for this piece of code, I think we've talked about it before; please refer to this message.
Okay, thank you for your answer. If I don't fix this number at 32 and instead randomly select one frame from the entire video, will the effect be the same? Does this 32 mean anything here?
Are the datasets still being updated? Do we need to prepare the pkl files of some datasets ourselves? Thank you.