scutcsq / DWFormer

DWFormer: Dynamic Window Transformer for Speech Emotion Recognition (ICASSP 2023 Oral)

Meld #6

Closed MF-XU closed 7 months ago

MF-XU commented 8 months ago

Dear author, your article and code are very helpful to me, and I will cite your article in my paper. Could you please upload the data processing and training code for the MELD dataset? I can't find it on your GitHub, and I would be very grateful if you could share it. Perhaps you could also leave your contact information; I look forward to communicating with you.

scutcsq commented 7 months ago

Thank you for your attention to our work. I have uploaded the data processing and training code for the MELD dataset. I hope it helps.

MF-XU commented 7 months ago

Dear author, thank you for your reply; it is very helpful to me. I have been trying to process the MELD dataset recently. Based on my earlier experience with the IEMOCAP dataset, I need to pass the original MELD audio files through the WavLM pre-trained model. However, I ran into a problem: even with a single audio file, 'CUDA out of memory' occurred on a 24 GB GPU, because one video file in the MELD dataset is about 7000 KB. I tried multi-GPU parallel processing, but data_preprocess.py differs from the multi-GPU approaches I have used elsewhere, and I have not been able to solve this yet. Do you have any suggestions? Perhaps you could upload the 'test' features obtained after processing 'output_repeated_splits_test' with WavLM. How did you handle this, and how much GPU memory did you use? If possible, I would like your contact information so I can ask for more advice.

scutcsq commented 7 months ago

I guess that is caused by the mp4 format. You can install ffmpeg and run "ffmpeg -i xxx.mp4 -ac 1 -ar 16000 xxx.wav" on the command line to convert the mp4 files to wav format. Python's subprocess library allows for batch processing.
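
For reference, a minimal batch-conversion sketch along those lines, using Python's subprocess module; the directory names are placeholders and not part of the repository.

```python
import subprocess
from pathlib import Path

# Convert every .mp4 clip to 16 kHz mono WAV, mirroring the ffmpeg command above.
# src_dir / dst_dir are placeholder paths; point them at your local MELD copy.
src_dir = Path("MELD/output_repeated_splits_test")
dst_dir = Path("MELD/test_wav")
dst_dir.mkdir(parents=True, exist_ok=True)

for mp4 in sorted(src_dir.glob("*.mp4")):
    wav = dst_dir / (mp4.stem + ".wav")
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(mp4), "-ac", "1", "-ar", "16000", str(wav)],
        check=True,
    )
```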

MF-XU commented 7 months ago

Dear author, hello! Following your reply yesterday, I have successfully processed the MELD dataset with WavLM; thank you very much. Today I used the MELD dataset processing file (Meld\Dataset.py) that you uploaded earlier and got the following error: 'ValueError: could not broadcast input array from shape (283,225) into shape (1024,225)', where data1 has shape (283, 1024) and newdata1 has shape (1024, 225). I then modified the code so that the program runs: I directly slice any tensor longer than 225 frames. Is that OK? Do you have a better solution?

scutcsq commented 7 months ago

Hi, I have modified the code in Dataset.py. The problem was that the feature dimensions were permuted in the wrong order, so the time and feature axes were mismatched.

MF-XU commented 7 months ago

Dear author, thank you for your earlier reply. I am trying the code you modified, but during debugging it still raises the previous error. I believe the error comes from sequences whose length prevents them from being split and assigned correctly; previously the tensor was (283, 1024), and now the error is that a (146, 1024) tensor needs to be padded to 225 frames, so there seems to be a problem with this judgment statement. I handled it the same way as for the IEMOCAP dataset, but it still does not work. Do you have a good solution? I also tried training on the features I extracted earlier, but there was a problem because the training lengths differ. Looking forward to your reply!

MF-XU commented 7 months ago

Dear author, I debugged your newly uploaded Dataset.py and found an error in it, shown below. Thank you for your help all along. I have another question: when I re-downloaded the original MELD dataset, I found that one file in it is corrupted. In the "SpeechFormer" paper the authors simply discard this file, so I discarded it as well. I wish you a happy New Year and good health!

scutcsq commented 7 months ago

Oh, I will fix the corresponding code. Thank you very much for the correction.

desertshamo commented 7 months ago

Dear authors and predecessors, how did the code change in the end?

desertshamo commented 7 months ago

Dear predecessors, how did the code finally change to solve this problem? Can you point out the changes to the code?

scutcsq commented 7 months ago

Hello, I have fixed the code. There was an error on line 70: just change "lens = data1.shape[1]" to "lens = data1.shape[0]".
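
For readers hitting the same broadcast error, the intent of that block is to zero-pad each WavLM feature matrix along the time axis to a fixed 225 frames and truncate longer ones. A rough sketch of that behaviour, with shapes taken from the thread rather than copied from Dataset.py:

```python
import numpy as np

MAX_LEN = 225  # fixed number of frames used in the thread

def pad_or_truncate(data1: np.ndarray):
    """data1: WavLM features of shape (time, 1024), e.g. (283, 1024) or (146, 1024)."""
    lens = data1.shape[0]                      # time axis (the corrected line 70)
    feat_dim = data1.shape[1]                  # 1024
    newdata1 = np.zeros((feat_dim, MAX_LEN), dtype=data1.dtype)
    mask = np.zeros(MAX_LEN, dtype=np.float32)
    if lens >= MAX_LEN:
        newdata1[:] = data1[:MAX_LEN].T        # truncate long utterances to 225 frames
        mask[:] = 1.0
    else:
        newdata1[:, :lens] = data1.T           # zero-pad short utterances
        mask[:lens] = 1.0
    return newdata1, mask
```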

desertshamo commented 7 months ago

Following the method you mentioned, I obtained three groups of .mdb files for MELD and ten groups for IEMOCAP, each file about 2 GB. I still have two questions. First, while training on the IEMOCAP dataset, I noticed that the loop "for step, (datas,labels,mask) in enumerate(trainDataset, 0):" is very slow, taking about 3 seconds per iteration. Is the data used for training the .mdb file? Second, when training on the MELD dataset, the problem shown in the second image appeared. Did you encounter similar problems? I want to confirm that it is not a problem with my environment configuration. Thank you very much; I look forward to hearing from you.

scutcsq commented 7 months ago

Hello, when I train on the IEMOCAP dataset it takes about 35 seconds per epoch using the .mdb file. Do you load the file with lmdb? As for the second picture, I cannot see the image; could you please send it again?
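
For anyone comparing setups, here is a minimal sketch of how pre-extracted features are typically read lazily from an LMDB store inside a PyTorch Dataset; the key scheme and the pickle serialization are assumptions for illustration, not necessarily what the repository's Dataset.py does.

```python
import pickle

import lmdb
import numpy as np
from torch.utils.data import Dataset


class LMDBFeatureDataset(Dataset):
    """Reads pre-extracted WavLM features from an LMDB file record by record."""

    def __init__(self, lmdb_path: str, num_samples: int):
        # readonly + lock=False lets several DataLoader workers share the environment
        self.env = lmdb.open(lmdb_path, readonly=True, lock=False,
                             readahead=False, meminit=False)
        self.num_samples = num_samples

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        with self.env.begin(write=False) as txn:
            raw = txn.get(str(idx).encode())      # assumed key scheme: "0", "1", ...
        feats, label, mask = pickle.loads(raw)    # assumed serialization format
        return np.asarray(feats), label, mask
```

Reading one record per __getitem__ like this avoids loading the whole multi-gigabyte data.mdb into memory before training starts.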

desertshamo commented 7 months ago

I have re-attached the screenshots. I load the data with lmdb, and the data.mdb file of train1 is 5.52 GB; loading it took more than an hour before the training phase started. Is that the case for you?

desertshamo commented 7 months ago

[screenshot re-attached]

scutcsq commented 7 months ago

That is strange. Are you training the model on a GPU machine?

desertshamo commented 7 months ago

Your mention of "GPU" reminded me: I reset the GPU, and training is now very fast. The second problem, however, is that I get "ModuleNotFoundError: No module named 'utils.vanillatransformer'" when training on the MELD dataset, whereas I do not get this error when training on IEMOCAP. My utils setup should be fine, so I wanted to ask whether you have seen the same problem.

scutcsq commented 7 months ago

You can just comment out that line; it is only the comparison model.
