omkar-kumbhar closed this issue 2 years ago
Yeah, I need to redo the CMU-MOSEI processing script because the total number of clips in the label info differs from the total number of clips overall (see below). The clips are also numbered in the transcripts but not in the labels file, so process.py doesn't match them properly. I'll rewrite process.py to use the pre-segmented video clips instead and match these with the appropriate interval from CMU_MOSEI_Labels.csd.
Full videos: 3837
Full videos in CMU_MOSEI_Labels.csd: 3293
Full videos in predefined train/valid/test folds: 2769 (2 of which do not have labels)
Existing video segments: 39627
Segment intervals in transcripts: 44977
Segment intervals in CMU_MOSEI_Labels.csd: 23259
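The matching step described above can be sketched as an overlap search: for each pre-segmented clip, find the labelled interval that overlaps it most. This is only an illustrative sketch, not the actual process.py code; it assumes intervals are (start, end) pairs in seconds, as stored in the labels file.

```python
def best_matching_interval(clip_interval, label_intervals):
    """Return the index of the labelled interval with the largest
    temporal overlap with a pre-segmented clip, or None if nothing
    overlaps. Intervals are (start, end) pairs in seconds."""
    c_start, c_end = clip_interval
    best, best_overlap = None, 0.0
    for i, (l_start, l_end) in enumerate(label_intervals):
        # Overlap length is positive only when the intervals intersect.
        overlap = min(c_end, l_end) - max(c_start, l_start)
        if overlap > best_overlap:
            best, best_overlap = i, overlap
    return best

# A clip spanning 1.0-4.0s matches the 0.8-3.9s labelled interval.
print(best_matching_interval((1.0, 4.0), [(0.0, 0.5), (0.8, 3.9), (4.0, 6.0)]))  # 1
```

Clips with no overlapping labelled interval (the gap between 39627 segments and 23259 label intervals) would return None and be dropped.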
@agkphysics Hello, your work is excellent, but I have a small doubt. Using the modified processing script you provided, slicing the audio according to the transcripts and then labelling the slices with emotion labels, I get a total of 30,174, but the official presentation reports a total of 23,453. I'm a little confused and hope you can help me. Thanks in advance!
@gt950 I think there are some discrepancies between the labels file and their presentation. As I mentioned, there are only 23259 intervals in CMU_MOSEI_Labels.csd. I don't know where the other 200 come from. I'm also not sure why you get 30,174 files with labels.
@agkphysics Thank you very much for your answer! Sorry, it was a mistake on my part: you redesigned process.py around the segment labels in the transcripts, and the new script produces 30174 slices in the label.csv file, but the number of slices with labels in it is still 23259. Also, I found that in the emotion labels of ['features'], several emotions can share the same evaluation value, which causes the argmax() function to return only the index of the first maximum. As a result, the counts of audio slices labelled 'happy', 'sad', 'anger', 'disgust', 'surprise', and 'fear' are 14567, 3783, 2730, 1291, 437, and 452 respectively, which also differs from the official label distribution. Is this due to this issue in CMU_MOSEI_Labels.csd? I wonder if you have noticed it. Thank you again!
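The tie behaviour described above is easy to reproduce: NumPy's argmax returns the index of the first maximum when several emotions share the same score, so ties are silently resolved in favour of whichever emotion comes first. The score vector below is hypothetical, using the same emotion order as the counts above.

```python
import numpy as np

# Hypothetical per-segment emotion scores in the order
# [happy, sad, anger, disgust, surprise, fear]
scores = np.array([2.0, 2.0, 0.0, 0.0, 0.0, 0.0])

# argmax breaks the tie by returning the first maximal index,
# so a happy/sad tie is always counted as "happy".
print(np.argmax(scores))  # 0, i.e. "happy"

# One way to detect ties before committing to a single label:
maxima = np.flatnonzero(scores == scores.max())
print(len(maxima) > 1)  # True: this segment is ambiguous
```

This first-index preference would inflate the counts of emotions that appear earlier in the label order, consistent with the skew reported above.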
@gt950 No, this is deliberate. I wanted a single emotion label for the purpose of multiclass emotion classification. The code should produce two labels files: label_maj.csv for values that are a majority of the total, and label_plu.csv for values that are only a plurality.
Although I just found a bug where it was doing integer division and incorrectly assigning the majority label. Fixed in 78639c6
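For reference, here is a minimal sketch of the majority/plurality distinction and the division pitfall mentioned above. The function name and vote-count representation are illustrative, not the actual process.py code.

```python
def majority_or_plurality(counts):
    """Given a dict of emotion -> votes, return the winning label and
    whether it is a majority (more than half of all votes) or only a
    plurality (most votes, but not more than half)."""
    total = sum(counts.values())
    label = max(counts, key=counts.get)
    top = counts[label]
    # True division matters here: with integer counts, a check like
    # `top >= total // 2` gives 1 >= 3 // 2 == 1 -> True for top=1,
    # total=3, wrongly calling a one-vote winner a majority, whereas
    # 1 >= 1.5 is correctly False.
    kind = "majority" if top > total / 2 else "plurality"
    return label, kind

print(majority_or_plurality({"happy": 2, "sad": 1}))  # ('happy', 'majority')
print(majority_or_plurality({"happy": 2, "sad": 1, "anger": 1, "disgust": 1}))  # ('happy', 'plurality')
```

Under this split, majority winners would go to label_maj.csv and plurality-only winners to label_plu.csv.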
@agkphysics Thank you so much for your reply and help!!
I wanted to get the sentiment labels for the raw transcripts. For example, a transcript file inside CMU-MOSEI at /Raw/Transcript/Segmented/Combined/_0efYOjQYRc.txt looks like this:
If you split each line on '____', you get the video ID and segment ID, which I've assumed to correspond to the intervals. To tag the segments I used the process method, but I'm unsure about its correctness. The sentiment labels in the features column go from -3 to +3 in increments of 0.33. Can you check whether the code snippet I've used would work?
The code I used was from: https://github.com/Strong-AI-Lab/emotion/blob/21c02f1e8ce96796cf3e9281e8aa0461fe3c7479/datasets/CMU-MOSEI/process.py#L41-L70
I modified the process method to get sentiment.
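As a sanity check on that approach, one common way to turn the continuous sentiment score into discrete classes is to round to the nearest integer in [-3, 3], giving seven classes. This is an assumption about the intended binning, not necessarily what the modified process method does.

```python
def sentiment_class(score: float) -> int:
    """Map a continuous CMU-MOSEI sentiment score in [-3, 3] to one of
    seven integer classes -3..3 by rounding to the nearest integer."""
    # Clamp first, in case averaging annotator scores pushes a value
    # slightly outside the nominal range.
    score = max(-3.0, min(3.0, score))
    return int(round(score))
```

Note that Python 3's round() uses banker's rounding (round-half-to-even), so boundary scores that land exactly on .5 may not round the way a naive half-up scheme would; if that matters, an explicit floor(score + 0.5) can be used instead.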