Open eswar159 opened 7 months ago
I'm new to working with videos. I came across your repo and I'm really interested in working with it :)
I have a few questions (they may be very basic, but I'm really confused by this part).

First, I downloaded the full data file (dataa.zip). I see you used 2 datasets; I'm mainly planning to work with CMU-MOSEI, so I'm concentrating on it. It has 3 main folders:

- i) MOSEI_SPLIT: this folder has 3 txt files that contain just the file names / ids (which is 100% clear).
- ii) MOSEI_HCF_FEATURES: this also has 3 files, which are pkl files. I was able to extract them, and they have (vision, audio, text, label and id):
  - id: just the name of the video / file.
  - label: everywhere I look, I see only 6 labels mentioned, but when I count the unique entries in this field I get Train: 27, Test: 23, Valid: 25. What do these numbers correspond to? My major confusion is how 6 labels are mapped to this many values.
  - text: each string has been made into 50 x 300. Exactly which preprocessing method was used here?
  - audio: each audio clip has been made into 500 x 74. Exactly which preprocessing method was used here?
  - vision: each image has been made into 500 x 35. Exactly which preprocessing method was used here? Also, each short video has multiple images; how is this handled, and which images are used exactly?
- iii) MOSEI_RAW_PROCESSED: this folder is clear, as it just has the videos and short videos, broken into audio files and images.

My main question: if I want to replicate your work, can I use only the MOSEI_HCF_FEATURES folder (since it has train, test and validation)?
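For anyone who wants to repeat the inspection described above, here is a minimal sketch. It assumes each pkl file unpickles to a dict with the keys `vision`, `audio`, `text`, `label` and `id`, and that `label` is a flat sequence of numbers; the real layout may differ. The file name `train.pkl`, the helper names, and the stand-in data are hypothetical and only there to make the example runnable. As usual, only unpickle files from a source you trust.

```python
import pickle
import tempfile
from collections import Counter
from pathlib import Path


def shape_of(x):
    """Return the shape of a NumPy-like array or of a nested list."""
    if hasattr(x, "shape"):
        return tuple(x.shape)
    dims = []
    while isinstance(x, (list, tuple)):
        dims.append(len(x))
        if not x:
            break
        x = x[0]
    return tuple(dims)


def summarize_split(path):
    """Load one split and report feature shapes and label value counts."""
    with open(path, "rb") as f:
        data = pickle.load(f)
    # Assumed keys; adjust if the real files are organized differently.
    shapes = {k: shape_of(data[k]) for k in ("text", "audio", "vision")}
    label_counts = Counter(float(v) for v in data["label"])
    return shapes, label_counts


# Stand-in data in the assumed layout: 2 samples, all-zero features,
# placeholder label values (not real CMU-MOSEI labels).
n = 2
fake = {
    "id": ["sample_a", "sample_b"],
    "label": [1.0, 1.0],
    "text": [[[0.0] * 300 for _ in range(50)] for _ in range(n)],
    "audio": [[[0.0] * 74 for _ in range(500)] for _ in range(n)],
    "vision": [[[0.0] * 35 for _ in range(500)] for _ in range(n)],
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "train.pkl"  # hypothetical file name
    with open(path, "wb") as f:
        pickle.dump(fake, f)
    shapes, label_counts = summarize_split(path)

print(shapes)
print(len(label_counts), "unique label value(s):", dict(label_counts))
```

Running `summarize_split` on each of the three real split files and comparing `len(label_counts)` is one way to reproduce the unique-entry counts quoted in the question.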
Hello, I would like to train this model on my own dataset. But I am not sure about the details of the dataset format. May I ask if you can provide me with a small example?
First of all, thank you very much for your interest in our study. For your five questions, we have the following answers: