Baseline, hand-crafted features shot classifier

tyiannak / multimodal_movie_analysis

A Python Library for Multimodal Analysis of Movies and Content-based Movie Recommendation


Baseline, hand-crafted features shot classifier #24

Closed tyiannak closed 3 years ago

tyiannak commented 3 years ago

Description:

Create a small (toy) dataset of shots (e.g. run shot_generator on the evaluation dataset).
Select 3-4 classes of shots and annotate the videos with these classes by splitting them into separate folders, one per class (try to keep the dataset relatively balanced).
Get the video-level features from analyze_video: for each folder (class) of video files, run dir_process_video() and use features_all as that class's feature matrix.
Form the X and y matrices from the above step, run an sklearn fit (e.g. SVMs), and evaluate.
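
As a rough illustration of the last step, the sketch below stacks one feature matrix per class into X, builds the matching label vector y, and cross-validates an SVM. The helper `build_xy` and the synthetic matrices are assumptions made for the example; in the real workflow each matrix would be the feature matrix returned for one class folder.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_xy(class_features):
    """Stack per-class feature matrices into X and an integer label vector y.

    class_features maps a class name to an (n_videos, n_features) array.
    build_xy is a hypothetical helper, not part of the library.
    """
    class_names = sorted(class_features)
    X = np.vstack([class_features[name] for name in class_names])
    y = np.concatenate([
        np.full(len(class_features[name]), label)
        for label, name in enumerate(class_names)
    ])
    return X, y, class_names


# Synthetic stand-ins for the per-class feature matrices (assumption:
# in practice there would be one matrix per class folder).
rng = np.random.default_rng(0)
class_features = {
    "static": rng.normal(0.0, 1.0, size=(20, 8)),
    "non_static": rng.normal(3.0, 1.0, size=(20, 8)),
}

X, y, class_names = build_xy(class_features)

# Scale the features, then evaluate an RBF SVM with 5-fold cross-validation.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = cross_val_score(model, X, y, cv=5)
print(class_names, X.shape, y.shape, scores.mean())
```

Keeping the scaler inside the pipeline means it is refit on each training fold, so the held-out fold does not leak into the normalization.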

apoman38 commented 3 years ago

Steps to create the dataset: 1) Download the shots and extract them. 2) Create 2 folders (static, non_static). 3) Put all shots into both folders (2 copies, one per folder). 4) Run the script create_dataset.py.

PS: The script creates 2 CSV files. The first contains the names of the static shots that have a confidence of 100 and have been annotated by at least 2 annotators. The second contains the names of the non-static shots that meet the same two conditions.
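
The filtering rule described here could look roughly like the sketch below. The row layout (shot name, annotator, label) and the definition of confidence as the percentage of annotators agreeing with the majority label are assumptions for illustration; the actual annotation format used by create_dataset.py is not shown in this thread.

```python
from collections import defaultdict


def aggregate_annotations(rows, min_annotators=2, min_confidence=100.0):
    """Return {shot_name: label} for shots that pass both filters.

    rows is an iterable of (shot_name, annotator, label) tuples.
    Confidence is computed as the percentage of annotators that agree
    with the majority label (an assumed definition).
    """
    votes = defaultdict(dict)
    for shot, annotator, label in rows:
        votes[shot][annotator] = label  # one vote per annotator per shot

    aggregated = {}
    for shot, by_annotator in votes.items():
        labels = list(by_annotator.values())
        if len(labels) < min_annotators:
            continue  # too few annotators
        winner = max(set(labels), key=labels.count)
        confidence = 100.0 * labels.count(winner) / len(labels)
        if confidence >= min_confidence:
            aggregated[shot] = winner
    return aggregated


rows = [
    ("shot_001.mp4", "ann_a", "static"),
    ("shot_001.mp4", "ann_b", "static"),
    ("shot_002.mp4", "ann_a", "static"),      # only one annotator: dropped
    ("shot_003.mp4", "ann_a", "non_static"),
    ("shot_003.mp4", "ann_b", "static"),      # disagreement: dropped
    ("shot_004.mp4", "ann_a", "non_static"),
    ("shot_004.mp4", "ann_b", "non_static"),
]
print(aggregate_annotations(rows))
```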

tyiannak commented 3 years ago

It is not practical to ask the user of create_dataset to download the data and duplicate it into 2 folders, only for the script to then delete the non-matching files from each folder. Instead, create_dataset.py should:

Read the annotations and get the aggregated annotations (the aggregation from issue 20; this must be shared code, not duplicated).
Read the data path (the path where the videos are stored).
Split the data into folders (in a given output folder) based on the aggregated annotations.

e.g. create_dataset('annotations.csv', '/home/ubuntu/data/all_videos', '/home/ubuntu/data/classes')

The result should be a set of class folders in '/home/ubuntu/data/classes' containing the corresponding video files, e.g. /home/ubuntu/data/classes/static, /home/ubuntu/data/classes/travelling_in, etc.

Then you can copy the contents from particular folders to run your experiment.
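
A minimal sketch of such a create_dataset function follows, under stated assumptions: `load_aggregated` is a hypothetical stand-in for the shared aggregation code, and the CSV layout (columns `shot` and `label`, one aggregated label per shot) is invented for the example. Files are copied rather than moved, so the source folder stays intact.

```python
import csv
import shutil
import tempfile
from pathlib import Path


def load_aggregated(annotations_csv):
    """Hypothetical stand-in for the shared aggregation step.

    Assumes a CSV with a header row and columns 'shot' and 'label',
    holding one already-aggregated label per shot.
    """
    with open(annotations_csv, newline="") as f:
        return {row["shot"]: row["label"] for row in csv.DictReader(f)}


def create_dataset(annotations_csv, data_path, output_path):
    """Copy each annotated video into output_path/<label>/."""
    data_path, output_path = Path(data_path), Path(output_path)
    copied = {}
    for shot, label in load_aggregated(annotations_csv).items():
        src = data_path / shot
        if not src.is_file():
            continue  # annotated but missing on disk: skip
        class_dir = output_path / label
        class_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, class_dir / shot)
        copied.setdefault(label, []).append(shot)
    return copied


# Small demo in a temporary directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    videos = root / "all_videos"
    videos.mkdir()
    for name in ("a.mp4", "b.mp4", "c.mp4"):
        (videos / name).touch()
    annotations = root / "annotations.csv"
    annotations.write_text(
        "shot,label\na.mp4,static\nb.mp4,travelling_in\nc.mp4,static\n"
    )
    result = create_dataset(annotations, videos, root / "classes")
    created = sorted(p.name for p in (root / "classes").iterdir())
    print(result, created)
```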

tyiannak commented 3 years ago

@apoman38 check out the latest commits, which fix the bugs in loading/saving the npy feature files.

Please add the following before proceeding to PR:

More classifiers (add AdaBoost, ExtraTrees, RandomForest).
Add classifier params (e.g. C and kernel types in SVM; rbf must be included as well!).
Plot the overall confusion matrix for each validation and training result.
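
One way these three requests might fit together with scikit-learn is sketched below: a small parameter grid per classifier (including both linear and rbf kernels for the SVM), a grid search, and training and validation confusion matrices for the tuned model. The grids and the synthetic data are illustrative assumptions, not the values used in train.py.

```python
import numpy as np
from sklearn.ensemble import (
    AdaBoostClassifier,
    ExtraTreesClassifier,
    RandomForestClassifier,
)
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, cross_val_predict
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Estimator and parameter grid per classifier (illustrative values only).
candidates = {
    "svm": (
        Pipeline([("scale", StandardScaler()), ("clf", SVC())]),
        {"clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]},
    ),
    "adaboost": (AdaBoostClassifier(random_state=0), {"n_estimators": [25, 50]}),
    "extratrees": (ExtraTreesClassifier(random_state=0), {"n_estimators": [50, 100]}),
    "randomforest": (RandomForestClassifier(random_state=0), {"n_estimators": [50, 100]}),
}

# Synthetic two-class data standing in for the shot features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (30, 6)), rng.normal(2.0, 1.0, (30, 6))])
y = np.array([0] * 30 + [1] * 30)

results = {}
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, cv=3)
    search.fit(X, y)
    # Out-of-fold predictions of the tuned model give the validation matrix.
    y_val = cross_val_predict(search.best_estimator_, X, y, cv=3)
    results[name] = {
        "best_params": search.best_params_,
        "train_cm": confusion_matrix(y, search.best_estimator_.predict(X)),
        "val_cm": confusion_matrix(y, y_val),
    }
    print(name, search.best_params_)
    print(results[name]["val_cm"])
```

Because the parameters are tuned on the same data that produces the validation matrix, the validation figures are somewhat optimistic; a held-out test split or nested cross-validation would give a cleaner estimate.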

apoman38 commented 3 years ago

Three more classifiers (AdaBoost, ExtraTrees, RandomForest) were added to train.py. Hyperparameter tuning was performed for each algorithm separately to find its best-performing parameters. In addition, the confusion matrix of each algorithm is saved as a separate image.
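
For readers who want to reproduce the last point, saving a confusion matrix as an image could be done along these lines with scikit-learn and matplotlib. The helper `save_confusion_matrix`, the file name, and the toy labels are assumptions for the example and are not taken from train.py.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend: write files without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix


def save_confusion_matrix(y_true, y_pred, class_names, out_path):
    """Compute the confusion matrix, save it as an image, and return it."""
    cm = confusion_matrix(y_true, y_pred, labels=list(range(len(class_names))))
    fig, ax = plt.subplots(figsize=(4, 4))
    ConfusionMatrixDisplay(cm, display_labels=class_names).plot(ax=ax, colorbar=False)
    ax.set_title(os.path.basename(out_path))
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)  # free the figure when saving many matrices in a loop
    return cm


# Toy labels: rows of the matrix are true classes, columns are predictions.
y_true = np.array([0, 0, 0, 1, 1, 1])
y_pred = np.array([0, 0, 1, 1, 1, 0])
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "svm_confusion_matrix.png")
cm = save_confusion_matrix(y_true, y_pred, ["non_static", "static"], out_path)
print(cm)
```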