clamsproject / app-swt-detection

CLAMS app for detecting scenes with text from video input
Apache License 2.0

image-based data preprocessing #112

Closed keighrim closed 1 month ago

keighrim commented 4 months ago

New Feature Summary

At the moment, the data preprocessor expects one video file and one CSV file of manual label annotations

https://github.com/clamsproject/app-swt-detection/blob/7be4b818a0c72713e501b27be9ebaeee5a3e1320/modeling/data_loader.py#L249-L254

to prepare CNN vectors and a metadata json file

https://github.com/clamsproject/app-swt-detection/blob/7be4b818a0c72713e501b27be9ebaeee5a3e1320/modeling/data_loader.py#L240-L243

However, we are now receiving additional annotations from GBH that cover more videos but are much sparser. Most importantly, the delivery package does not include the video files, only the extracted frame images.

To cope with this different data situation in the next rounds of training, we need to update the data preprocessor to handle the new batches of annotations.

Additional context

The current train-ready preprocessed data looks like this:

$ ls  feature-extraction/cpb-aacip-f3fa7215348*
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.bn_vgg16.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.bn_vgg19.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.convnext_base.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.convnext_lg.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.convnext_small.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.convnext_tiny.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.densenet121.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.efficientnet_large.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.efficientnet_med.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.efficientnet_small.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.json
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.resnet101.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.resnet152.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.resnet18.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.resnet50.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.vgg16.npy
/llc_data/clams/swt-gbh/feature-extraction/cpb-aacip-f3fa7215348.vgg19.npy

Here's where the preprocessed data is read:

https://github.com/clamsproject/app-swt-detection/blob/7be4b818a0c72713e501b27be9ebaeee5a3e1320/modeling/train.py#L129-L147

And finally, because the annotations in the next batches are sparse, we need to add the new GUIDs to this list

https://github.com/clamsproject/app-swt-detection/blob/7be4b818a0c72713e501b27be9ebaeee5a3e1320/modeling/gridsearch.py#L23-L24

keighrim commented 4 months ago

For the sake of implementation, let's change

 parser.add_argument("-i", "--input-video", 
                     help="filepath for the video to be processed.", 
                     required=True) 

so that it accepts either a single video file name or the name of a directory containing many image files.
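
A minimal sketch of how that dual-purpose argument could be handled, not the actual change in `data_loader.py`. The helper names `resolve_input` and `build_parser`, the `IMAGE_EXTENSIONS` set, and the reworded help text are all assumptions for illustration; only the `-i` / `--input-video` flag comes from the existing parser.

```python
import argparse
from pathlib import Path

# Assumed set of frame image extensions; adjust to whatever the delivery uses.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def resolve_input(path_str):
    """Classify the input path (hypothetical helper).

    Returns ("images", sorted list of image paths) for a directory,
    or ("video", [path]) for a single file.
    """
    path = Path(path_str)
    if path.is_dir():
        # Keep only files with a known image extension, in a stable order.
        images = sorted(
            p for p in path.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
        )
        return "images", images
    if path.is_file():
        return "video", [path]
    raise FileNotFoundError(f"input path does not exist: {path_str}")


def build_parser():
    """Build a parser that keeps the existing flag but widens its meaning."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-i", "--input-video",
                        help="filepath for the video to be processed, "
                             "or a directory of extracted frame images.",
                        required=True)
    return parser
```

With this, the caller would run something like `kind, paths = resolve_input(build_parser().parse_args().input_video)` and branch on `kind` to choose between decoding frames from the video and reading the image files directly.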

keighrim commented 1 month ago

Reopened by mistake when I pushed a branch under the old name.