clamsproject / aapb-annenv-swt-hitl-clustering

Annotation environment for human-in-the-loop clustering-based image labelling

running shot detection #2

Open keighrim opened 11 months ago

keighrim commented 11 months ago

As a preprocessing step for #1, we'd like to run shot detection software first to eliminate duplicate images. This thread keeps track of discussion and progress on that component.

keighrim commented 11 months ago

Started running shot detection at a larger scale, using 2 computers. I shuffled the lines of the file I generated in the linked issue and split it into smaller files of 100 lines each (87 files in total).

cat ~/all_baapb_guids.txt | sort -R | split -l 100
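
For anyone who wants a reproducible version of the shuffle-and-split step, here is a minimal Python sketch of the same idea. It is not the script actually used for this run; the function names, the `guids_` file prefix, and the optional `seed` are hypothetical and only for illustration.

```python
import random
from pathlib import Path


def shuffle_and_split(guids, chunk_size=100, seed=None):
    """Shuffle the GUIDs and return consecutive chunks of at most chunk_size items."""
    shuffled = list(guids)
    # A fixed seed makes the shuffle repeatable; seed=None uses system randomness.
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i:i + chunk_size] for i in range(0, len(shuffled), chunk_size)]


def write_chunks(chunks, out_dir, prefix="guids_"):
    """Write each chunk to its own text file, one GUID per line, and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for idx, chunk in enumerate(chunks):
        path = out / f"{prefix}{idx:03d}.txt"
        path.write_text("\n".join(chunk) + "\n")
        paths.append(path)
    return paths
```

Each machine can then be handed a disjoint subset of the resulting files, so no video is processed twice.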

The script has been running on two computers for about 9 hours now, and ~500 videos have been processed so far.

Due to a bug in the script (that I introduced), the first 4 output files (400 videos) do not have "total length" information, but that shouldn't be a problem; we can fix them up later.

From eyeballing it, PySceneDetect takes roughly 3-5 minutes to process each video.
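
As a rough back-of-the-envelope check, the progress numbers above can be turned into an estimate of the remaining run time. This sketch assumes the observed combined throughput of the two computers stays constant, and it takes 8,700 videos (87 files of 100 lines) as an upper bound on the total, since the last file may be shorter. The function name is hypothetical.

```python
def estimate_remaining_hours(done, elapsed_hours, total):
    """Estimate the hours left, assuming the observed throughput stays constant."""
    if done <= 0 or elapsed_hours <= 0:
        raise ValueError("need some completed work to estimate throughput")
    hours_per_video = elapsed_hours / done
    return max(total - done, 0) * hours_per_video


# Figures reported in this thread: ~500 videos done in ~9 hours across both machines.
# The total of 8700 is an assumed upper bound (87 files x 100 lines).
remaining = estimate_remaining_hours(done=500, elapsed_hours=9, total=8700)
print(f"about {remaining:.0f} hours (~{remaining / 24:.1f} days) remaining")
```

The estimate is only as good as the approximate counts it is built from, so it should be read as an order of magnitude rather than a schedule.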