Hello, thank you for your contribution. I would like to ask you three questions.
When I use the official script to convert the videos to RGB images at 30 FPS, it is very slow. Is this normal? (It takes about two days to process all the videos.) Also, how much storage space does the processed data occupy?
Can you list the directory structure of the training data required by Vistracker in detail?
The article does not mention training time. How long does it take to train Vistracker on an A100 80G?
Yes, this is normal. There are ~1.2k videos in total to process, so it can take a long time. I usually use multiple CPUs in parallel for this kind of job.
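For anyone wanting to do the same, here is a minimal sketch of splitting the videos across CPU processes. It is not the official script: `shard`, `convert_shard`, the `.mp4` extension, and the worker count are all assumptions for illustration, and the actual per-video conversion call has to be filled in.

```python
import sys
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def shard(items, num_shards):
    """Split items into num_shards round-robin lists of near-equal size."""
    return [items[i::num_shards] for i in range(num_shards)]


def convert_shard(video_paths):
    """Hypothetical worker: process one shard and return how many videos it saw."""
    for path in video_paths:
        # Placeholder: call the conversion for a single video here.
        pass
    return len(video_paths)


if __name__ == "__main__" and len(sys.argv) == 2:
    video_root = Path(sys.argv[1])  # folder holding the videos (assumed layout)
    num_workers = 8                 # assumed; set to the number of free CPU cores
    videos = sorted(video_root.rglob("*.mp4"))  # assumed file extension
    with ProcessPoolExecutor(max_workers=num_workers) as pool:
        done = sum(pool.map(convert_shard, shard(videos, num_workers)))
    print(f"processed {done} videos")
```

The same `shard` helper also works for cluster array jobs: each job takes one shard index and processes only its own list.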
I basically follow the same structure as the original BEHAVE dataset. Specifically, the data needed for training are:
pre-computed object visibility files: these are packed in this file; see also the documentation
For training, I used 4 GTX8000 GPUs. It took around 35 hours to converge. One A100 80G is roughly comparable to 2 GTX8000s, so I guess it would take ~70 hours.
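The ~70 hour figure is a simple GPU-hours scaling. A rough sketch of that arithmetic, assuming training throughput scales linearly with GPU count (real scaling is usually somewhat worse); `estimate_hours` is a hypothetical helper, not part of the repository:

```python
def estimate_hours(ref_gpus, ref_hours, equiv_ref_gpus):
    """Scale wall-clock time by GPU count, assuming linear throughput scaling."""
    gpu_hours = ref_gpus * ref_hours      # total work in reference GPU-hours
    return gpu_hours / equiv_ref_gpus     # spread over the available equivalent GPUs


# 4 GPUs for 35 h, rerun on hardware worth about 2 of those GPUs
print(estimate_hours(4, 35, 2))  # 70.0
```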