zwq456 / CLIP-VIS

[IEEE TCSVT] Official PyTorch Implementation of CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation.
Apache License 2.0

Training Time #1

Closed AmingWu closed 8 months ago

AmingWu commented 8 months ago

Dear Authors,

How long does your method need to train?

SCYF123 commented 8 months ago

We trained our method with a ResNet-50 backbone on four NVIDIA GeForce RTX 3090 GPUs. Training took two days and nine hours.

AmingWu commented 8 months ago

Thanks. Did you also train the method from [7] (Towards open-vocabulary video instance segmentation)? If so, how long did it take?

SCYF123 commented 8 months ago

Sorry, we have not trained the method from [7] to completion.