Open niu-mc opened 9 months ago
Thank you for your interest in our work. I recommend using the features we have already pre-extracted, because the full pipeline (extracting bboxes with GLIP, cropping the images to those bboxes, then extracting features for each crop with CLIP) is very time-consuming.
If you wish to train on your own dataset, you can implement the two remaining steps yourself: bbox cropping (using either OpenCV or Pillow), followed by CLIP feature extraction on each crop.
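In case it helps, here is a rough sketch of the crop -> encode step described above. It is not the code used in this repo. The assumptions: boxes are `(x1, y1, x2, y2)` in pixel coordinates (check what your detector actually outputs), `image` is a Pillow-style object exposing `.size` and `.crop(box)`, and `encode_image` is a hypothetical callable that you supply to wrap your CLIP image encoder. The image encoder takes an image as input, so the box handling happens in the cropping step before CLIP is called.

```python
def clamp_bbox(bbox, width, height):
    """Clip an (x1, y1, x2, y2) pixel box to the image bounds.

    Returns None when the clipped box has no area.
    The (x1, y1, x2, y2) layout is an assumption; adapt it to your detector.
    """
    x1, y1, x2, y2 = (int(round(v)) for v in bbox)
    x1, x2 = max(0, min(x1, width)), max(0, min(x2, width))
    y1, y2 = max(0, min(y1, height)), max(0, min(y2, height))
    if x2 <= x1 or y2 <= y1:
        return None
    return (x1, y1, x2, y2)


def extract_region_features(image, bboxes, encode_image):
    """Crop each box out of `image` and encode the crop.

    `image` needs `.size` -> (width, height) and `.crop(box)`, as Pillow images have.
    `encode_image` is a hypothetical user-supplied callable (e.g. a CLIP wrapper).
    Returns (box, feature) pairs so skipped boxes do not break the alignment.
    """
    width, height = image.size
    results = []
    for bbox in bboxes:
        box = clamp_bbox(bbox, width, height)
        if box is None:
            continue  # box lies outside the image or is degenerate
        crop = image.crop(box)
        results.append((box, encode_image(crop)))
    return results
```

With Pillow, `Image.open(path).convert("RGB")` gives an object that fits the `image` argument. What `encode_image` does internally (preprocessing, batching, device placement) depends on the CLIP implementation you use, so it is left out here.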
Where is the bbox handling in CLIP?