zjukg / Structure-CLIP

[Paper][AAAI2024] Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations
https://arxiv.org/abs/2305.06152

No module named 'blip_utils' #1

Closed: Lotsoliu closed this issue 10 months ago

Lotsoliu commented 10 months ago

Hi, I'm new to this task. The code doesn't seem to run without the blip_utils package, but I don't know where to get it.

Traceback (most recent call last):
  File "./model/train.py", line 7, in <module>
    from utils import get_args, set_manualSeed, image_transform, WinoLoss, CLIPLoss, MarginLoss
    from blip_utils import get_rank, all_gather_batch
ModuleNotFoundError: No module named 'blip_utils'

Lotsoliu commented 10 months ago

image

BigHyf commented 10 months ago

Thanks for your attention. We don't use CLIPLoss in utils.py, so the line "from blip_utils import get_rank, all_gather_batch" is not needed. In the future, I will clean up the code and remove the irrelevant parts.
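
For readers who would rather not edit the import out by hand, one possible workaround is to guard it. This is only a sketch, not the repository's code: the fallback bodies are assumptions for a single-process run, and the assumed signature of all_gather_batch (a list of tensors in, a list out) is not confirmed anywhere in this thread.

```python
# Hypothetical guard for the top of utils.py. The fallbacks are assumptions
# intended for single-process runs; they are not the repository's code.
try:
    from blip_utils import get_rank, all_gather_batch
except ModuleNotFoundError:

    def get_rank():
        """Single-process fallback: the only process is rank 0."""
        return 0

    def all_gather_batch(tensors):
        """Single-process fallback: nothing to gather, so return the inputs unchanged."""
        return list(tensors)
```

With this guard, the import in utils.py no longer fails when blip_utils is absent. Since the reply says CLIPLoss is unused, simply deleting the import line should work equally well.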

Lotsoliu commented 10 months ago

Thanks a lot!

wdi-nancy commented 5 months ago

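May I ask why the authors no longer use CLIPLoss? The paper states: "in order to maintain the general representation ability of the model, we combine the original mini-batch image-text contrastive learning loss and the proposed loss for joint training." With CLIPLoss removed, how is the general representation ability maintained? Also, which experiments demonstrate this ability? From the tables, it seems there are only results for the downstream relation and attribution tasks on VG.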
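
To make the quoted sentence concrete, here is a rough sketch of what "joint training" with a mini-batch image-text contrastive loss could look like. It is a generic illustration and not the paper's or the repository's implementation: the names contrastive_loss, joint_loss, extra_loss and weight, the temperature value, and the plain weighted sum are all assumptions.

```python
import math


def contrastive_loss(sim, temperature=0.07):
    """Symmetric image-text contrastive (InfoNCE-style) loss.

    sim[i][j] is the similarity of image i and text j in a mini-batch;
    matched pairs sit on the diagonal. Names and defaults are assumptions.
    """
    n = len(sim)

    def cross_entropy(rows):
        total = 0.0
        for i, row in enumerate(rows):
            logits = [s / temperature for s in row]
            m = max(logits)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(l - m) for l in logits))
            total += log_z - logits[i]  # target class is the diagonal entry
        return total / n

    columns = [list(c) for c in zip(*sim)]  # text-to-image direction
    return 0.5 * (cross_entropy(sim) + cross_entropy(columns))


def joint_loss(sim, extra_loss, weight=1.0):
    """Hypothetical joint objective: contrastive term plus a weighted extra term."""
    return contrastive_loss(sim) + weight * extra_loss
```

Here extra_loss stands in for whatever structure-aware loss the paper proposes; dropping the contrastive term corresponds to training on extra_loss alone, which is the situation the question above asks about.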