mlzxy / devit

CoRL 2024
https://mlzxy.github.io/devit
MIT License

demo.py use vits #32

Open hezhengting opened 11 months ago

hezhengting commented 11 months ago

I modified `demo/demo.py` to use ViT-S like this:

```python
def main(
    # config_file="configs/open-vocabulary/lvis/vitl.yaml",
    # rpn_config_file="configs/RPN/mask_rcnn_R_50_FPN_1x.yaml",
    # model_path="weights/trained/open-vocabulary/lvis/vitl_0069999.pth",
    config_file="configs/open-vocabulary/lvis/vits.yaml",
    rpn_config_file="configs/RPN/mask_rcnn_R_50_FPN_1x.yaml",
    model_path="weights/trained/open-vocabulary/lvis/vits_0059999.pth",
```

but I got this error:

```
feats = roi_features.transpose(-2, -1) @ class_weights.T
RuntimeError: Expected size for first two dimensions of batch2 tensor to be: [1000, 384] but got: [1000, 1024].
```

I guess `configs/RPN/mask_rcnn_R_50_FPN_1x.yaml` needs to be modified accordingly. Could you provide it? Thanks.
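For context (these are the standard DINOv2 embedding sizes, not something specific to this repo): ViT-S/14 outputs 384-dim features while ViT-L/14 outputs 1024-dim features, so the traceback suggests the ROI features now come from ViT-S but `class_weights` (the class prototypes) were still built with ViT-L. A minimal sketch of the inner-dimension check that fails (the helper and names are illustrative, not DE-ViT code):

```python
# Standard DINOv2 embedding sizes per backbone (illustrative lookup table).
EMBED_DIM = {"dinov2_vits14": 384, "dinov2_vitb14": 768, "dinov2_vitl14": 1024}

def matmul_inner_dims_ok(roi_dim: int, proto_dim: int) -> bool:
    """feats = roi_features.transpose(-2, -1) @ class_weights.T multiplies
    [N, roi_dim] by a [proto_dim, C] transpose, so the inner dims must match."""
    return roi_dim == proto_dim

# ViT-S features (384) against ViT-L prototypes (1024): the mismatch in the traceback.
print(matmul_inner_dims_ok(EMBED_DIM["dinov2_vits14"], EMBED_DIM["dinov2_vitl14"]))  # False
# Prototypes rebuilt with dinov2_vits14 make the shapes agree again.
print(matmul_inner_dims_ok(EMBED_DIM["dinov2_vits14"], EMBED_DIM["dinov2_vits14"]))  # True
```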

cyiheng commented 9 months ago

You also need to change the backbone used to build the prototypes.

In `demo/build_prototypes.ipynb`, replace `model = torch.hub.load('facebookresearch/dinov2', 'dinov2_vitl14')` with `model = torch.hub.load('facebookresearch/dinov2', 'dinov2_vits14')`.

Hope it helps
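To keep the two changes in sync, a small sketch of pairing each detector config with its DINOv2 backbone (the config-to-backbone mapping is my assumption based on the file names in this thread, not something the repo ships):

```python
# Hypothetical pairing of detector configs with DINOv2 backbone names,
# inferred from the file names above; only the vits/vitl entries appear in this thread.
BACKBONE_FOR_CONFIG = {
    "configs/open-vocabulary/lvis/vits.yaml": "dinov2_vits14",
    "configs/open-vocabulary/lvis/vitl.yaml": "dinov2_vitl14",
}

def load_prototype_backbone(config_file: str):
    """Load the DINOv2 backbone matching the detector config (sketch).

    This mirrors the torch.hub call in demo/build_prototypes.ipynb; the first
    call downloads the pretrained weights.
    """
    import torch  # imported here so the lookup table is usable without torch
    return torch.hub.load('facebookresearch/dinov2', BACKBONE_FOR_CONFIG[config_file])
```

Usage in the notebook would then be `model = load_prototype_backbone("configs/open-vocabulary/lvis/vits.yaml")`, so the prototypes are always built with the same ViT size the detector expects.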