mlzxy / devit

CoRL 2024
https://mlzxy.github.io/devit
MIT License

demo.py use vits #32

Open hezhengting opened 11 months ago

hezhengting commented 11 months ago

I modified `demo/demo.py` to use ViT-S like this:

```python
def main(
    # config_file="configs/open-vocabulary/lvis/vitl.yaml",
    # rpn_config_file="configs/RPN/mask_rcnn_R_50_FPN_1x.yaml",
    # model_path="weights/trained/open-vocabulary/lvis/vitl_0069999.pth",
    config_file="configs/open-vocabulary/lvis/vits.yaml",
    rpn_config_file="configs/RPN/mask_rcnn_R_50_FPN_1x.yaml",
    model_path="weights/trained/open-vocabulary/lvis/vits_0059999.pth",
```

but I got this error:

```
feats = roi_features.transpose(-2, -1) @ class_weights.T
RuntimeError: Expected size for first two dimensions of batch2 tensor to be: [1000, 384] but got: [1000, 1024].
```

I guess `configs/RPN/mask_rcnn_R_50_FPN_1x.yaml` needs to be modified accordingly. Could you provide it? Thanks.
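For context (these are the standard DINOv2 embedding sizes, not something specific to this repo): ViT-S/14 outputs 384-dim features while ViT-L/14 outputs 1024-dim features, so the traceback suggests the ROI features now come from ViT-S but `class_weights` (the class prototypes) were still built with ViT-L. A minimal sketch of the inner-dimension check that fails (the helper and names are illustrative, not DE-ViT code):

```python
# Standard DINOv2 embedding sizes per backbone (illustrative lookup table).
EMBED_DIM = {"dinov2_vits14": 384, "dinov2_vitb14": 768, "dinov2_vitl14": 1024}

def matmul_inner_dims_ok(roi_dim: int, proto_dim: int) -> bool:
    """feats = roi_features.transpose(-2, -1) @ class_weights.T multiplies
    [N, roi_dim] by a [proto_dim, C] transpose, so the inner dims must match."""
    return roi_dim == proto_dim

# ViT-S features (384) against ViT-L prototypes (1024): the mismatch in the traceback.
print(matmul_inner_dims_ok(EMBED_DIM["dinov2_vits14"], EMBED_DIM["dinov2_vitl14"]))  # False
# Prototypes rebuilt with dinov2_vits14 make the shapes agree again.
print(matmul_inner_dims_ok(EMBED_DIM["dinov2_vits14"], EMBED_DIM["dinov2_vits14"]))  # True
```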

cyiheng commented 9 months ago

You also need to change the backbone used to build the prototypes.

In `demo/build_prototypes.ipynb`, replace `model = torch.hub.load('facebookresearch/dinov2', 'dinov2_vitl14')` with `model = torch.hub.load('facebookresearch/dinov2', 'dinov2_vits14')`.

Hope it helps
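To keep the two changes in sync, a small sketch of pairing each detector config with its DINOv2 backbone (the config-to-backbone mapping is my assumption based on the file names in this thread, not something the repo ships):

```python
# Hypothetical pairing of detector configs with DINOv2 backbone names,
# inferred from the file names above; only the vits/vitl entries appear in this thread.
BACKBONE_FOR_CONFIG = {
    "configs/open-vocabulary/lvis/vits.yaml": "dinov2_vits14",
    "configs/open-vocabulary/lvis/vitl.yaml": "dinov2_vitl14",
}

def load_prototype_backbone(config_file: str):
    """Load the DINOv2 backbone matching the detector config (sketch).

    This mirrors the torch.hub call in demo/build_prototypes.ipynb; the first
    call downloads the pretrained weights.
    """
    import torch  # imported here so the lookup table is usable without torch
    return torch.hub.load('facebookresearch/dinov2', BACKBONE_FOR_CONFIG[config_file])
```

Usage in the notebook would then be `model = load_prototype_backbone("configs/open-vocabulary/lvis/vits.yaml")`, so the prototypes are always built with the same ViT size the detector expects.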