salesforce / LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence
BSD 3-Clause "New" or "Revised" License

how to eval VizWiz? #352

Open jorie-peng opened 1 year ago

jorie-peng commented 1 year ago

Thanks for the released model. But when I use the InstructBLIP Vicuna-7B model to evaluate on VizWiz, I get 26% accuracy, while the paper reports 34.5%. Is there any special setting, for example for the "unanswerable" answer?

With the same experiment setting I can reproduce the reported results on OK-VQA and GQA; only VizWiz is off.
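
For reference, VizWiz is scored with the same soft VQA accuracy as VQAv2: each question has 10 crowd answers, and "unanswerable" is treated as an ordinary answer string. Below is a minimal Python scoring sketch, assuming simple JSON layouts for the prediction and annotation files (this is not the official evaluation script):

    import json

    def vqa_soft_accuracy(pred, human_answers):
        """Simplified soft VQA accuracy: min(#annotators who gave this answer / 3, 1).

        The official script also normalizes punctuation/articles and averages over
        annotator subsets; this sketch skips that for brevity.
        """
        pred = pred.strip().lower()
        matches = sum(a.strip().lower() == pred for a in human_answers)
        return min(matches / 3.0, 1.0)

    # Assumed file layouts (not the official eval harness):
    #   predictions.json : [{"image": "...", "answer": "..."}]
    #   annotations.json : VizWiz val annotations with 10 crowd answers per question
    preds = {p["image"]: p["answer"] for p in json.load(open("predictions.json"))}
    annos = json.load(open("annotations.json"))

    scores = []
    for ann in annos:
        pred = preds.get(ann["image"], "")      # "unanswerable" counts like any other answer
        answers = [a["answer"] for a in ann["answers"]]
        scores.append(vqa_soft_accuracy(pred, answers))

    print("accuracy: {:.2%}".format(sum(scores) / len(scores)))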

Here is my setting:

  1. model: https://storage.googleapis.com/sfr-vision-language-research/LAVIS/models/InstructBLIP/instruct_blip_vicuna7b_trimmed.pth
  2. data: the "testdev" and "val" splits of VizWiz (around 8k test and 4k val images) from the official website.
  3. eval config:

    run:
    task: vqa
    # optimization-specific
    batch_size_train: 16
    batch_size_eval: 64
    num_workers: 4
    
    # inference-specific
    max_len: 10
    min_len: 1
    num_beams: 5
    inference_method: "generate"
    prompt: "{}"
    
    seed: 42
    output_dir: "output/BLIP2/VizWiz"
    
    evaluate: True
    test_splits: ["test"]
    
    # distribution-specific
    device: "cuda"
    world_size: 1
    dist_url: "env://"
    distributed: True
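
One thing worth sanity-checking before a full run is whether the prompt ever lets the model output "unanswerable" at all, since a large fraction of VizWiz questions are unanswerable. A quick interactive Python check using LAVIS's load_model_and_preprocess (the prompt wording and image path below are illustrative assumptions, not the exact setting behind the paper number):

    import torch
    from PIL import Image
    from lavis.models import load_model_and_preprocess

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # InstructBLIP Vicuna-7B as registered in LAVIS; weights download automatically.
    model, vis_processors, _ = load_model_and_preprocess(
        name="blip2_vicuna_instruct", model_type="vicuna7b", is_eval=True, device=device
    )

    raw_image = Image.open("VizWiz_val_00000001.jpg").convert("RGB")  # any local VizWiz val image
    image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)

    # Hypothetical prompt that explicitly allows an "unanswerable" response; compare its
    # output with the plain "{}" prompt used in the config above.
    prompt = ("Question: what does the label on this can say? "
              "When the information in the image is insufficient, answer 'unanswerable'. "
              "Short answer:")
    print(model.generate({"image": image, "prompt": prompt}, num_beams=5, max_length=10))
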
lavoiems commented 1 year ago

Hi @jorie-peng,

Do you have any update on this? Have you been able to reproduce the results for VizWiz?

weiyueli7 commented 1 year ago

Hi @jorie-peng and @lavoiems. Have you guys figured out how to reproduce the results for VizWiz yet? Thanks!

jorie-peng commented 1 year ago

> Hi @jorie-peng and @lavoiems. Have you guys figured out how to reproduce the results for VizWiz yet? Thanks!

I haven't; maybe @LiJunnan1992 knows.

DoggyLu commented 1 year ago

Hello, I also want to test the InstructBLIP model on the COCO dataset, but I don't know how. I have an A100 GPU running CentOS; could you tell me how to configure the corresponding file?
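
As a quick starting point, you can run the model on single COCO images in Python with the same loading API before setting up a full eval config; the image path below is a placeholder:

    import torch
    from PIL import Image
    from lavis.models import load_model_and_preprocess

    device = "cuda" if torch.cuda.is_available() else "cpu"

    model, vis_processors, _ = load_model_and_preprocess(
        name="blip2_vicuna_instruct", model_type="vicuna7b", is_eval=True, device=device
    )

    # Placeholder path: point this at a local COCO val2014 image.
    raw_image = Image.open("coco/images/val2014/COCO_val2014_000000391895.jpg").convert("RGB")
    image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)

    # Captioning-style instruction; for a full COCO evaluation you would instead adapt one of
    # the eval YAMLs under lavis/projects/ and launch evaluate.py with --cfg-path.
    print(model.generate({"image": image, "prompt": "Write a short description of the image."}))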