UX-Decoder / LLaVA-Grounding

Apache License 2.0
Visual prompt `pretrain weight` does not match #14

Open felixfuu opened 8 months ago

felixfuu commented 8 months ago

The two configs set different values for NUM_INTERACTIVE_TOKENS.

In this repo, NUM_INTERACTIVE_TOKENS=3: https://github.com/UX-Decoder/LLaVA-Grounding/blob/main/configs/semsam/visual_prompt_encoder.yaml#L161

In Semantic-SAM, NUM_INTERACTIVE_TOKENS=6: https://github.com/UX-Decoder/Semantic-SAM/blob/main/configs/semantic_sam_only_sa-1b_swinT.yaml#L144
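For anyone who wants to see which parameters such a config difference affects, here is a minimal sketch that compares parameter shapes between a model and a checkpoint. The parameter names and the hidden size of 256 are hypothetical placeholders, not taken from either repo. With PyTorch, the two dicts could be built from a `state_dict()` as `{k: tuple(v.shape) for k, v in sd.items()}`.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    model_shapes: Dict[str, Shape], ckpt_shapes: Dict[str, Shape]
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, model_shape, ckpt_shape) for shared keys whose shapes differ."""
    mismatches = []
    # Only keys present on both sides can have a shape conflict.
    for name in sorted(model_shapes.keys() & ckpt_shapes.keys()):
        if model_shapes[name] != ckpt_shapes[name]:
            mismatches.append((name, model_shapes[name], ckpt_shapes[name]))
    return mismatches


# Hypothetical shapes: the model is built with 3 interactive tokens,
# while the checkpoint was saved with 6. Key names are made up for illustration.
model_shapes = {
    "decoder.interactive_tokens.weight": (3, 256),
    "decoder.proj.weight": (256, 256),
}
ckpt_shapes = {
    "decoder.interactive_tokens.weight": (6, 256),
    "decoder.proj.weight": (256, 256),
}

for name, expected, found in find_shape_mismatches(model_shapes, ckpt_shapes):
    print(f"{name}: model expects {expected}, checkpoint has {found}")
```

Running the same comparison on the real model and the downloaded checkpoint would show whether the token count is the only source of the mismatch.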