InternLM / InternLM-XComposer

InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.

When will the proj and vit freeze? #273

Closed OpenJarvisAI closed 2 months ago

OpenJarvisAI commented 2 months ago

if training_args.fix_vit:
    model.vit.requires_grad_(False)
else:
    model.vit.requires_grad_(True)
    model.vit.vision_tower.vision_model.post_layernorm = torch.nn.Identity()

if training_args.fix_sampler:
    model.vision_proj.requires_grad_(False)
else:
    model.vision_proj.requires_grad_(True)

But in the finetune script, this is always False?

LightDXY commented 2 months ago

If you finetune the model on a new task with a large set of previously unseen images, you could set them to be learnable.
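For illustration, here is a minimal sketch of what flipping those two flags does to the trainable parameter count. It assumes a toy stand-in model: `ToyVLM`, `set_vision_trainable`, and `count_trainable` are hypothetical names, not part of the repository, and the `nn.Linear` layers only mimic the `vit` and `vision_proj` attribute names from the snippet in the question.

```python
import torch
from torch import nn


class ToyVLM(nn.Module):
    """Hypothetical stand-in that reuses the attribute names from the question."""

    def __init__(self):
        super().__init__()
        self.vit = nn.Linear(8, 8)          # placeholder for the vision encoder
        self.vision_proj = nn.Linear(8, 4)  # placeholder for the projector
        self.llm = nn.Linear(4, 2)          # placeholder for the language model


def set_vision_trainable(model, fix_vit, fix_sampler):
    # requires_grad_ sets the flag in place on every parameter of the submodule.
    model.vit.requires_grad_(not fix_vit)
    model.vision_proj.requires_grad_(not fix_sampler)


def count_trainable(module):
    # Count only parameters that will receive gradients.
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


model = ToyVLM()

# Both flags True: vit and vision_proj are frozen, only the toy llm trains.
set_vision_trainable(model, fix_vit=True, fix_sampler=True)
frozen = count_trainable(model)

# Both flags False: vit and vision_proj become learnable again.
set_vision_trainable(model, fix_vit=False, fix_sampler=False)
unfrozen = count_trainable(model)

print(frozen, unfrozen)
```

Printing the count before training starts is a quick way to confirm which setting a given launch script actually applied.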

OpenJarvisAI commented 2 months ago

I am just asking what internvx does; I want to re-implement it from scratch.