RylanSchaeffer / AstraFellowship-When-Do-VLM-Image-Jailbreaks-Transfer

Code for the arXiv paper "When Do Universal Image Jailbreaks Transfer Between Vision-Language Models?"

Phi-3-Based VLMs Not Usable Possibly Due to Incorrect Model Configuration #25


Qinyu-Allen-Zhao commented 2 months ago

Hi,

Thank you for your great work!

I've been trying to use the Phi-3-Instruct-4B VLM, but I ran into several issues:

https://github.com/RylanSchaeffer/prismatic-vlms/blob/95e3097f7a3bcc7f5ac95357daccb28b33a19363/prismatic/models/backbones/llm/phi.py#L9C1-L30C10

Initially, I noticed that the PhiForCausalLM class on line 27 should be Phi3ForCausalLM instead. Without this change, loading the model fails with a config error:

```
AttributeError: 'Phi3Config' object has no attribute 'partial_rotary_factor'
```
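For concreteness, here is a minimal sketch of the change I mean; the registry field names below follow the pattern used for the other LLM backbones in prismatic-vlms and may not match the actual file exactly:

```python
# prismatic/models/backbones/llm/phi.py -- sketch of the one-line class swap.
# Assumes transformers >= 4.41, which provides Phi3ForCausalLM; the dict layout
# mirrors the other backbone registries and is illustrative, not exact.
from transformers import Phi3ForCausalLM  # instead of PhiForCausalLM

PHI3_MODELS = {
    "phi-instruct-3+4b": {
        "llm_family": "phi",
        "llm_cls": Phi3ForCausalLM,  # was PhiForCausalLM, which expects a PhiConfig
        "hf_hub_path": "microsoft/Phi-3-mini-4k-instruct",
    },
}
```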

Even with that fix, the pre-trained VLM is still not usable. I inspected the saved checkpoint you provide (phi-instruct-3+4b+clip) and printed its keys and parameter shapes, roughly as in the sketch below:
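```python
# Hypothetical snippet for dumping keys/shapes from the released checkpoint;
# the file path and the "model" nesting are assumptions about the checkpoint layout.
import torch

ckpt = torch.load("phi-instruct-3+4b+clip/checkpoints/latest-checkpoint.pt", map_location="cpu")
state_dict = ckpt["model"] if isinstance(ckpt, dict) and "model" in ckpt else ckpt
for key, param in state_dict.items():
    if hasattr(param, "shape"):  # skip any non-tensor metadata entries
        print(f"Key: {key} | Param shape: {param.shape}")
```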

```
Key: llm.model.embed_tokens.weight | Param shape: torch.Size([32064, 3072])
Key: llm.model.layers.0.self_attn.q_proj.weight | Param shape: torch.Size([3072, 3072])
Key: llm.model.layers.0.self_attn.q_proj.bias | Param shape: torch.Size([3072])
Key: llm.model.layers.0.self_attn.k_proj.weight | Param shape: torch.Size([3072, 3072])
Key: llm.model.layers.0.self_attn.k_proj.bias | Param shape: torch.Size([3072])
Key: llm.model.layers.0.self_attn.v_proj.weight | Param shape: torch.Size([3072, 3072])
Key: llm.model.layers.0.self_attn.v_proj.bias | Param shape: torch.Size([3072])
Key: llm.model.layers.0.self_attn.dense.weight | Param shape: torch.Size([3072, 3072])
Key: llm.model.layers.0.self_attn.dense.bias | Param shape: torch.Size([3072])
Key: llm.model.layers.0.mlp.fc1.weight | Param shape: torch.Size([8192, 3072])
Key: llm.model.layers.0.mlp.fc1.bias | Param shape: torch.Size([8192])
Key: llm.model.layers.0.mlp.fc2.weight | Param shape: torch.Size([3072, 8192])
Key: llm.model.layers.0.mlp.fc2.bias | Param shape: torch.Size([3072])
Key: llm.model.layers.0.input_layernorm.weight | Param shape: torch.Size([3072])
Key: llm.model.layers.0.input_layernorm.bias | Param shape: torch.Size([3072])
...
```

For comparison, the following are the keys and parameter shapes of the reference checkpoint microsoft/Phi-3-mini-4k-instruct:

```
Key: model.embed_tokens.weight | Param shape: torch.Size([32064, 3072])
Key: model.layers.0.self_attn.o_proj.weight | Param shape: torch.Size([3072, 3072])
Key: model.layers.0.self_attn.qkv_proj.weight | Param shape: torch.Size([9216, 3072])
Key: model.layers.0.mlp.gate_up_proj.weight | Param shape: torch.Size([16384, 3072])
Key: model.layers.0.mlp.down_proj.weight | Param shape: torch.Size([3072, 8192])
Key: model.layers.0.input_layernorm.weight | Param shape: torch.Size([3072])
Key: model.layers.0.post_attention_layernorm.weight | Param shape: torch.Size([3072])
```
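The reference dump above can be reproduced with standard transformers calls, for example:

```python
# Print the reference parameter names/shapes straight from the Hugging Face
# checkpoint; nothing here is specific to the prismatic-vlms codebase.
# (Older transformers versions may additionally need trust_remote_code=True.)
from transformers import AutoModelForCausalLM

ref = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct", torch_dtype="auto")
for name, param in ref.named_parameters():
    print(f"Key: {name} | Param shape: {param.shape}")
```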

And the following are the keys and parameter shapes of the phi-2+3b checkpoint:

```
llm.model.embed_tokens.weight torch.Size([50304, 2560])
llm.model.layers.0.self_attn.q_proj.weight torch.Size([2560, 2560])
llm.model.layers.0.self_attn.q_proj.bias torch.Size([2560])
llm.model.layers.0.self_attn.k_proj.weight torch.Size([2560, 2560])
llm.model.layers.0.self_attn.k_proj.bias torch.Size([2560])
llm.model.layers.0.self_attn.v_proj.weight torch.Size([2560, 2560])
llm.model.layers.0.self_attn.v_proj.bias torch.Size([2560])
llm.model.layers.0.self_attn.dense.weight torch.Size([2560, 2560])
llm.model.layers.0.self_attn.dense.bias torch.Size([2560])
llm.model.layers.0.mlp.fc1.weight torch.Size([10240, 2560])
llm.model.layers.0.mlp.fc1.bias torch.Size([10240])
llm.model.layers.0.mlp.fc2.weight torch.Size([2560, 10240])
llm.model.layers.0.mlp.fc2.bias torch.Size([2560])
llm.model.layers.0.input_layernorm.weight torch.Size([2560])
llm.model.layers.0.input_layernorm.bias torch.Size([2560])
```

Comparing against the phi-2+3b checkpoint, the parameter names match exactly; only the tensor sizes differ. In other words, the saved Phi-3 checkpoint uses Phi-2-style layer names (q_proj/k_proj/v_proj, dense, fc1/fc2, all with biases) rather than Phi-3's fused qkv_proj/o_proj and gate_up_proj/down_proj layers. This leads me to suspect that training actually used the Phi-2 architecture, but with Phi-3's hidden dimensions and vocabulary size (a quick sanity check is sketched below).
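One way to test this hypothesis is to instantiate a Phi-2-style model with Phi-3-mini-sized dimensions and compare its parameter names and shapes against the saved checkpoint. The config values below are read off the shapes printed above and the Phi-3-mini config; everything else keeps transformers defaults, so this is only a sketch:

```python
# Build PhiForCausalLM (the Phi-2 architecture) with Phi-3-mini-sized dimensions
# and check whether its parameter names/shapes line up with the saved checkpoint.
from transformers import PhiConfig, PhiForCausalLM

cfg = PhiConfig(
    vocab_size=32064,        # from llm.model.embed_tokens.weight above
    hidden_size=3072,        # from the 3072-dim projections above
    intermediate_size=8192,  # from llm.model.layers.0.mlp.fc1.weight above
    num_hidden_layers=32,    # Phi-3-mini depth
    num_attention_heads=32,  # Phi-3-mini head count
)
model = PhiForCausalLM(cfg)
for name, param in list(model.named_parameters())[:15]:
    print(f"{name}: {tuple(param.shape)}")
# Expect q_proj / k_proj / v_proj / dense / fc1 / fc2 (with biases) at 3072-dim
# shapes, i.e. the same keys as the phi-instruct-3+4b+clip checkpoint,
# modulo the leading "llm." prefix.
```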

I believe this discrepancy won't affect most of the conclusions in the paper, but could you please look into it?

Thanks for your attention to this matter.

Qinyu

RylanSchaeffer commented 2 months ago

We ran into engineering problems with Phi-3 and decided not to spend the effort fixing it. I don't believe we included it in the paper (although please double-check). The other models should work, though.

Qinyu-Allen-Zhao commented 3 weeks ago

Thank you for your response!