kxxseola closed 1 year ago
Thanks for reporting this! I think it's due to the "best_score" not being saved in the pretrained checkpoint. 1e0264a8860f7abadfee8858470073d8dfc0d6c8 should fix this by initializing it to 0 if it doesn't exist. The actual score of the pretrained model shouldn't really matter if you're finetuning, since it will likely be worse at iteration 0 than after a few rounds of finetuning anyway.
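Roughly, the fix amounts to something like the sketch below. This is not the exact diff from the commit: the `load_best_score` helper and the example checkpoint dicts are made up for illustration, and I'm assuming the checkpoint is a plain dict.

```python
def load_best_score(checkpoint, default=0):
    """Return the stored best score, or `default` if the checkpoint has none."""
    # dict.get avoids a KeyError when "best_score" was never saved.
    return checkpoint.get("best_score", default)


# Hypothetical checkpoints: the pretrained one has no "best_score" entry.
pretrained = {"epoch": 0, "state_dict": {}}
finetuned = {"epoch": 3, "state_dict": {}, "best_score": 0.42}

print(load_best_score(pretrained))
print(load_best_score(finetuned))
```

Starting from 0 only means the first evaluation during finetuning will almost certainly count as a new best.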
Can you let me know if it works?
Thank you for your quick response! I could only try it just now because I couldn't get an A100 instance earlier, and I'm hitting this error:
Traceback (most recent call last):
File ".../fromage/main.py", line 642, in <module>
main(sys.argv[1:])
File ".../fromage/main.py", line 197, in main
mp.spawn(main_worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args))
File ".../anaconda3/envs/fromage/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 239, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File ".../anaconda3/envs/fromage/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 197, in start_processes
while not context.join():
File ".../anaconda3/envs/fromage/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 160, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File ".../anaconda3/envs/fromage/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
fn(i, *args)
File ".../fromage/main.py", line 325, in main_worker
best_score = best_score.to(args.gpu)
AttributeError: 'int' object has no attribute 'to'
Ah, this is what happens when you don't test on a GPU... This line should be obsolete, because best_score doesn't need to be a tensor. 3d9bb8a49c947d8db6820484c888d8c90e7dfc97 should fix this, I think.
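As a rough illustration of why the `.to(args.gpu)` call isn't needed: the score is only ever compared against another number, so a plain Python value is enough. The `as_python_number` helper and the `acc1` value below are hypothetical and not taken from the commit.

```python
def as_python_number(best_score):
    """Return best_score as a plain Python number.

    Tensor-like objects (anything with a zero-argument .item() method, such as
    a 0-dim torch.Tensor) are unwrapped; ints and floats pass through unchanged.
    """
    item = getattr(best_score, "item", None)
    return item() if callable(item) else best_score


# best_score defaults to the int 0 when the checkpoint does not store one.
best_score = as_python_number(0)

# Hypothetical validation score from the current epoch.
acc1 = 0.37
is_best = acc1 > best_score
best_score = max(acc1, best_score)
print(is_best, best_score)
```

Because no device transfer is involved, the same comparison works whether the checkpoint stored a tensor, a float, or nothing at all.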
Then I got a CUDA error related to the A100, so I tried the following:
$ export CUDA_LAUNCH_BLOCKING=1
$ pip3 uninstall torch torchvision torchaudio
$ pip3 install torch torchvision torchaudio
Now it says there are missing keys:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File ".../anaconda3/envs/fromage_2/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
fn(i, *args)
File ".../fromage/main.py", line 327, in main_worker
model.load_state_dict(checkpoint['state_dict'])
File ".../anaconda3/envs/fromage_2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DistributedDataParallel:
Missing key(s) in state_dict: "module.model.logit_scale", "module.model.lm.model.decoder.embed_tokens.weight", "module.model.lm.model.decoder.embed_positions.weight", "module.model.lm.model.decoder.final_layer_norm.weight", "module.model.lm.model.decoder.final_layer_norm.bias", "module.model.lm.model.decoder.layers.0.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.0.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.0.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.0.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.0.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.0.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.0.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.0.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.0.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.0.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.0.fc1.weight", "module.model.lm.model.decoder.layers.0.fc1.bias", "module.model.lm.model.decoder.layers.0.fc2.weight", "module.model.lm.model.decoder.layers.0.fc2.bias", "module.model.lm.model.decoder.layers.0.final_layer_norm.weight", "module.model.lm.model.decoder.layers.0.final_layer_norm.bias", "module.model.lm.model.decoder.layers.1.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.1.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.1.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.1.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.1.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.1.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.1.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.1.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.1.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.1.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.1.fc1.weight", 
"module.model.lm.model.decoder.layers.1.fc1.bias", "module.model.lm.model.decoder.layers.1.fc2.weight", "module.model.lm.model.decoder.layers.1.fc2.bias", "module.model.lm.model.decoder.layers.1.final_layer_norm.weight", "module.model.lm.model.decoder.layers.1.final_layer_norm.bias", "module.model.lm.model.decoder.layers.2.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.2.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.2.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.2.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.2.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.2.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.2.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.2.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.2.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.2.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.2.fc1.weight", "module.model.lm.model.decoder.layers.2.fc1.bias", "module.model.lm.model.decoder.layers.2.fc2.weight", "module.model.lm.model.decoder.layers.2.fc2.bias", "module.model.lm.model.decoder.layers.2.final_layer_norm.weight", "module.model.lm.model.decoder.layers.2.final_layer_norm.bias", "module.model.lm.model.decoder.layers.3.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.3.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.3.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.3.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.3.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.3.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.3.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.3.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.3.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.3.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.3.fc1.weight", 
"module.model.lm.model.decoder.layers.3.fc1.bias", "module.model.lm.model.decoder.layers.3.fc2.weight", "module.model.lm.model.decoder.layers.3.fc2.bias", "module.model.lm.model.decoder.layers.3.final_layer_norm.weight", "module.model.lm.model.decoder.layers.3.final_layer_norm.bias", "module.model.lm.model.decoder.layers.4.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.4.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.4.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.4.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.4.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.4.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.4.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.4.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.4.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.4.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.4.fc1.weight", "module.model.lm.model.decoder.layers.4.fc1.bias", "module.model.lm.model.decoder.layers.4.fc2.weight", "module.model.lm.model.decoder.layers.4.fc2.bias", "module.model.lm.model.decoder.layers.4.final_layer_norm.weight", "module.model.lm.model.decoder.layers.4.final_layer_norm.bias", "module.model.lm.model.decoder.layers.5.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.5.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.5.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.5.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.5.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.5.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.5.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.5.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.5.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.5.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.5.fc1.weight", 
"module.model.lm.model.decoder.layers.5.fc1.bias", "module.model.lm.model.decoder.layers.5.fc2.weight", "module.model.lm.model.decoder.layers.5.fc2.bias", "module.model.lm.model.decoder.layers.5.final_layer_norm.weight", "module.model.lm.model.decoder.layers.5.final_layer_norm.bias", "module.model.lm.model.decoder.layers.6.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.6.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.6.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.6.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.6.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.6.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.6.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.6.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.6.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.6.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.6.fc1.weight", "module.model.lm.model.decoder.layers.6.fc1.bias", "module.model.lm.model.decoder.layers.6.fc2.weight", "module.model.lm.model.decoder.layers.6.fc2.bias", "module.model.lm.model.decoder.layers.6.final_layer_norm.weight", "module.model.lm.model.decoder.layers.6.final_layer_norm.bias", "module.model.lm.model.decoder.layers.7.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.7.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.7.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.7.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.7.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.7.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.7.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.7.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.7.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.7.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.7.fc1.weight", 
"module.model.lm.model.decoder.layers.7.fc1.bias", "module.model.lm.model.decoder.layers.7.fc2.weight", "module.model.lm.model.decoder.layers.7.fc2.bias", "module.model.lm.model.decoder.layers.7.final_layer_norm.weight", "module.model.lm.model.decoder.layers.7.final_layer_norm.bias", "module.model.lm.model.decoder.layers.8.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.8.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.8.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.8.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.8.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.8.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.8.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.8.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.8.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.8.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.8.fc1.weight", "module.model.lm.model.decoder.layers.8.fc1.bias", "module.model.lm.model.decoder.layers.8.fc2.weight", "module.model.lm.model.decoder.layers.8.fc2.bias", "module.model.lm.model.decoder.layers.8.final_layer_norm.weight", "module.model.lm.model.decoder.layers.8.final_layer_norm.bias", "module.model.lm.model.decoder.layers.9.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.9.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.9.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.9.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.9.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.9.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.9.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.9.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.9.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.9.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.9.fc1.weight", 
"module.model.lm.model.decoder.layers.9.fc1.bias", "module.model.lm.model.decoder.layers.9.fc2.weight", "module.model.lm.model.decoder.layers.9.fc2.bias", "module.model.lm.model.decoder.layers.9.final_layer_norm.weight", "module.model.lm.model.decoder.layers.9.final_layer_norm.bias", "module.model.lm.model.decoder.layers.10.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.10.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.10.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.10.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.10.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.10.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.10.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.10.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.10.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.10.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.10.fc1.weight", "module.model.lm.model.decoder.layers.10.fc1.bias", "module.model.lm.model.decoder.layers.10.fc2.weight", "module.model.lm.model.decoder.layers.10.fc2.bias", "module.model.lm.model.decoder.layers.10.final_layer_norm.weight", "module.model.lm.model.decoder.layers.10.final_layer_norm.bias", "module.model.lm.model.decoder.layers.11.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.11.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.11.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.11.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.11.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.11.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.11.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.11.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.11.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.11.self_attn_layer_norm.bias", 
"module.model.lm.model.decoder.layers.11.fc1.weight", "module.model.lm.model.decoder.layers.11.fc1.bias", "module.model.lm.model.decoder.layers.11.fc2.weight", "module.model.lm.model.decoder.layers.11.fc2.bias", "module.model.lm.model.decoder.layers.11.final_layer_norm.weight", "module.model.lm.model.decoder.layers.11.final_layer_norm.bias", "module.model.lm.model.decoder.layers.12.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.12.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.12.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.12.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.12.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.12.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.12.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.12.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.12.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.12.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.12.fc1.weight", "module.model.lm.model.decoder.layers.12.fc1.bias", "module.model.lm.model.decoder.layers.12.fc2.weight", "module.model.lm.model.decoder.layers.12.fc2.bias", "module.model.lm.model.decoder.layers.12.final_layer_norm.weight", "module.model.lm.model.decoder.layers.12.final_layer_norm.bias", "module.model.lm.model.decoder.layers.13.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.13.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.13.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.13.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.13.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.13.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.13.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.13.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.13.self_attn_layer_norm.weight", 
"module.model.lm.model.decoder.layers.13.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.13.fc1.weight", "module.model.lm.model.decoder.layers.13.fc1.bias", "module.model.lm.model.decoder.layers.13.fc2.weight", "module.model.lm.model.decoder.layers.13.fc2.bias", "module.model.lm.model.decoder.layers.13.final_layer_norm.weight", "module.model.lm.model.decoder.layers.13.final_layer_norm.bias", "module.model.lm.model.decoder.layers.14.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.14.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.14.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.14.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.14.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.14.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.14.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.14.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.14.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.14.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.14.fc1.weight", "module.model.lm.model.decoder.layers.14.fc1.bias", "module.model.lm.model.decoder.layers.14.fc2.weight", "module.model.lm.model.decoder.layers.14.fc2.bias", "module.model.lm.model.decoder.layers.14.final_layer_norm.weight", "module.model.lm.model.decoder.layers.14.final_layer_norm.bias", "module.model.lm.model.decoder.layers.15.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.15.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.15.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.15.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.15.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.15.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.15.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.15.self_attn.out_proj.bias", 
"module.model.lm.model.decoder.layers.15.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.15.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.15.fc1.weight", "module.model.lm.model.decoder.layers.15.fc1.bias", "module.model.lm.model.decoder.layers.15.fc2.weight", "module.model.lm.model.decoder.layers.15.fc2.bias", "module.model.lm.model.decoder.layers.15.final_layer_norm.weight", "module.model.lm.model.decoder.layers.15.final_layer_norm.bias", "module.model.lm.model.decoder.layers.16.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.16.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.16.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.16.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.16.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.16.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.16.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.16.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.16.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.16.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.16.fc1.weight", "module.model.lm.model.decoder.layers.16.fc1.bias", "module.model.lm.model.decoder.layers.16.fc2.weight", "module.model.lm.model.decoder.layers.16.fc2.bias", "module.model.lm.model.decoder.layers.16.final_layer_norm.weight", "module.model.lm.model.decoder.layers.16.final_layer_norm.bias", "module.model.lm.model.decoder.layers.17.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.17.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.17.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.17.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.17.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.17.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.17.self_attn.out_proj.weight", 
"module.model.lm.model.decoder.layers.17.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.17.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.17.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.17.fc1.weight", "module.model.lm.model.decoder.layers.17.fc1.bias", "module.model.lm.model.decoder.layers.17.fc2.weight", "module.model.lm.model.decoder.layers.17.fc2.bias", "module.model.lm.model.decoder.layers.17.final_layer_norm.weight", "module.model.lm.model.decoder.layers.17.final_layer_norm.bias", "module.model.lm.model.decoder.layers.18.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.18.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.18.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.18.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.18.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.18.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.18.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.18.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.18.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.18.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.18.fc1.weight", "module.model.lm.model.decoder.layers.18.fc1.bias", "module.model.lm.model.decoder.layers.18.fc2.weight", "module.model.lm.model.decoder.layers.18.fc2.bias", "module.model.lm.model.decoder.layers.18.final_layer_norm.weight", "module.model.lm.model.decoder.layers.18.final_layer_norm.bias", "module.model.lm.model.decoder.layers.19.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.19.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.19.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.19.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.19.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.19.self_attn.q_proj.bias", 
"module.model.lm.model.decoder.layers.19.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.19.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.19.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.19.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.19.fc1.weight", "module.model.lm.model.decoder.layers.19.fc1.bias", "module.model.lm.model.decoder.layers.19.fc2.weight", "module.model.lm.model.decoder.layers.19.fc2.bias", "module.model.lm.model.decoder.layers.19.final_layer_norm.weight", "module.model.lm.model.decoder.layers.19.final_layer_norm.bias", "module.model.lm.model.decoder.layers.20.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.20.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.20.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.20.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.20.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.20.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.20.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.20.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.20.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.20.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.20.fc1.weight", "module.model.lm.model.decoder.layers.20.fc1.bias", "module.model.lm.model.decoder.layers.20.fc2.weight", "module.model.lm.model.decoder.layers.20.fc2.bias", "module.model.lm.model.decoder.layers.20.final_layer_norm.weight", "module.model.lm.model.decoder.layers.20.final_layer_norm.bias", "module.model.lm.model.decoder.layers.21.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.21.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.21.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.21.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.21.self_attn.q_proj.weight", 
"module.model.lm.model.decoder.layers.21.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.21.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.21.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.21.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.21.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.21.fc1.weight", "module.model.lm.model.decoder.layers.21.fc1.bias", "module.model.lm.model.decoder.layers.21.fc2.weight", "module.model.lm.model.decoder.layers.21.fc2.bias", "module.model.lm.model.decoder.layers.21.final_layer_norm.weight", "module.model.lm.model.decoder.layers.21.final_layer_norm.bias", "module.model.lm.model.decoder.layers.22.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.22.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.22.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.22.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.22.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.22.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.22.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.22.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.22.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.22.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.22.fc1.weight", "module.model.lm.model.decoder.layers.22.fc1.bias", "module.model.lm.model.decoder.layers.22.fc2.weight", "module.model.lm.model.decoder.layers.22.fc2.bias", "module.model.lm.model.decoder.layers.22.final_layer_norm.weight", "module.model.lm.model.decoder.layers.22.final_layer_norm.bias", "module.model.lm.model.decoder.layers.23.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.23.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.23.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.23.self_attn.v_proj.bias", 
"module.model.lm.model.decoder.layers.23.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.23.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.23.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.23.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.23.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.23.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.23.fc1.weight", "module.model.lm.model.decoder.layers.23.fc1.bias", "module.model.lm.model.decoder.layers.23.fc2.weight", "module.model.lm.model.decoder.layers.23.fc2.bias", "module.model.lm.model.decoder.layers.23.final_layer_norm.weight", "module.model.lm.model.decoder.layers.23.final_layer_norm.bias", "module.model.lm.model.decoder.layers.24.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.24.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.24.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.24.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.24.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.24.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.24.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.24.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.24.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.24.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.24.fc1.weight", "module.model.lm.model.decoder.layers.24.fc1.bias", "module.model.lm.model.decoder.layers.24.fc2.weight", "module.model.lm.model.decoder.layers.24.fc2.bias", "module.model.lm.model.decoder.layers.24.final_layer_norm.weight", "module.model.lm.model.decoder.layers.24.final_layer_norm.bias", "module.model.lm.model.decoder.layers.25.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.25.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.25.self_attn.v_proj.weight", 
"module.model.lm.model.decoder.layers.25.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.25.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.25.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.25.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.25.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.25.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.25.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.25.fc1.weight", "module.model.lm.model.decoder.layers.25.fc1.bias", "module.model.lm.model.decoder.layers.25.fc2.weight", "module.model.lm.model.decoder.layers.25.fc2.bias", "module.model.lm.model.decoder.layers.25.final_layer_norm.weight", "module.model.lm.model.decoder.layers.25.final_layer_norm.bias", "module.model.lm.model.decoder.layers.26.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.26.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.26.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.26.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.26.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.26.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.26.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.26.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.26.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.26.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.26.fc1.weight", "module.model.lm.model.decoder.layers.26.fc1.bias", "module.model.lm.model.decoder.layers.26.fc2.weight", "module.model.lm.model.decoder.layers.26.fc2.bias", "module.model.lm.model.decoder.layers.26.final_layer_norm.weight", "module.model.lm.model.decoder.layers.26.final_layer_norm.bias", "module.model.lm.model.decoder.layers.27.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.27.self_attn.k_proj.bias", 
"module.model.lm.model.decoder.layers.27.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.27.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.27.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.27.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.27.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.27.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.27.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.27.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.27.fc1.weight", "module.model.lm.model.decoder.layers.27.fc1.bias", "module.model.lm.model.decoder.layers.27.fc2.weight", "module.model.lm.model.decoder.layers.27.fc2.bias", "module.model.lm.model.decoder.layers.27.final_layer_norm.weight", "module.model.lm.model.decoder.layers.27.final_layer_norm.bias", "module.model.lm.model.decoder.layers.28.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.28.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.28.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.28.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.28.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.28.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.28.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.28.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.28.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.28.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.28.fc1.weight", "module.model.lm.model.decoder.layers.28.fc1.bias", "module.model.lm.model.decoder.layers.28.fc2.weight", "module.model.lm.model.decoder.layers.28.fc2.bias", "module.model.lm.model.decoder.layers.28.final_layer_norm.weight", "module.model.lm.model.decoder.layers.28.final_layer_norm.bias", "module.model.lm.model.decoder.layers.29.self_attn.k_proj.weight", 
"module.model.lm.model.decoder.layers.29.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.29.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.29.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.29.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.29.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.29.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.29.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.29.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.29.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.29.fc1.weight", "module.model.lm.model.decoder.layers.29.fc1.bias", "module.model.lm.model.decoder.layers.29.fc2.weight", "module.model.lm.model.decoder.layers.29.fc2.bias", "module.model.lm.model.decoder.layers.29.final_layer_norm.weight", "module.model.lm.model.decoder.layers.29.final_layer_norm.bias", "module.model.lm.model.decoder.layers.30.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.30.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.30.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.30.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.30.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.30.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.30.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.30.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.30.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.30.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.30.fc1.weight", "module.model.lm.model.decoder.layers.30.fc1.bias", "module.model.lm.model.decoder.layers.30.fc2.weight", "module.model.lm.model.decoder.layers.30.fc2.bias", "module.model.lm.model.decoder.layers.30.final_layer_norm.weight", "module.model.lm.model.decoder.layers.30.final_layer_norm.bias", 
"module.model.lm.model.decoder.layers.31.self_attn.k_proj.weight", "module.model.lm.model.decoder.layers.31.self_attn.k_proj.bias", "module.model.lm.model.decoder.layers.31.self_attn.v_proj.weight", "module.model.lm.model.decoder.layers.31.self_attn.v_proj.bias", "module.model.lm.model.decoder.layers.31.self_attn.q_proj.weight", "module.model.lm.model.decoder.layers.31.self_attn.q_proj.bias", "module.model.lm.model.decoder.layers.31.self_attn.out_proj.weight", "module.model.lm.model.decoder.layers.31.self_attn.out_proj.bias", "module.model.lm.model.decoder.layers.31.self_attn_layer_norm.weight", "module.model.lm.model.decoder.layers.31.self_attn_layer_norm.bias", "module.model.lm.model.decoder.layers.31.fc1.weight", "module.model.lm.model.decoder.layers.31.fc1.bias", "module.model.lm.model.decoder.layers.31.fc2.weight", "module.model.lm.model.decoder.layers.31.fc2.bias", "module.model.lm.model.decoder.layers.31.final_layer_norm.weight", "module.model.lm.model.decoder.layers.31.final_layer_norm.bias", "module.model.lm.lm_head.weight", "module.model.input_embeddings.weight", "module.model.visual_model.vision_model.embeddings.class_embedding", "module.model.visual_model.vision_model.embeddings.position_ids", "module.model.visual_model.vision_model.embeddings.patch_embedding.weight", "module.model.visual_model.vision_model.embeddings.position_embedding.weight", "module.model.visual_model.vision_model.pre_layrnorm.weight", "module.model.visual_model.vision_model.pre_layrnorm.bias", "module.model.visual_model.vision_model.encoder.layers.0.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.0.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.0.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.0.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.0.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.0.self_attn.q_proj.bias", 
"module.model.visual_model.vision_model.encoder.layers.0.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.0.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.0.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.0.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.0.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.0.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.0.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.0.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.0.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.0.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.1.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.1.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.1.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.1.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.1.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.1.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.1.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.1.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.1.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.1.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.1.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.1.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.1.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.1.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.1.layer_norm2.weight", 
"module.model.visual_model.vision_model.encoder.layers.1.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.2.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.2.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.2.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.2.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.2.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.2.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.2.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.2.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.2.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.2.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.2.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.2.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.2.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.2.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.2.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.2.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.3.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.3.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.3.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.3.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.3.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.3.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.3.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.3.self_attn.out_proj.bias", 
"module.model.visual_model.vision_model.encoder.layers.3.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.3.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.3.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.3.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.3.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.3.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.3.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.3.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.4.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.4.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.4.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.4.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.4.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.4.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.4.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.4.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.4.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.4.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.4.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.4.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.4.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.4.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.4.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.4.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.5.self_attn.k_proj.weight", 
"module.model.visual_model.vision_model.encoder.layers.5.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.5.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.5.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.5.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.5.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.5.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.5.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.5.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.5.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.5.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.5.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.5.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.5.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.5.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.5.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.6.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.6.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.6.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.6.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.6.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.6.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.6.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.6.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.6.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.6.layer_norm1.bias", 
"module.model.visual_model.vision_model.encoder.layers.6.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.6.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.6.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.6.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.6.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.6.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.7.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.7.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.7.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.7.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.7.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.7.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.7.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.7.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.7.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.7.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.7.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.7.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.7.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.7.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.7.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.7.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.8.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.8.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.8.self_attn.v_proj.weight", 
"module.model.visual_model.vision_model.encoder.layers.8.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.8.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.8.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.8.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.8.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.8.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.8.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.8.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.8.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.8.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.8.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.8.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.8.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.9.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.9.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.9.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.9.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.9.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.9.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.9.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.9.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.9.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.9.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.9.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.9.mlp.fc1.bias", 
"module.model.visual_model.vision_model.encoder.layers.9.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.9.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.9.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.9.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.10.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.10.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.10.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.10.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.10.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.10.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.10.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.10.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.10.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.10.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.10.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.10.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.10.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.10.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.10.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.10.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.11.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.11.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.11.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.11.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.11.self_attn.q_proj.weight", 
"module.model.visual_model.vision_model.encoder.layers.11.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.11.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.11.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.11.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.11.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.11.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.11.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.11.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.11.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.11.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.11.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.12.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.12.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.12.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.12.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.12.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.12.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.12.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.12.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.12.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.12.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.12.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.12.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.12.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.12.mlp.fc2.bias", 
"module.model.visual_model.vision_model.encoder.layers.12.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.12.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.13.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.13.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.13.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.13.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.13.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.13.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.13.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.13.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.13.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.13.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.13.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.13.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.13.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.13.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.13.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.13.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.14.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.14.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.14.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.14.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.14.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.14.self_attn.q_proj.bias", 
"module.model.visual_model.vision_model.encoder.layers.14.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.14.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.14.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.14.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.14.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.14.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.14.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.14.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.14.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.14.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.15.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.15.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.15.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.15.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.15.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.15.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.15.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.15.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.15.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.15.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.15.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.15.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.15.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.15.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.15.layer_norm2.weight", 
"module.model.visual_model.vision_model.encoder.layers.15.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.16.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.16.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.16.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.16.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.16.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.16.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.16.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.16.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.16.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.16.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.16.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.16.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.16.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.16.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.16.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.16.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.17.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.17.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.17.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.17.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.17.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.17.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.17.self_attn.out_proj.weight", 
"module.model.visual_model.vision_model.encoder.layers.17.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.17.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.17.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.17.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.17.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.17.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.17.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.17.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.17.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.18.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.18.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.18.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.18.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.18.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.18.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.18.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.18.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.18.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.18.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.18.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.18.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.18.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.18.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.18.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.18.layer_norm2.bias", 
"module.model.visual_model.vision_model.encoder.layers.19.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.19.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.19.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.19.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.19.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.19.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.19.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.19.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.19.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.19.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.19.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.19.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.19.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.19.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.19.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.19.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.20.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.20.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.20.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.20.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.20.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.20.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.20.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.20.self_attn.out_proj.bias", 
"module.model.visual_model.vision_model.encoder.layers.20.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.20.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.20.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.20.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.20.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.20.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.20.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.20.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.21.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.21.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.21.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.21.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.21.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.21.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.21.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.21.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.21.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.21.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.21.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.21.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.21.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.21.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.21.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.21.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.22.self_attn.k_proj.weight", 
"module.model.visual_model.vision_model.encoder.layers.22.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.22.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.22.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.22.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.22.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.22.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.22.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.22.layer_norm1.weight", "module.model.visual_model.vision_model.encoder.layers.22.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.22.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.22.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.22.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.22.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.22.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.22.layer_norm2.bias", "module.model.visual_model.vision_model.encoder.layers.23.self_attn.k_proj.weight", "module.model.visual_model.vision_model.encoder.layers.23.self_attn.k_proj.bias", "module.model.visual_model.vision_model.encoder.layers.23.self_attn.v_proj.weight", "module.model.visual_model.vision_model.encoder.layers.23.self_attn.v_proj.bias", "module.model.visual_model.vision_model.encoder.layers.23.self_attn.q_proj.weight", "module.model.visual_model.vision_model.encoder.layers.23.self_attn.q_proj.bias", "module.model.visual_model.vision_model.encoder.layers.23.self_attn.out_proj.weight", "module.model.visual_model.vision_model.encoder.layers.23.self_attn.out_proj.bias", "module.model.visual_model.vision_model.encoder.layers.23.layer_norm1.weight", 
"module.model.visual_model.vision_model.encoder.layers.23.layer_norm1.bias", "module.model.visual_model.vision_model.encoder.layers.23.mlp.fc1.weight", "module.model.visual_model.vision_model.encoder.layers.23.mlp.fc1.bias", "module.model.visual_model.vision_model.encoder.layers.23.mlp.fc2.weight", "module.model.visual_model.vision_model.encoder.layers.23.mlp.fc2.bias", "module.model.visual_model.vision_model.encoder.layers.23.layer_norm2.weight", "module.model.visual_model.vision_model.encoder.layers.23.layer_norm2.bias", "module.model.visual_model.vision_model.post_layernorm.weight", "module.model.visual_model.vision_model.post_layernorm.bias", "module.model.text_hidden_fcs.0.0.weight", "module.model.text_hidden_fcs.0.0.bias", "module.model.visual_embeddings.weight", "module.model.visual_embeddings.bias", "module.model.visual_fc.weight", "module.model.visual_fc.bias".
Unexpected key(s) in state_dict: "model.logit_scale", "model.text_hidden_fcs.0.0.bias", "model.text_hidden_fcs.0.0.weight", "model.visual_embeddings.bias", "model.visual_embeddings.weight", "model.visual_fc.bias", "model.visual_fc.weight", "ret_input_embeddings.weight".
import json
import torch
# Assumes checkpoint1 (the pruned pretrained checkpoint) and checkpoint2 (the full
# checkpoint saved after one epoch of training) have already been loaded.
# Map each key in the pruned checkpoint to its name in the full model.
weights = {'model.logit_scale': 'module.model.logit_scale',
           'model.text_hidden_fcs.0.0.bias': 'module.model.text_hidden_fcs.0.0.bias',
           'model.text_hidden_fcs.0.0.weight': 'module.model.text_hidden_fcs.0.0.weight',
           'model.visual_embeddings.bias': 'module.model.visual_embeddings.bias',
           'model.visual_embeddings.weight': 'module.model.visual_embeddings.weight',
           'model.visual_fc.bias': 'module.model.visual_fc.bias',
           'model.visual_fc.weight': 'module.model.visual_fc.weight',
           'ret_input_embeddings.weight': 'module.model.input_embeddings.weight'}
# Read the retrieval token index from the saved model arguments.
with open('.../fromage/runs/fromage_exp/model_args.json', 'r') as f:
    model_kwargs = json.load(f)
ret_token_idx = model_kwargs['retrieval_token_idx']
for k, v in weights.items():
    if k == 'ret_input_embeddings.weight':
        # The pruned checkpoint stores a single embedding row; write it back into
        # the retrieval token's row of the full input embedding matrix.
        checkpoint2['state_dict'][v][ret_token_idx:ret_token_idx+1, :] = checkpoint1['state_dict'][k]
    else:
        checkpoint2['state_dict'][v] = checkpoint1['state_dict'][k]
    print(k)
# Save the merged checkpoint.
torch.save(checkpoint2, '.../fromage/fromage_model/test_model/ckpt' + '.pth.tar')
'checkpoint1' is your pretrained_ckpt (the pruned model) and 'checkpoint2' is a model I trained for just one epoch. With this, there is no error anymore, but I want to know whether this is the right way to do it. Can you check this?
This looks correct to me; it's essentially inverting the pruning process.
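For anyone hitting the same missing/unexpected key mismatch, here is a minimal sketch of building that key mapping programmatically instead of typing it by hand. It assumes the only differences are the "module." prefix (presumably added by the distributed wrapper around the model) and the renamed retrieval embedding. `build_key_map` and its parameters are hypothetical names for illustration, not part of the FROMAGe codebase.

```python
def build_key_map(pruned_keys, prefix="module.", renames=None):
    """Map each pruned-checkpoint key to its assumed name in the full model.

    `renames` covers keys whose name differs beyond the prefix.
    """
    renames = renames or {}
    key_map = {}
    for key in pruned_keys:
        target = renames.get(key, key)
        # Add the wrapper prefix only if it is not already there.
        key_map[key] = target if target.startswith(prefix) else prefix + target
    return key_map


# The "unexpected" keys reported in the error message above.
pruned_keys = [
    "model.logit_scale",
    "model.text_hidden_fcs.0.0.bias",
    "model.text_hidden_fcs.0.0.weight",
    "model.visual_embeddings.bias",
    "model.visual_embeddings.weight",
    "model.visual_fc.bias",
    "model.visual_fc.weight",
    "ret_input_embeddings.weight",
]

# Assumption taken from the script in this thread: the retrieval embedding
# is one row of the full input embedding matrix.
renames = {"ret_input_embeddings.weight": "model.input_embeddings.weight"}

key_map = build_key_map(pruned_keys, renames=renames)
for src, dst in key_map.items():
    print(src, "->", dst)
```

The resulting `key_map` matches the hand-written `weights` dictionary above. The retrieval embedding still needs the row-wise copy at `ret_token_idx` rather than a plain assignment.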
Thank you for your kind and quick reply!
Hi, I'm trying to fine-tune FROMAGe on my own dataset. I followed the explanation in your README.md, as shown below, on an NVIDIA A100-SXM4-40GB (GCP).
But I got the error below.
Can you help me, please🥺?
P.S. I replaced the part of each file path before '/fromage' with '...' because it contains my personal information.