BSharmi opened this issue 3 months ago
Thanks again for reviewing our code. It is great news to know that people in the community are interested in SurfVQMAE and attempting to reproduce the results.
Regarding your question, can you please give `infer.py` a second try? `infer_new_pdb.py` is an older inference script, and I have since deleted it.
If you still encounter errors, please do not hesitate to let me know!
The major difference between `infer.py` and `infer_new_pdb.py` is that I set the `strict` option to `False`:
model.load_state_dict(ckpt['model'], strict=False)
This is because when training the VAE, I built several additional blocks, including `tokenizer.mlp`, `decoder`, and `hbond_mlp`, for predicting unsupervised features. When transferring to downstream tasks like epitope prediction, those blocks are not required, and a new `classifier` is needed instead. So there is a mismatch when loading the pretrained model weights. Hope this explanation helps.
Thank you! I realized right after sending my message that I should have tried that first :) but I still get a shape mismatch error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ubuntu/miniconda3/envs/SurfVQMAE/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1483, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for SurfaceTransformerV2:
size mismatch for surface_encoder.conv.layers.0.net_in.0.weight: copying a param with shape torch.Size([16, 26]) from checkpoint, the shape in current model is torch.Size([16, 16]).
size mismatch for surface_encoder.conv.linear_transform.0.weight: copying a param with shape torch.Size([16, 26]) from checkpoint, the shape in current model is torch.Size([16, 16]).
I have uploaded the model weights for both pretraining and fine-tuning in the `weight/` folder. Can you please try them again?
I encountered no trouble loading these parameters on my computer. Perhaps I had provided a model weight from a previous version. Sorry for the inconvenience.
Thank you so much for addressing both issues. I still cannot load the model; maybe I am doing something wrong. Here is how I am trying to load it:
from src.models import get_model
import torch
device = "cuda:1"
ckpt = torch.load("/efs/home/sharmiba/SurfVQMAE_V2/VQMAE/weight/light_pretrain.pt", device)
cfg = ckpt['config']
model = get_model(cfg.model).to(device)
model.load_state_dict(ckpt['model'], strict=False)
I get the error
RuntimeError: Error(s) in loading state_dict for SurfaceTransformerV2:
size mismatch for surface_encoder.conv.layers.0.net_in.0.weight: copying a param with shape torch.Size([16, 26]) from checkpoint, the shape in current model is torch.Size([16, 16]).
size mismatch for surface_encoder.conv.linear_transform.0.weight: copying a param with shape torch.Size([16, 26]) from checkpoint, the shape in current model is torch.Size([16, 16]).
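One detail worth knowing here: `strict=False` only tolerates missing or unexpected key *names*; a key present in both the checkpoint and the model with different *shapes* still raises, which is exactly the error above. A small diagnostic helper (hypothetical, not part of the VQMAE repo) can list which tensors disagree before attempting the load:

```python
import torch
import torch.nn as nn

def find_shape_mismatches(model, state_dict):
    """List keys present in both model and checkpoint whose shapes differ."""
    own = model.state_dict()
    return [(k, tuple(state_dict[k].shape), tuple(own[k].shape))
            for k in state_dict if k in own and state_dict[k].shape != own[k].shape]

# Toy demo of the situation in the traceback: a checkpoint layer with input
# width 26 versus a current model expecting input width 16.
model = nn.Sequential(nn.Linear(16, 16))
ckpt = nn.Sequential(nn.Linear(26, 16)).state_dict()
print(find_shape_mismatches(model, ckpt))
# [('0.weight', (16, 26), (16, 16))]
```

Running this against `ckpt['model']` and the instantiated model would pinpoint every mismatched tensor at once instead of failing on the first.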
I also tried the fine-tuned model; the only difference is that I need to change `cfg = ckpt['config']` to `cfg = ckpt['cfg']`. The code is copied below:
from src.models import get_model
import torch
device = "cuda:1"
ckpt = torch.load("/efs/home/sharmiba/SurfVQMAE_V2/VQMAE/weight/light_finetune.pt", device)
cfg = ckpt['cfg']
model = get_model(cfg.model).to(device)
model.load_state_dict(ckpt['model'], strict=False)
and get the same error
RuntimeError: Error(s) in loading state_dict for SurfaceTransformerV2:
size mismatch for surface_encoder.conv.layers.0.net_in.0.weight: copying a param with shape torch.Size([16, 26]) from checkpoint, the shape in current model is torch.Size([16, 16]).
size mismatch for surface_encoder.conv.linear_transform.0.weight: copying a param with shape torch.Size([16, 26]) from checkpoint, the shape in current model is torch.Size([16, 16]).
Did I do something wrong? I have updated the GitHub repository and downloaded the new weights, but I am still not able to load the model :(
Sorry for bugging you again; I just want to predict the tokens given a structure. Is there an easy way to do that?
Thank you very much
Sorry for the late reply! I have double-checked the checkpoints and encountered the same problem.
Let me give you a comprehensive overview of those checkpoints:
I used the `vae_2024_02_01__19_03_24` checkpoint for fine-tuning. However, as you discovered, layers in `dMaSIFConv_seg` are mismatched.
To understand the problem: the mismatch comes from the `net_in` module in the `geometry.py` script.
For context, I directly used dMaSIF (https://github.com/FreyrS/dMaSIF/blob/0dcc26c3c218a39d5fe26beb2e788b95fb028896/benchmark_models.py#L233) as the surface point cloud encoder and did not change its internal modules. Therefore, ideally, the size of `surface_encoder.conv.layers.0.net_in.0.weight` ought to be `(res_dim, hidden_dim)`, namely `(16, 16)` in this case.
I believe that in the previous SurfFormer v2, I modified the architecture of dMaSIF for pretraining but later reverted the modification. However, as that was half a year ago, I have forgotten what the original version looked like (a lesson that I need to practice better version control).
This is completely my fault, and I really understand your need to predict the tokens given a structure. As a partial resolution, I have provided the latest fine-tuned model weight for surface-based epitope prediction (see `checkpoints/light_finetune_new.pt`), which has a consistent weight shape of `(16, 16)`. I hope this meets your needs.
If you have an urgent demand for the pretrained model weight, please let me know.
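As a general aside for anyone hitting the same error with a stale checkpoint: a common workaround is to drop the shape-mismatched entries before loading, so the compatible parameters still transfer. The helper below is a hypothetical sketch, not part of the VQMAE repo, and the skipped layers keep their random initialization, so this only makes sense if fine-tuning follows (the clean fix remains a consistent checkpoint, as provided above):

```python
import torch
import torch.nn as nn

def load_compatible(model, state_dict):
    """Load only parameters whose name AND shape match the current model."""
    own = model.state_dict()
    compatible = {k: v for k, v in state_dict.items()
                  if k in own and v.shape == own[k].shape}
    # strict=False tolerates the keys we deliberately left out.
    model.load_state_dict(compatible, strict=False)
    return sorted(set(state_dict) - set(compatible))  # keys that were skipped

# Toy demo: the weight has a stale input width (26 vs 16), the bias matches.
model = nn.Sequential(nn.Linear(16, 16))
ckpt = nn.Sequential(nn.Linear(26, 16)).state_dict()
skipped = load_compatible(model, ckpt)
print(skipped)  # the mismatched '0.weight' is skipped; '0.bias' is loaded
```

The skipped-key list should be logged or inspected so silent partial loads do not go unnoticed.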
Thank you, that makes sense now. I am able to load the model!
I am not in a rush, but it would be great if you could provide the pretrained model weight at some point, the same one you used in the paper, for reproducibility.
Thank you very much, Sharmi
Hello!
Great work!
I tried to load the lightweight model following
https://github.com/smiles724/VQMAE/blob/master/infer_new_pdb.py#L62C5-L62C47
using the code and get the following error:
Do I need to change something in the code to load the model?
Thank you!