Closed Lyang556 closed 1 year ago
You will have to add FIXED labels to the pdbs before you collect them into a silent file (you can collect these after you have made the silent file but it requires a bit more Rosetta code so I won't go into that here).
You will want to iterate through the pairs of .pdb and .trb files, extract the boolean mask of designed residues, and run a loop like this to add residue labels to the .pdb file:
with open('my.pdb.file', 'a') as f:
    for resi in range(len(boolean_mask)):
        f.write('REMARK PDBinfo-LABEL:%5s FIXED\n'%(resi+1)) # NOTE, these labels must be 1-indexed since they are meant to work with Rosetta which is 1-indexed
@nrbennet Thank you for your prompt reply.
I have written a Python script based on your suggestion. Could you check whether it is consistent with what you described?
Here is my script:
import sys
import os
import glob
import numpy as np

seeds = glob.glob('diffusion/*.pdb')
for seed in seeds:
    trb = np.load(seed.replace('.pdb', '.trb'), allow_pickle=True)
    sample_masks = trb['mask_1d']
    sample_masks = np.array(sample_masks)
    fixed_positions = np.where(sample_masks == True)[0]
    with open(seed, 'a') as f:
        for resi in fixed_positions:
            # NOTE, these labels must be 1-indexed since they are meant to work with Rosetta which is 1-indexed
            f.write('REMARK PDBinfo-LABEL:%5s FIXED\n' % (resi + 1))
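Because the script above opens each PDB in append mode, running it twice would write every label twice. The following is a minimal sketch of one way to guard against that, not part of dl_binder_design; the helper names `fixed_label_lines` and `append_fixed_labels` are hypothetical, and the mask is assumed to be any sequence of booleans (a NumPy array such as `trb['mask_1d']` should work as well) in which True marks a residue to keep fixed.

```python
from pathlib import Path

# Same label format as in the thread: residue number right-aligned in 5 columns.
LABEL_FMT = "REMARK PDBinfo-LABEL:%5s FIXED\n"


def fixed_label_lines(mask):
    """Return one REMARK line per True entry in mask (1-indexed for Rosetta)."""
    return [LABEL_FMT % (i + 1) for i, keep in enumerate(mask) if keep]


def append_fixed_labels(pdb_path, mask):
    """Append FIXED labels to pdb_path, skipping labels already in the file.

    Returns the number of lines actually written.
    """
    path = Path(pdb_path)
    text = path.read_text()
    existing = set(text.splitlines())
    new_lines = [ln for ln in fixed_label_lines(mask)
                 if ln.rstrip("\n") not in existing]
    with path.open("a") as f:
        # Make sure the first label starts on its own line.
        if text and not text.endswith("\n"):
            f.write("\n")
        f.writelines(new_lines)
    return len(new_lines)
```

Calling `append_fixed_labels(seed, sample_masks)` inside the loop would replace the inner `with open(...)` block; a second run then writes nothing new.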
I found that the generated sequences do not keep the residues we want to fix. Maybe '-fix_FIXED_res' isn't being passed to MPNN.
There was a bug in the MPNN script where it was not correctly reading the FIXED labels. This is fixed in today's PR. I have also added a helper script which allows for simple parsing of RFdiffusion outputs into FIXED residue labels.
Thank you for taking the time to fix this bug. I have tried the new script. My command is:
python3 mpnn_fr/dl_interface_design.py -silent r1.silent \
-output_intermediates \
-checkpoint_path mpnn_fr/ProteinMPNN/vanilla_model_weights/v_48_020.pt \
-omit_AAs 'CX' -fix_FIXED_res
and I get an error like this:
Traceback (most recent call last):
  File "dl_binder_design/mpnn_fr/dl_interface_design.py", line 259, in <module>
    main( pdb, silent_structure, mpnn_model, sfd_in, sfd_out )
  File "dl_binder_design/mpnn_fr/dl_interface_design.py", line 202, in main
    dl_design( pose, pdb, silent_structure, mpnn_model, sfd_out )
  File "dl_binder_design/mpnn_fr/dl_interface_design.py", line 184, in dl_design
    seqs_scores = sequence_optimize( pdbfile, chains, mpnn_model, fixed_positions_dict )
  File "dl_binder_design/mpnn_fr/dl_interface_design.py", line 103, in sequence_optimize
    sequences = mpnn_util.generate_sequences( model, device, feature_dict, arg_dict, masked_chains, visible_chains, fixed_positions_dict )
  File "dl_binder_design/mpnn_fr/util_protein_mpnn.py", line 267, in generate_sequences
    X, S, mask, lengths, chain_M, chain_encoding_all, chain_list_list, visible_list_list, masked_list_list, masked_chain_length_list_list, chain_M_pos, omit_AA_mask, residue_idx, dihedral_mask, tied_pos_list_of_lists_list, pssm_coef, pssm_bias, pssm_log_odds_all, bias_by_res_all, tied_beta= tied_featurize(
  File "dl_binder_design/mpnn_fr/ProteinMPNN/protein_mpnn_utils.py", line 310, in tied_featurize
    fixed_position_mask[np.array(fixed_pos_list)-1] = 0.0
TypeError: unsupported operand type(s) for -: 'dict' and 'int'
Is this a new bug?
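The last frame of the traceback subtracts 1 from `fixed_pos_list`, so the message suggests that a dict arrived where a list of 1-indexed positions was expected. The sketch below only illustrates that failure mode and a defensive check; it is not the ProteinMPNN code. The helper `zero_based_positions` is hypothetical, and the `{chain: [positions]}` layout is an assumption made for the example.

```python
def reproduce_error():
    """Subtracting an int from a dict raises the TypeError seen in the traceback."""
    fixed_pos = {"A": [3, 7]}  # a dict where a list of ints was expected
    try:
        fixed_pos - 1
    except TypeError as err:
        return str(err)
    return None


def zero_based_positions(fixed_positions, chain):
    """Return 0-indexed positions for one chain.

    fixed_positions is assumed to map a chain ID to a list of 1-indexed
    residue numbers (hypothetical layout for this example).
    """
    positions = fixed_positions.get(chain, [])
    if isinstance(positions, dict):
        # One level of nesting too many: fail with a clearer message.
        raise TypeError(
            f"expected a list of positions for chain {chain!r}, got a dict"
        )
    return [int(p) - 1 for p in positions]
```

Checking the type of the value before the subtraction makes it easier to see which level of nesting is off when the dictionary is built by a calling script.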
I also found another bug.
In dl_interface_design_multi_seq.py, line 94, should it be
sequences = mpnn_util.generate_sequences( model, device, feature_dict, arg_dict, masked_chains, visible_chains, fixed_positions_dict )
?
The latest PR fixes these. Thanks for pointing this out.
Hi Nate,
As far as I know, in the 2-chain hallucination results the positions of the residues we want to fix are not the same across the generated PDB files; they change from design to design, and the exact positions are recorded in the .trb files. How should we pass these positions to '-fix_FIXED_res'? We usually design many proteins and convert the results into a silent file, which contains no information about these fixed residues.
Sincerely,
Lei yang
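As described at the top of the thread, the labels are appended to each PDB from its own .trb before the silent file is built, so each design can carry its own set of positions. One way to confirm this before conversion is to read the labels back from every PDB. The sketch below is a hypothetical checker, not part of dl_binder_design; `labelled_residues` is an illustrative name and it only assumes the `REMARK PDBinfo-LABEL:` line format shown earlier.

```python
import re

# Matches lines such as "REMARK PDBinfo-LABEL:   12 FIXED".
LABEL_RE = re.compile(r"^REMARK PDBinfo-LABEL:\s*(\d+)\s+(.*)$")


def labelled_residues(pdb_text, label="FIXED"):
    """Return the sorted 1-indexed residue numbers that carry the given label."""
    found = []
    for line in pdb_text.splitlines():
        match = LABEL_RE.match(line)
        if match and label in match.group(2).split():
            found.append(int(match.group(1)))
    return sorted(found)
```

Running this over each PDB and comparing the result with the True positions (plus one) of that design's mask would show whether every design received the labels that belong to it.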