dauparas / ProteinMPNN

Code for the ProteinMPNN paper
MIT License

RuntimeError: Class values must be smaller than num_classes. | protein_mpnn_utils.py & mask_size issue? #99

Open emilyrkang opened 4 months ago

emilyrkang commented 4 months ago

I'm getting the following RuntimeError when running ProteinMPNN on a Windows machine with Python 3.7. My method works with the example 6 inputs, but when I use my own protein structure, 4rjj, it fails with: RuntimeError: Class values must be smaller than num_classes. I've tried using the biological assembly downloaded directly from the PDB, and I've tried removing the ligands and all non-"ATOM" lines from the structure, but I still get this error message.

The following command works, but I would like to model the protein as a homooligomer and fix some residues.

py protein_mpnn_run.py --path_to_model_weights "C:\ProteinMPNN\vanilla_model_weights" --pdb_path 4rjj.pdb --pdb_path_chains "A B C D" --out_folder "C:\ProteinMPNN\myoutputs\4rjj" --num_seq_per_target 20 --sampling_temp "0.1 0.2 0.3" --batch_size 1 --omit_AAs='XC'

Also, when I use the helper scripts, I need to remove the path ("C:\ProteinMPNN\my_input_PDBS\" in the output below) from the jsonl files; otherwise I get the following error message: OSError: [Errno 22] Invalid argument: 'my_outputs_directory//seqs/C:\ProteinMPNN\my_input_PDBS\4rjj.fa'
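A possible workaround for the path problem, sketched under assumptions: the helper scripts appear to split the input path on "/" only, which would leave a Windows path with backslashes inside the entry name. Assuming each line of parsed_pdbs.jsonl is a JSON object with a "name" field and each line of tied_pdbs.jsonl is a JSON object keyed by that same name (both layouts and all function names below are assumptions, not something confirmed in this thread), the names could be reduced to their last path component like this:

```python
import json
from pathlib import PureWindowsPath


def strip_windows_dir(name):
    """Return only the last path component, e.g. 'C:\\dir\\4rjj' -> '4rjj'."""
    # PureWindowsPath treats both '\\' and '/' as separators on any OS.
    return PureWindowsPath(name).name


def clean_parsed_line(line):
    """Rewrite the assumed "name" field of one parsed_pdbs.jsonl line."""
    record = json.loads(line)
    if "name" in record:
        record["name"] = strip_windows_dir(record["name"])
    return json.dumps(record)


def clean_keyed_line(line):
    """Rewrite the top-level keys of one line that is keyed by structure name
    (assumed layout of tied_pdbs.jsonl)."""
    record = json.loads(line)
    return json.dumps({strip_windows_dir(k): v for k, v in record.items()})


def clean_file(src, dst, clean_line):
    """Apply clean_line to every non-empty line of src and write the result to dst."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            if line.strip():
                fout.write(clean_line(line) + "\n")
```

For example, clean_file("parsed_pdbs.jsonl", "parsed_pdbs_clean.jsonl", clean_parsed_line) would write a copy in which every name has lost its directory prefix. This only automates the manual edit described above and does not address the RuntimeError.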

Here is my complete output:

C:\ProteinMPNN> py protein_mpnn_run.py --jsonl_path "C:\ProteinMPNN\outputs\example_6_outputs\parsed_pdbs.jsonl" --tied_positions_jsonl "C:\ProteinMPNN\outputs\example_6_outputs\tied_pdbs.jsonl" --path_to_model_weights "C:\ProteinMPNN\vanilla_model_weights" --out_folder "my_outputs_directory" --num_seq_per_target 4 --sampling_temp "0.1 0.2 0.3" --batch_size 1 --omit_AAs='XC'

chain_id_jsonl is NOT loaded

fixed_positions_jsonl is NOT loaded

pssm_jsonl is NOT loaded

omit_AA_jsonl is NOT loaded

bias_AA_jsonl is NOT loaded

bias by residue dictionary is not loaded, or not provided

discarded {'bad_chars': 0, 'too_long': 0, 'bad_seq_length': 0}

Number of edges: 48
Training noise level: 0.2A
Generating sequences for: 6EHB
12 sequences of length 960 generated in 75.8997 seconds
Generating sequences for: 4GYT
12 sequences of length 354 generated in 52.1981 seconds

C:\ProteinMPNN>py protein_mpnn_run.py --jsonl_path "C:\ProteinMPNN\myparsedfilesetc\parsed_pdbs.jsonl" --tied_positions_jsonl "C:\ProteinMPNN\myparsedfilesetc\tied_pdbs.jsonl" --path_to_model_weights "C:\ProteinMPNN\vanilla_model_weights" --out_folder "my_outputs_directory" --num_seq_per_target 4 --sampling_temp "0.1 0.2 0.3" --batch_size 1 --omit_AAs='XC'

chain_id_jsonl is NOT loaded

fixed_positions_jsonl is NOT loaded

pssm_jsonl is NOT loaded

omit_AA_jsonl is NOT loaded

bias_AA_jsonl is NOT loaded

bias by residue dictionary is not loaded, or not provided

discarded {'bad_chars': 0, 'too_long': 0, 'bad_seq_length': 0}

Number of edges: 48
Training noise level: 0.2A
Generating sequences for: 4rjj
Traceback (most recent call last):
  File "protein_mpnn_run.py", line 469, in <module>
    main(args)
  File "protein_mpnn_run.py", line 331, in main
    sample_dict = model.tied_sample(X, randn_2, S, chain_M, chain_encoding_all, residue_idx, mask=mask, temperature=temp, omit_AAs_np=omit_AAs_np, bias_AAs_np=bias_AAs_np, chain_M_pos=chain_M_pos, omit_AA_mask=omit_AA_mask, pssm_coef=pssm_coef, pssm_bias=pssm_bias, pssm_multi=args.pssm_multi, pssm_log_odds_flag=bool(args.pssm_log_odds_flag), pssm_log_odds_mask=pssm_log_odds_mask, pssm_bias_flag=bool(args.pssm_bias_flag), tied_pos=tied_pos_list_of_lists_list[0], tied_beta=tied_beta, bias_by_res=bias_by_res_all)
  File "C:\ProteinMPNN\protein_mpnn_utils.py", line 1218, in tied_sample
    permutation_matrix_reverse = torch.nn.functional.one_hot(decoding_order, num_classes=mask_size).float()
RuntimeError: Class values must be smaller than num_classes.

C:\ProteinMPNN> py protein_mpnn_run.py --jsonl_path "C:\ProteinMPNN\myparsedfilesetc\parsed_pdbs.jsonl" --tied_positions_jsonl "C:\ProteinMPNN\myparsedfilesetc\tied_pdbs.jsonl" --path_to_model_weights "C:\ProteinMPNN\vanilla_model_weights" --out_folder "my_outputs_directory" --num_seq_per_target 4 --sampling_temp "0.1 0.2 0.3" --batch_size 1 --omit_AAs='XC'

chain_id_jsonl is NOT loaded

fixed_positions_jsonl is NOT loaded

pssm_jsonl is NOT loaded

omit_AA_jsonl is NOT loaded

bias_AA_jsonl is NOT loaded

bias by residue dictionary is not loaded, or not provided

discarded {'bad_chars': 0, 'too_long': 0, 'bad_seq_length': 0}

Number of edges: 48
Training noise level: 0.2A
Generating sequences for: C:\ProteinMPNN\my_input_PDBS\4rjj
Traceback (most recent call last):
  File "protein_mpnn_run.py", line 469, in <module>
    main(args)
  File "protein_mpnn_run.py", line 323, in main
    with open(ali_file, 'w') as f:
OSError: [Errno 22] Invalid argument: 'my_outputs_directory//seqs/C:\ProteinMPNN\my_input_PDBS\4rjj.fa'

C:\ProteinMPNN> py protein_mpnn_run.py --jsonl_path "C:\ProteinMPNN\myparsedfilesetc\parsed_pdbs.jsonl" --tied_positions_jsonl "C:\ProteinMPNN\myparsedfilesetc\tied_pdbs.jsonl" --path_to_model_weights "C:\ProteinMPNN\vanilla_model_weights" --out_folder "my_outputs_directory" --num_seq_per_target 4 --sampling_temp "0.1 0.2 0.3" --batch_size 1 --omit_AAs='XC'

chain_id_jsonl is NOT loaded

fixed_positions_jsonl is NOT loaded

pssm_jsonl is NOT loaded

omit_AA_jsonl is NOT loaded

bias_AA_jsonl is NOT loaded

bias by residue dictionary is not loaded, or not provided

discarded {'bad_chars': 0, 'too_long': 0, 'bad_seq_length': 0}

Number of edges: 48
Training noise level: 0.2A
Generating sequences for: 4rjj
Traceback (most recent call last):
  File "protein_mpnn_run.py", line 469, in <module>
    main(args)
  File "protein_mpnn_run.py", line 331, in main
    sample_dict = model.tied_sample(X, randn_2, S, chain_M, chain_encoding_all, residue_idx, mask=mask, temperature=temp, omit_AAs_np=omit_AAs_np, bias_AAs_np=bias_AAs_np, chain_M_pos=chain_M_pos, omit_AA_mask=omit_AA_mask, pssm_coef=pssm_coef, pssm_bias=pssm_bias, pssm_multi=args.pssm_multi, pssm_log_odds_flag=bool(args.pssm_log_odds_flag), pssm_log_odds_mask=pssm_log_odds_mask, pssm_bias_flag=bool(args.pssm_bias_flag), tied_pos=tied_pos_list_of_lists_list[0], tied_beta=tied_beta, bias_by_res=bias_by_res_all)
  File "C:\ProteinMPNN\protein_mpnn_utils.py", line 1218, in tied_sample
    permutation_matrix_reverse = torch.nn.functional.one_hot(decoding_order, num_classes=mask_size).float()
RuntimeError: Class values must be smaller than num_classes.
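
A quick check for one hypothesis about the RuntimeError: torch.nn.functional.one_hot raises this message when an index is greater than or equal to num_classes, so the tied positions may point at residues beyond the end of a chain (for example, if the chains of 4rjj contain different numbers of modelled residues). The sketch below assumes that each parsed_pdbs.jsonl record stores chain sequences under "seq_chain_<ID>" keys and that tied_pdbs.jsonl maps each structure name to a list of {chain: [1-based positions]} groups. Both layouts and the function name are assumptions and should be checked against the actual files.

```python
def find_out_of_range_ties(parsed_record, tied_groups):
    """Return (group_index, chain, position, chain_length) for every tied
    position that does not fit inside its chain (positions assumed 1-based)."""
    problems = []
    for group_index, group in enumerate(tied_groups):
        for chain, positions in group.items():
            # Assumed key layout: "seq_chain_A", "seq_chain_B", ...
            seq = parsed_record.get("seq_chain_" + chain)
            length = len(seq) if seq is not None else 0
            for pos in positions:
                if not 1 <= pos <= length:
                    problems.append((group_index, chain, pos, length))
    return problems


# Toy data (not taken from 4rjj): chain B is one residue shorter than chain A.
toy_parsed = {"name": "toy", "seq_chain_A": "ACDE", "seq_chain_B": "ACD"}
toy_tied = [{"A": [1], "B": [1]}, {"A": [4], "B": [4]}]
print(find_out_of_range_ties(toy_parsed, toy_tied))  # [(1, 'B', 4, 3)]
```

An empty list for the real 4rjj files would suggest the cause lies elsewhere, for instance in how decoding_order or mask_size is built inside tied_sample.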