Open dhusmann opened 2 weeks ago
It would be ideal to extend the protein letter alphabet to cover the expanded one found here: https://github.com/pnnl/PTMPSI/blob/fa78b8b8b22ad8f2caae1386b0fe428007753bf2/ptmpsi-awsem/required_packages/biopython/Bio/Data/SCOPData.py#L199
Support for PTMs is in the works but not yet available. We have a curated mapping for non-standard residues that partially overlaps with the Biopython data.
It looks like the script doesn't work when run on predictions that contain modified residues. For example, running it on a prediction that contains several acetyl-lysine residues (three-letter code "ALY") gives this error:
Traceback (most recent call last):
  File "/home/groups/ogozani/programs/AlphaBridge/define_interfaces.py", line 150, in <module>
    main()
  File "/home/groups/ogozani/programs/AlphaBridge/define_interfaces.py", line 143, in main
    interface_df_per_token, interface_df = define_interfaces(in_dir, mode, contact_threshold)
  File "/home/groups/ogozani/programs/AlphaBridge/define_interfaces.py", line 80, in define_interfaces
    list_sequence_info, rec_sequence_list, structure_sequence_list, polymer_chain_dict = FEATURE_OBJECT.extract_sequence_info()
  File "/home/groups/ogozani/programs/AlphaBridge/src/module/confidance_contact_matrix.py", line 215, in extract_sequence_info
    structure_sequence_list = structure.get_sequence_list()
  File "/home/groups/ogozani/programs/AlphaBridge/src/module/parsers.py", line 262, in get_sequence_list
    seq_dict[asym_id] += upper_protein_letters_3to1[mon_id]
KeyError: 'ALY'
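For anyone hitting this before PTM support lands, one possible workaround is to replace the direct dictionary lookup with a tolerant one. The sketch below is not AlphaBridge's code: `STANDARD_3TO1`, `MODIFIED_TO_PARENT`, `three_to_one`, and `build_sequence` are hypothetical names, the modified-residue table is a small illustrative subset, and falling back to "X" for unknown codes is an assumption about what downstream code can tolerate.

```python
# Minimal sketch of a tolerant three-letter to one-letter lookup.
# All names here are hypothetical and not part of AlphaBridge.

# The 20 standard amino acids.
STANDARD_3TO1 = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

# Illustrative subset only: modified residue -> parent residue letter.
MODIFIED_TO_PARENT = {
    "ALY": "K",  # N6-acetyllysine
    "SEP": "S",  # phosphoserine
    "TPO": "T",  # phosphothreonine
    "PTR": "Y",  # phosphotyrosine
    "MSE": "M",  # selenomethionine
}


def three_to_one(mon_id, unknown="X"):
    """Return a one-letter code, mapping modified residues to their parent.

    Codes found in neither table yield `unknown` instead of raising KeyError.
    """
    code = mon_id.strip().upper()
    if code in STANDARD_3TO1:
        return STANDARD_3TO1[code]
    return MODIFIED_TO_PARENT.get(code, unknown)


def build_sequence(mon_ids):
    """Join per-residue one-letter codes into a sequence string."""
    return "".join(three_to_one(m) for m in mon_ids)


print(build_sequence(["MET", "ALY", "SEP", "XYZ"]))  # prints MKSX
```

Mapping a modified residue to its parent letter keeps the sequence length unchanged, but it discards the modification itself, so it is only a stopgap if the interface analysis needs to distinguish modified from unmodified residues.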