Hello.
When running RoseTTAFold-All-Atom using the following input YAML config file, I observed that the inference code returned a .pdb file that contains a predicted protein structure with the first 19 residues removed (as compared to the input FASTA sequence for this protein chain). Is there a reason why this particular sequence motif would be automatically removed by the code? For context, I noticed that the sequence alignment files (e.g., generated by HHblits) seem to show that the sequence was truncated before being passed downstream (e.g., to HHblits).

Config file input:
where /tmp/tmpvwt59__7/XYZ_A.fasta has the following file contents:

t000_.msa.a3m file contents:

Notably, in the output .pdb file (as attached), the first 19 residues (i.e., MLLLPLPLLLFLLCSRAEA) are not present in the resulting structure.

XYZ.pdb.txt
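For anyone trying to reproduce the comparison described above, here is a minimal sketch of how the truncation could be checked by comparing the input FASTA sequence against the query (first) record of the .a3m file. The helper names `read_first_sequence` and `missing_prefix` are hypothetical and not part of RoseTTAFold-All-Atom, and the sketch assumes the first record of the alignment file is the query sequence.

```python
from typing import Optional


def read_first_sequence(text: str) -> str:
    """Return the sequence of the first record in FASTA/A3M-formatted text."""
    seq_lines = []
    seen_header = False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if seen_header:
                break  # reached the second record; stop reading
            seen_header = True
        elif seen_header and line:
            seq_lines.append(line)
    return "".join(seq_lines)


def missing_prefix(full_seq: str, truncated_seq: str) -> Optional[str]:
    """Return the leading residues of full_seq that are absent from truncated_seq.

    Returns None when truncated_seq is not a suffix of full_seq, i.e. the
    difference is not a simple N-terminal truncation.
    """
    if not full_seq.endswith(truncated_seq):
        return None
    return full_seq[: len(full_seq) - len(truncated_seq)]


if __name__ == "__main__":
    # The tail "QVQLVESGG" is a made-up placeholder, not the real chain sequence;
    # only the 19-residue prefix comes from this report.
    fasta_text = ">XYZ_A\nMLLLPLPLLLFLLCSRAEAQVQLVESGG\n"
    a3m_text = ">query\nQVQLVESGG\n>hit_1\nQVQLVESGG\n"

    prefix = missing_prefix(
        read_first_sequence(fasta_text), read_first_sequence(a3m_text)
    )
    print(prefix, len(prefix) if prefix is not None else None)
```

With real files, the two strings would instead be read from the FASTA file referenced in the config and from the generated .a3m file; a non-empty result would indicate that residues were dropped before the alignment step.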