Open benmwebb opened 3 years ago
I suggest using X instead of (UNK) so that the sequence is a string of one-letter codes as defined in the dictionary.
Works for me - so, the canonical sequence. Can this be stated in the docs then? That should reduce the possibility of people producing files with (UNK) and friends instead.
ma_alignment.sequence is described as "The target / template sequence in the multiple sequence alignment". But what should this look like if the target or template contains non-standard residues such as UNK? For example we have a model built using 4buj chain E as the template which contains a number of UNK residues. Should ma_alignment.sequence here contain X (to match entity_poly.pdbx_seq_one_letter_code_can in 4buj.cif) or (UNK) (as in entity_poly.pdbx_seq_one_letter_code)? The latter seems more flexible but would require reader software to be a little more intelligent (since it can't assume one character = one alignment position). But since the sequence is already uniquely defined elsewhere it seems like it doesn't matter either way, just as long as it is defined.
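To make the "a little more intelligent" point concrete, here is a rough sketch of what a reader might need if (UNK)-style codes were allowed in the sequence. It is not taken from any existing library: the names tokenize, to_canonical and PARENT_ONE_LETTER are hypothetical, and the tiny parent mapping is only an illustrative assumption (a real reader would presumably look parents up in the chemical component dictionary).

```python
import re

# One token is either a parenthesized residue code such as "(UNK)"
# or a single non-whitespace character (a one-letter code or a gap like "-").
TOKEN_RE = re.compile(r"\(([^()]+)\)|(\S)")

# Illustrative assumption only: map a few non-standard residues to a
# one-letter parent. Anything not listed here falls back to "X".
PARENT_ONE_LETTER = {"MSE": "M"}


def tokenize(seq):
    """Return one token per alignment position, skipping whitespace."""
    return [m.group(1) or m.group(2) for m in TOKEN_RE.finditer(seq)]


def to_canonical(seq):
    """Collapse parenthesized codes so that one character = one position."""
    return "".join(
        tok if len(tok) == 1 else PARENT_ONE_LETTER.get(tok, "X")
        for tok in tokenize(seq)
    )


if __name__ == "__main__":
    raw = "AC(UNK)-G(MSE)"
    print(tokenize(raw))      # one entry per alignment position
    print(to_canonical(raw))  # string of one-letter codes
```

With the canonical form, len(sequence) is already the number of alignment positions and none of this is needed, which is the main argument for using X.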