tkosciol closed this issue 7 years ago
please commit changes to the split_search branch
@sjanssen2 is it just as easy as changing from:
    subseqs_neg = report_uncovered_subsequences(subseqs_pos, str(p),
                                                min_subseq_len=0)
to:
    subseqs_neg = report_uncovered_subsequences(subseqs_pos, str(p),
                                                min_subseq_len=min_fragment_length)
🤔
yes it is that easy :-)
split_sequence takes probability, E-value and fragment length as criteria for grouping sub-sequences into "domains". I'd like fragment length to be a hard requirement, i.e. if a sub-sequence is shorter than requested, it's not reported as non_match. The other parameters (probability or E-value) help us decide whether a sub-sequence fits our requirements for a domain (e.g. is a PDB domain) or not, whereas fragment length defines the "minimal interesting fragment". If a sequence is shorter than, say, 10 residues, we're not interested in modelling it at all. Hope that makes sense.
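To illustrate the hard length cutoff, here is a minimal sketch. It is not the project's report_uncovered_subsequences; the function name uncovered_fragments, its arguments, and the 0-based, end-exclusive interval convention are all assumptions made for this example only.

```python
def uncovered_fragments(covered, seq_len, min_len=0):
    """Return gaps of a sequence that no interval in `covered` spans.

    Hypothetical helper for illustration only.
    covered: list of (start, end) tuples, 0-based, end-exclusive (assumed).
    seq_len: total length of the sequence.
    min_len: gaps shorter than this are dropped (hard requirement).
    """
    gaps = []
    pos = 0  # first position not yet known to be covered
    for start, end in sorted(covered):
        if start > pos:
            gaps.append((pos, start))
        pos = max(pos, end)  # also handles overlapping intervals
    if pos < seq_len:
        gaps.append((pos, seq_len))
    # Apply the length cutoff: short gaps are not reported at all.
    return [(s, e) for s, e in gaps if e - s >= min_len]


covered = [(5, 20), (25, 60)]
all_gaps = uncovered_fragments(covered, 100, min_len=0)
long_gaps = uncovered_fragments(covered, 100, min_len=10)
```

With min_len=0 every uncovered stretch is returned, including the two 5-residue gaps. With min_len=10 only the 40-residue tail remains, which matches the idea that fragments below the minimal interesting length are never reported as non_match.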