Closed tkosciol closed 7 years ago
Wait until I generate sample result files
Sample results are on Barnacle in /projects/microprot/benchmarking/pdb_search.
Current way to proceed:
1) Read the first result from the .out file and check whether it is below the E-value threshold (default=0.001).
2) Find the residue range covered by the hit.
3) Go down the list as long as results are below the E-value threshold. If a hit does not overlap with the first result, add it to the results list.
Return: 1) a list of PDB IDs and their corresponding query coverage; 2) all sequences (n >= 40 residues) not covered by the PDB.
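A minimal sketch of the selection steps above, not the actual parser. It assumes the hits have already been read from the .out file into records sorted by ascending E-value, with 1-based inclusive query positions. The names `Hit`, `select_hits` and `uncovered_regions` are hypothetical. It also assumes a new hit must not overlap any hit accepted so far, which is slightly stricter than checking against the first result only.

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    """One parsed search hit (hypothetical record, not a pHMMer type)."""
    pdb_id: str    # e.g. "1abcA"
    evalue: float
    start: int     # first query residue covered, 1-based inclusive
    end: int       # last query residue covered, 1-based inclusive


def select_hits(hits: List[Hit], max_evalue: float = 0.001) -> List[Hit]:
    """Keep hits below the E-value threshold that do not overlap earlier ones.

    Assumes `hits` is sorted by ascending E-value, so we can stop at the
    first hit that is not below the threshold.
    """
    accepted: List[Hit] = []
    for hit in hits:
        if hit.evalue >= max_evalue:
            break
        # Two ranges are disjoint if one ends before the other starts.
        if all(hit.end < a.start or hit.start > a.end for a in accepted):
            accepted.append(hit)
    return accepted


def uncovered_regions(sequence: str, accepted: List[Hit],
                      min_len: int = 40) -> List[Tuple[int, int, str]]:
    """Return (start, end, subsequence) for gaps of at least `min_len` residues."""
    regions = []
    pos = 1  # next query residue not yet covered by an accepted hit
    for hit in sorted(accepted, key=lambda h: h.start):
        if hit.start - pos >= min_len:
            regions.append((pos, hit.start - 1, sequence[pos - 1:hit.start - 1]))
        pos = hit.end + 1
    # Tail of the query after the last accepted hit.
    if len(sequence) - pos + 1 >= min_len:
        regions.append((pos, len(sequence), sequence[pos - 1:]))
    return regions
```

With `accepted = select_hits(hits)`, the first return value would be `[(h.pdb_id, h.start, h.end) for h in accepted]` and the second `uncovered_regions(query_sequence, accepted)`.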
Hey Tomasz, what do you mean by "query coverage"? A float, i.e. the percentage of the query sequence length covered by a hit, or the matching sub-sequence between hit and query?
The matching sub-sequence between hit and query, e.g. (1abcA, 1-100), meaning structure 1abcA matches our query between residues 1 and 100.
please see PR #19
@sjanssen2 we can close this issue, right?
jepp. Done :-)
write parser for pHMMer to identify fragments of input sequence matching PDB.