Closed by chanqian18 8 months ago
Attached is a result where alignment length exceeds both, which I forgot to attach in the original post.
We produce local alignments; alnlen is the total length of the alignment, including gap columns for insertions or deletions. Coverage is the fraction of residues covered by the alignment in either the query (qcov) or the target (tcov).
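To make the relationship between alnlen and coverage concrete, here is a minimal Python sketch. It is not Foldseek's code: the function name `alignment_stats` and the toy sequences are made up for illustration, and it assumes coverage counts the residues of each sequence that fall inside the aligned region, divided by the full sequence length.

```python
def alignment_stats(q_aln, t_aln, q_len, t_len):
    """Return (alnlen, qcov, tcov) for one gapped local alignment.

    q_aln and t_aln are the two alignment rows (equal length, '-' marks a gap).
    q_len and t_len are the full lengths of the query and target sequences.
    Hypothetical helper for illustration only.
    """
    if len(q_aln) != len(t_aln):
        raise ValueError("alignment rows must have the same length")
    alnlen = len(q_aln)                         # every column counts, gaps included
    q_residues = sum(c != "-" for c in q_aln)   # query residues inside the alignment
    t_residues = sum(c != "-" for c in t_aln)   # target residues inside the alignment
    return alnlen, q_residues / q_len, t_residues / t_len


# Toy example: a 6-residue target aligned with a 3-column gap against the query.
alnlen, qcov, tcov = alignment_stats("ACDEFGHIK", "AC---GHIK", q_len=20, t_len=6)
print(alnlen, qcov, tcov)
```

In this toy case the alignment has 9 columns although the target has only 6 residues, so alnlen can exceed the target length whenever the target row contains gaps.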
I'm working with Christine Orengo and we're trying to run some scans of AlphaFold domains against PDB chains. I am a bit confused about which coverage settings/parameters (--cov-mode?) to use with Foldseek-TMalign in order to get only hits that cover at least 60% of the domain I use as a query. qcov, as in the original post, doesn't seem appropriate to me (the target is 50 residues and the query is 249 residues, but qcov is 84.7%).
Sorry if I misunderstood any of the documentation. Thank you for your time!
@chanqian18 to annotate the AlphaFold domains with CATH domains, I would recommend using:
foldseek search afdb cath afdb_cath_aln tmp --max-seqs 10000 --cov-mode 1 -c 0.6
Thank you for your quick replies!
For now, we are not trying to match AlphaFold predicted domains to CATH domains, but rather to PDB domains. We would like to avoid matching small regions in PDB to parts of the query (where the target is much shorter than the query), and would like at least 60% of the query to be covered by the alignment with a target. So, would --cov-mode 2 -c 0.6 be appropriate?
Yes, --cov-mode 2 -c 0.6 is right. Did this work for you?
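As a rough illustration of what the two settings discussed here select, the sketch below models coverage modes 0, 1 and 2 as described in the tool's help text (0: query and target, 1: target, 2: query). The function `passes_coverage` is hypothetical, not part of Foldseek, and treating the threshold as inclusive is an assumption.

```python
def passes_coverage(cov_mode, c, qcov, tcov):
    """Return True if a hit satisfies the coverage threshold c.

    Models only modes 0-2; inclusive comparison is an assumption.
    """
    if cov_mode == 0:
        return qcov >= c and tcov >= c   # both query and target must be covered
    if cov_mode == 1:
        return tcov >= c                 # only target coverage is checked
    if cov_mode == 2:
        return qcov >= c                 # only query coverage is checked
    raise ValueError("only cov_mode 0, 1 and 2 are modelled in this sketch")


# Toy hit: a short target fully aligned to a small part of a long query.
hit = {"qcov": 0.2, "tcov": 1.0}
print(passes_coverage(1, 0.6, **hit))  # target-based filter
print(passes_coverage(2, 0.6, **hit))  # query-based filter
```

With these toy values the hit passes the target-based filter but is rejected by the query-based one, which is the behaviour wanted when short targets should not match a small part of a long query.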
In Foldseek outputs, qcov and tcov are the aligned parts of the query and target, respectively, divided by the length of the corresponding sequence. For my use case, it would be useful to have the overlap of the target and query, as I would like to keep only results where a certain fraction (0.6) of the query is covered by the alignment. In some cases, the alignment length appears to be longer than the target length (image 1), making the filter qcov>0.6 miss these results. These alignments give a high alntmscore, but only align to small regions (perhaps one helix, image 2). Is this expected behaviour in Foldseek? How is the alignment length calculated?
Your Environment
Foldseek parameters: foldseek search temp -s 9 --alignment-type 1 -a
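For post-hoc filtering of a results table, a small Python sketch along these lines could be used. It assumes a tab-separated file whose columns are, in order, query, target, qlen, tlen, alnlen, qcov, tcov, alntmscore; that column order is an assumption and must match whatever output format was actually requested. The function `filter_hits` and the two example rows are invented for illustration and are not results from this thread.

```python
import csv
import io

# Assumed column order of the results table (adjust to the real output format).
COLUMNS = ["query", "target", "qlen", "tlen", "alnlen", "qcov", "tcov", "alntmscore"]


def filter_hits(tsv_text, min_qcov=0.6):
    """Keep hits whose query coverage is at least min_qcov.

    Each kept hit also records whether alnlen exceeds the target length,
    which happens when the alignment contains gap columns.
    """
    kept = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        hit = dict(zip(COLUMNS, row))
        if float(hit["qcov"]) >= min_qcov:
            hit["alnlen_exceeds_tlen"] = int(hit["alnlen"]) > int(hit["tlen"])
            kept.append(hit)
    return kept


# Made-up rows: one short target, one target of similar size to the query.
example = (
    "dom1\tpdbA\t249\t50\t60\t0.200\t1.000\t0.71\n"
    "dom1\tpdbB\t249\t230\t215\t0.847\t0.900\t0.65\n"
)
kept = filter_hits(example)
print([h["target"] for h in kept])
```

The extra `alnlen_exceeds_tlen` flag is only a convenience for spotting gapped alignments against short targets; it does not change which hits are kept.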