steineggerlab / foldseek

Foldseek enables fast and sensitive comparisons of large structure sets.
https://foldseek.com
GNU General Public License v3.0

Can foldseek generate the html output with only top ten hits? #165

Open azureycy opened 11 months ago

azureycy commented 11 months ago

Hi,

Is there a command-line argument that limits the html/tabular output of foldseek search to the top ten hits? With the command below I already set the e-value and the tmscore-threshold, but some of the html outputs are still very long.

foldseek easy-search input.pdb esmdb out.html tmp -e 1e-5 --alignment-type 1 --tmscore-threshold 0.45 --format-mode 3 --threads 20

Thank you!

tamimmurad commented 11 months ago

Hi @azureycy

Also, it is not clear which TM-score the tmscore threshold refers to. I had hoped to use it to limit the output. For example, I used the following command:

foldseek easy-search ma-asfv-asfvg-156.cif PDB_downloaded/PDB viral_nonviral_results/aln_pdb_nonv_0.3 tmp --alignment-type 1 --tmscore-threshold 0.3 --format-output 'query,target,evalue,qtmscore,ttmscore,alntmscore,'

I got the results below:

ma-asfv-asfvg-156.cif 7vep_A 4.129E-01 4.045E-01 4.832E-01 4.832E-01
ma-asfv-asfvg-156.cif 8bbe_C 2.854E-01 3.863E-01 3.604E-01 3.604E-01

Then I ran the same command with a threshold of 0.2. It gave more hits, but the results are confusing: the cutoff does not match any of the TM-score columns in the output (qtmscore, ttmscore, alntmscore), so it seems to refer to none of them:

ma-asfv-asfvg-156.cif 7vep_A 4.129E-01 4.045E-01 4.832E-01 4.832E-01
ma-asfv-asfvg-156.cif 4wsl_A 3.988E-01 3.948E-01 4.508E-01 4.508E-01
ma-asfv-asfvg-156.cif 4pjq_B 3.950E-01 3.415E-01 5.065E-01 5.065E-01

I think the best approach for now is not to use the threshold, but to use a different format mode, post-process the results, and pick only the top hits.
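The post-processing suggested above could look roughly like the sketch below. It is not part of foldseek: the function name top_hits, the COLUMNS list and the choice of alntmscore as the sort key are assumptions made for illustration. It assumes a whitespace- or tab-separated results file with the six columns from the --format-output string used in the command above, and it sorts explicitly rather than relying on the order in which foldseek writes the hits.

```python
from collections import defaultdict

# Assumed column order, taken from the --format-output string in the post:
# 'query,target,evalue,qtmscore,ttmscore,alntmscore'
COLUMNS = ["query", "target", "evalue", "qtmscore", "ttmscore", "alntmscore"]


def top_hits(lines, n=10, sort_by="alntmscore", descending=True):
    """Keep the n best hits per query, ranked by the chosen column.

    Hypothetical helper: lines is any iterable of result rows
    (for example an open file). Rows with an unexpected number
    of fields are skipped.
    """
    col = COLUMNS.index(sort_by)
    per_query = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue
        per_query[fields[0]].append(fields)

    kept = []
    for hits in per_query.values():
        # Sort numerically; values such as 4.832E-01 parse with float().
        hits.sort(key=lambda f: float(f[col]), reverse=descending)
        kept.extend(hits[:n])
    return kept


# Rows copied from the 0.2-threshold run shown in the post.
SAMPLE = [
    "ma-asfv-asfvg-156.cif 7vep_A 4.129E-01 4.045E-01 4.832E-01 4.832E-01",
    "ma-asfv-asfvg-156.cif 4wsl_A 3.988E-01 3.948E-01 4.508E-01 4.508E-01",
    "ma-asfv-asfvg-156.cif 4pjq_B 3.950E-01 3.415E-01 5.065E-01 5.065E-01",
]

if __name__ == "__main__":
    for hit in top_hits(SAMPLE, n=2):
        print("\t".join(hit))
```

To rank by a column where smaller is better, pass sort_by with that column name and descending=False. For a real run, the rows can be read by passing an open results file instead of SAMPLE.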