epruesse / SINA

SINA - Reference based multiple sequence alignment
https://sina.readthedocs.io
GNU General Public License v3.0

sina keeps rebuilding pt-server from scratch #90

IBG-5-KIT closed this issue 3 years ago

IBG-5-KIT commented 4 years ago

I am running SINA v1.2.11 (revision 21227) and downloaded the "Ref NR 99" ARB database from SILVA as the reference database. I run SINA as follows, specifying the SILVA database under both "--ptdb" and "--search-db":

sina -i LAP_H22_16S.fasta -o aligned222.test222.fasta --ptdb /opt/db/sina/16S_SILVA_138_SSURef_NR99_05_01_20_opt.arb --search --lca-fields tax_slv --lca-quorum 0.8 --search-kmer-len 10 --search-no-fast --search-db /opt/db/sina/16S_SILVA_138_SSURef_NR99_05_01_20_opt.arb

Understandably, it builds a new PT server index from that database, in this case named /opt/db/sina/16S_SILVA_138_SSURef_NR99_05_01_20_opt.arb.pt. However, when I rerun the command, it seems to rebuild that index instead of reusing the one already built:

Building PT-Server for alignment 'ali_16s'...
Database contains 510984 species
Progress: Preparing sequence data
...................................................................... [  8.3%]
...................................................................... [ 16.7%]
...................................................................... [ 25.0%]
...................................................................... [ 33.3%]
...................................................................... [ 41.7%]
...................................................................... [ 50.0%]
...................................................................... [ 58.3%]
...................................................................... [ 66.7%]
...................................................................... [ 75.0%]
...................................................................... [ 83.3%]
...................................................................... [ 91.7%]
...................................................................... [100.0%]
[done]

Since this takes a LOT of time on every call of the tool, I would like to reuse the PT server index that was already built. However, directly specifying that index file (/opt/db/sina/16S_SILVA_138_SSURef_NR99_05_01_20_opt.arb.pt) yields the following error:

PT server index out of date. Rebuilding...
ARB: Loading '/opt/db/sina/16S_SILVA_138_SSURef_NR99_05_01_20_opt.arb.pt'
Unexpected content '' in line 2

How do I build and specify a PT server correctly for SINA?
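The "PT server index out of date" message suggests the existing index is being judged stale. One way to check that from outside the tool is sketched below. It assumes, which nothing in this thread confirms, that staleness is decided by comparing the modification times of the .arb database and its .arb.pt index. The helper name index_is_stale and the file names in the demo are hypothetical and not part of SINA.

```python
import os
import tempfile


def index_is_stale(db_path, index_path=None):
    """Return True if the index file is missing or older than the database.

    Assumption: the index lives next to the database as '<db_path>.pt',
    matching the file name reported in this thread.
    """
    if index_path is None:
        index_path = db_path + ".pt"
    if not os.path.exists(index_path):
        return True
    # A database modified after the index was written counts as stale here.
    return os.path.getmtime(index_path) < os.path.getmtime(db_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db = os.path.join(tmp, "reference.arb")  # hypothetical file name
        idx = db + ".pt"
        for path in (db, idx):
            with open(path, "w") as handle:
                handle.write("placeholder")
        # Make the index newer than the database, then the reverse.
        os.utime(db, (1000, 1000))
        os.utime(idx, (2000, 2000))
        print("index newer than database, stale:", index_is_stale(db))
        os.utime(db, (3000, 3000))
        print("database newer than index, stale:", index_is_stale(db))
```

If this reports a stale index for the real files, something is touching the database after the index is built, for example a copy or sync step. It is only a diagnostic aid and does not change how SINA itself decides to rebuild.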

epruesse commented 4 years ago

@IBG-5-KIT Version 1.2.11 is ancient. Could you try the most recent release? A number of PT server issues have been fixed since 1.2.11, the most important change being that you no longer have to use the PT server at all. The default is now a built-in search that is drastically faster.

epruesse commented 3 years ago

I'm closing this as stale.