Closed Finesim97 closed 5 years ago
I just noticed that it is still a pre-release.
@Finesim97 I indeed do need to update the package.
Any chance you could briefly try the pre-release binaries I have on the releases page?
I was just waiting for some more people to run this on their computers. I've tested and tested, but I know from experience that no amount of testing "by the author" replaces "actual use in the field".
Sure, I have my workflow running with the pre-release right now. The classification completes much faster than before. I will rerun it with the old version and compare the results.
Thanks! Tell me if all is ok. I'll tag the 1.6.0 as soon as you do and update the Bioconda package.
@Finesim97 any news?
Yes, everything finished fine the second time I ran it. To test the performance, I added a repeat tag to my Snakemake workflow. During one of the repeats, SINA crashed with the following log:
18:14:38 [log] Loglevel set to info
18:14:38 [SINA] This is SINA 1.6.0-rc.1.
18:14:38 [libARBDB] ARB: no FastLoad File 'output/refdbs/LVA_132_SSURef_NR99.ARM' found => loading entire DB
18:14:46 [ARB I/O] Loading names map... (for "output/refdbs/LVA_132_SSURef_NR99.arb")
18:14:47 [Search (internal)] Index contains 695171 sequences (2688376 refs)
18:14:47 [alignment_stats] alignment stats for subset ssuref:archaea
18:14:47 [alignment_stats] weighted/unweighted columns = 1450/48550
18:14:47 [alignment_stats] average weight = 4.87686
18:14:47 [alignment_stats] minimum weight = 2.13805
18:14:47 [alignment_stats] maximum weight = 7.29296
18:14:47 [alignment_stats] ntaxa = 25025
18:14:47 [alignment_stats] base frequencies: na=0.243357 nu=0.320127 nc=0.238212 ng=0.198304
18:14:47 [alignment_stats] mutation frequencies: any=0.0212039 transversions=0.00822385
18:14:47 [alignment_stats] first/last weighted column=1005/43115
18:14:47 [alignment_stats] alignment stats for subset ssuref:bacteria
18:14:47 [alignment_stats] weighted/unweighted columns = 1532/48468
18:14:47 [alignment_stats] average weight = 5.2707
18:14:47 [alignment_stats] minimum weight = 2.03587
18:14:47 [alignment_stats] maximum weight = 7.67329
18:14:47 [alignment_stats] ntaxa = 592559
18:14:47 [alignment_stats] base frequencies: na=0.252611 nu=0.31454 nc=0.229764 ng=0.203085
18:14:47 [alignment_stats] mutation frequencies: any=0.0167975 transversions=0.00775564
18:14:47 [alignment_stats] first/last weighted column=1006/43241
18:14:47 [alignment_stats] alignment stats for subset ssuref:eukarya
18:14:47 [alignment_stats] weighted/unweighted columns = 1836/48164
18:14:47 [alignment_stats] average weight = 4.74158
18:14:47 [alignment_stats] minimum weight = 2.09547
18:14:47 [alignment_stats] maximum weight = 6.74351
18:14:47 [alignment_stats] ntaxa = 77585
18:14:47 [alignment_stats] base frequencies: na=0.257375 nu=0.268129 nc=0.211923 ng=0.262573
18:14:47 [alignment_stats] mutation frequencies: any=0.0292276 transversions=0.0148656
18:14:47 [alignment_stats] first/last weighted column=1006/43273
18:14:48 [SINA] Aligner ready. Processing sequences
-------------------- ARB-backtrace 'received signal 11':
/nfs2/shared/lukas_jansen_research_data/mibiNGS/sina-1.6.0-rc.1-linux/bin/../lib/libCORE.so(GBK_dump_backtrace(_IO_FILE*, char const*)+0x26)[0x7f33c6645f36]
/nfs2/shared/lukas_jansen_research_data/mibiNGS/sina-1.6.0-rc.1-linux/bin/../lib/libCORE.so(+0xff34)[0x7f33c6647f34]
/lib/x86_64-linux-gnu/libc.so.6(+0x33060)[0x7f33c616e060]
/nfs2/shared/lukas_jansen_research_data/mibiNGS/sina-1.6.0-rc.1-linux/bin/../lib/libsina.so.0(std::__detail::_Map_base<std::thread::id, std::pair<std::thread::id const, sina::timer>, std::allocator<std::pair<std::thread::id const, sina::timer> >, std::__detail::_Select1st, std::equal_to<std::thread::id>, std::hash<std::thread::id>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true>, true>::operator[](std::thread::id&&)+0x4d)[0x7f33c6d7ba3d]
/nfs2/shared/lukas_jansen_research_data/mibiNGS/sina-1.6.0-rc.1-linux/bin/../lib/libsina.so.0(sina::kmer_search::impl::find(sina::annotated_cseq const&, std::vector<sina::search::result_item, std::allocator<sina::search::result_item> >&, unsigned int)+0x33b)[0x7f33c6d75bcb]
/nfs2/shared/lukas_jansen_research_data/mibiNGS/sina-1.6.0-rc.1-linux/bin/../lib/libsina.so.0(sina::famfinder::impl::match(std::vector<sina::search::result_item, std::allocator<sina::search::result_item> >&, sina::annotated_cseq const&)+0x3f9)[0x7f33c6d16469]
/nfs2/shared/lukas_jansen_research_data/mibiNGS/sina-1.6.0-rc.1-linux/bin/../lib/libsina.so.0(sina::famfinder::impl::operator()(sina::tray)+0x13f8)[0x7f33c6d1b638]
/nfs2/shared/lukas_jansen_research_data/mibiNGS/sina-1.6.0-rc.1-linux/bin/../lib/libsina.so.0(sina::famfinder::operator()(sina::tray const&)+0x42)[0x7f33c6d1bed2]
sina-1.6.0-rc.1-linux/bin/sina(+0x269b2)[0x55b0ce3b59b2]
sina-1.6.0-rc.1-linux/bin/sina(+0x5003c)[0x55b0ce3df03c]
sina-1.6.0-rc.1-linux/bin/sina(+0x50131)[0x55b0ce3df131]
/nfs2/shared/lukas_jansen_research_data/mibiNGS/sina-1.6.0-rc.1-linux/bin/../lib/libtbb.so.2(+0x294a9)[0x7f33c6b7e4a9]
/nfs2/shared/lukas_jansen_research_data/mibiNGS/sina-1.6.0-rc.1-linux/bin/../lib/libtbb.so.2(+0x22af8)[0x7f33c6b77af8]
/nfs2/shared/lukas_jansen_research_data/mibiNGS/sina-1.6.0-rc.1-linux/bin/../lib/libtbb.so.2(+0x21384)[0x7f33c6b76384]
/nfs2/shared/lukas_jansen_research_data/mibiNGS/sina-1.6.0-rc.1-linux/bin/../lib/libtbb.so.2(+0x1d1e4)[0x7f33c6b721e4]
/nfs2/shared/lukas_jansen_research_data/mibiNGS/sina-1.6.0-rc.1-linux/bin/../lib/libtbb.so.2(+0x1d45a)[0x7f33c6b7245a]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x74a4)[0x7f33c5f054a4]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f33c6223d0f]
-------------------- End of backtrace
[Terminating with signal 11]
18:14:48 [ARB I/O] Closing ARB database '"output/refdbs/LVA_132_SSURef_NR99.arb"' ...
I wasn't able to reproduce it; subsequent runs finished multiple times without crashing. The files were on an NFS drive and I reused the generated index (from the same command).
My Snakemake rule:
# Prepare database:
rule sinaalignprep:
    input:
        db=rules.downlaodnrSSU.output
    output:
        outerdir+"/refdbs/LVA_132_SSURef_NR99.sidx"
    log:
        outerdir+"/logs/sina_silva_prep.log"
    conda: "envs/tooling.yml"
    shell:
        "echo \">Testing\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\n\" |sina-1.6.0-rc.1-linux/bin/sina -r {input.db} --fs-engine internal > {log}"

# Align the OTUs to the silva nr db using SINA
rule sinaalign:
    input:
        rules.sinaalignprep.output,
        db=rules.downlaodnrSSU.output,
        toalign=rules.deblur.output.referencehitseqs, # Remember to change the BIOM file as well!,
    output:
        fasta=outdir+"/sina_silva_aligned_otus.fasta",
        csv=outdir+"/sina_silva_aligned_otus.csv"
    params:
        minsim=0.7
    log:
        outdir+"/logs/sina_silva.log"
    benchmark:
        repeat(outdir+"/benchmark/sina_silva.txt",5)
    conda: "envs/tooling.yml"
    threads: 32
    shell:
        "sina-1.6.0-rc.1-linux/bin/sina -i {input.toalign} -o {output.fasta} -r {input.db} -S --meta-fmt csv -v --search-min-sim={params.minsim} --lca-fields tax_slv,tax_embl,tax_gg,tax_rdp,tax_gg -p {threads} --fs-engine internal > {log} 2>&1"

rule benchSina:
    input:
        expand(rules.sinaalign.benchmark,group=groups)
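For comparing the old and new versions, the repeated benchmark runs written by the rule above can be summarized with a few lines of Python. This is only a rough sketch: it assumes the benchmark file is tab-separated with a header row and a wall-clock column named `s`, as Snakemake normally writes it. The function name `summarize_benchmark` and the example numbers are made up for illustration and are not measurements from this thread.

```python
import csv
import io
import statistics


def summarize_benchmark(handle):
    """Summarize the wall-clock column 's' of a Snakemake benchmark file."""
    reader = csv.DictReader(handle, delimiter="\t")
    seconds = [float(row["s"]) for row in reader]
    if not seconds:
        raise ValueError("benchmark file contains no runs")
    return {
        "runs": len(seconds),
        "mean_s": statistics.mean(seconds),
        "min_s": min(seconds),
        "max_s": max(seconds),
    }


# Made-up example content (one row per repeat), not real timings.
example = "s\th:m:s\n120.0\t0:02:00\n100.0\t0:01:40\n110.0\t0:01:50\n"
summary = summarize_benchmark(io.StringIO(example))
print(summary)
```

To use it on a real file, open the path given in the `benchmark:` directive and pass the file handle instead of the `io.StringIO` object.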
Thanks! I'll try to look into that. At least it gave a stack trace, which helps a little. It's a concurrency thing, so really hard to reproduce. Looks like it happened in the timer code I use to figure out where SINA spends its time to help me optimize the important bits. I might just take that out for the release binaries.
Cleaned trace:
std::__detail::_Map_base<std::thread::id, std::pair<std::thread::id const, sina::timer>>::operator[]
sina::kmer_search::impl::find()
sina::famfinder::impl::match(std::vector<sina::search::result_item>&, sina::annotated_cseq const&)
sina::famfinder::impl::operator()(sina::tray)
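The top frame of the trace is `operator[]` on a hash map keyed by thread id. In C++, `operator[]` on `std::unordered_map` inserts a missing key, so unsynchronized calls from several threads are a data race. The following is a minimal sketch of the general pattern of guarding such a per-thread timer map with a lock. It is written in Python rather than SINA's C++, it is not the actual fix, and `TimerRegistry`, `add`, `totals` and `worker` are hypothetical names.

```python
import threading
import time


class TimerRegistry:
    """Hypothetical per-thread timing registry; not SINA's actual code."""

    def __init__(self):
        self._lock = threading.Lock()
        self._elapsed = {}  # thread id -> accumulated seconds

    def add(self, seconds):
        tid = threading.get_ident()
        # The lock serializes both the insertion of new keys and the updates.
        with self._lock:
            self._elapsed[tid] = self._elapsed.get(tid, 0.0) + seconds

    def totals(self):
        with self._lock:
            return dict(self._elapsed)


def worker(registry, barrier, rounds):
    # The barrier keeps all workers alive at the same time,
    # so their thread ids are guaranteed to be distinct.
    barrier.wait()
    for _ in range(rounds):
        start = time.perf_counter()
        sum(range(1000))  # stand-in for real work such as a k-mer search
        registry.add(time.perf_counter() - start)


n_threads = 8
registry = TimerRegistry()
barrier = threading.Barrier(n_threads)
threads = [
    threading.Thread(target=worker, args=(registry, barrier, 50))
    for _ in range(n_threads)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

totals = registry.totals()
print(len(totals), "threads recorded")
```

An alternative with the same effect is to give each thread its own timer via thread-local storage, which avoids the shared map altogether.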
Ok. I hope that's fixed. 1.6.0 is out.
Thank you very much, glad I could help a little bit.
On 27.04.2019 at 06:05, Elmar Pruesse notifications@github.com wrote:
Closed #69.
Yes, you did. Thanks. I was able to fix the above problem and get it into the 1.6.0. Perhaps I can get away without a 1.6.1 again, but you never know.
Hi, I am really excited to take the new search engine for a spin, but I saw that the Bioconda recipe isn't updated yet. I am sorry if this is the wrong repository for posting that. Is it enough to update the hash and version variable?
Best wishes for your Easter weekend.
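On the question of updating the hash: conda recipes usually pin the source archive with a checksum in `meta.yaml`, so a version bump needs the checksum of the new release archive. Here is a small sketch of computing a SHA-256 digest in Python. It assumes the recipe uses a `sha256` field, which is not confirmed in this thread, and `sha256_of_file` is a hypothetical helper name. The demo hashes a temporary file, not a real SINA release.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a throwaway file; replace demo_path with the downloaded archive.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
    demo_path = tmp.name

demo_hash = sha256_of_file(demo_path)
os.remove(demo_path)
print(demo_hash)
```

Whether changing only the version and the hash is enough depends on whether the build requirements changed between releases.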