gmteunisse opened this issue 2 years ago (status: Open)
Hi, thanks for this. I have been struggling with the Anacapa pipeline for days now. I need to get it working so I can run my eDNA data through it for my MSc project. One quick question about the line of code you put above: do I run it straight in my terminal, or should I copy and paste it into a script and run it once as a script? I am still new to Linux and its commands; I have Ubuntu 20.04 LTS installed on a VM on my laptop. I tried following the tutorial that the Anacapa authors put out, but when I enter the container and run the `run-anacapa-qc.sh` file I get an error. So I am hoping your line of code helps.
No worries, glad I could help. Either option will work: run it line by line in the terminal, or copy it into a .sh script and run it with `bash <script>.sh`. If you go for the latter, copy the two code blocks into two separate shell scripts and run each separately. The first will download, install and update the code; the second will run the 12S example.
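As a minimal sketch of the two-script approach, assuming placeholder filenames and contents (these are not the actual code blocks from this thread):

```shell
# Save each code block from the thread into its own script, then run them in order.
cat > 01_install_anacapa.sh <<'EOF'
#!/bin/bash
# First block: download, install and update Anacapa (placeholder).
echo "install step done"
EOF

cat > 02_run_12S_example.sh <<'EOF'
#!/bin/bash
# Second block: run the 12S example (placeholder).
echo "12S example done"
EOF

bash 01_install_anacapa.sh   # run the setup script first
bash 02_run_12S_example.sh   # then the example run
```

Running them in this order matters, since the example run depends on the installed files.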
Thank you so much for this, it really helped. The code ran smoothly; without it I would have given up and used another pipeline I am considering. That pipeline is similar, but everything can be run in R: basically DADA2, with taxonomy assigned using assignTaxonomy (a DADA2 function) or IdTaxa (from the DECIPHER package). If you have tried it or heard about it, please advise.
But all in all thank you for this script. I really appreciate this.
I did, however, receive slightly different results compared to your run on https://github.com/limey-bean/Anacapa/issues/60:
```
.../12S/12S_taxonomy_tables/12S_ASV_taxonomy_brief.txt                                        |  32 +--
.../12S/12S_taxonomy_tables/12S_ASV_taxonomy_detailed.txt                                     |  32 +--
.../12S/12S_taxonomy_tables/Summary_by_percent_confidence/100/12S_ASV_raw_taxonomy_100.txt    |   8 +-
.../12S/12S_taxonomy_tables/Summary_by_percent_confidence/100/12S_ASV_sum_by_taxonomy_100.txt |   6 +-
.../12S/12S_taxonomy_tables/Summary_by_percent_confidence/40/12S_ASV_raw_taxonomy_40.txt      |   4 +-
.../12S/12S_taxonomy_tables/Summary_by_percent_confidence/40/12S_ASV_sum_by_taxonomy_40.txt   |   2 +-
.../12S/12S_taxonomy_tables/Summary_by_percent_confidence/50/12S_ASV_raw_taxonomy_50.txt      |   4 +-
.../12S/12S_taxonomy_tables/Summary_by_percent_confidence/50/12S_ASV_sum_by_taxonomy_50.txt   |   2 +-
.../12S/12S_taxonomy_tables/Summary_by_percent_confidence/60/12S_ASV_raw_taxonomy_60.txt      |   4 +-
.../12S/12S_taxonomy_tables/Summary_by_percent_confidence/60/12S_ASV_sum_by_taxonomy_60.txt   |   2 +-
.../12S/12S_taxonomy_tables/Summary_by_percent_confidence/70/12S_ASV_raw_taxonomy_70.txt      |   4 +-
.../12S/12S_taxonomy_tables/Summary_by_percent_confidence/70/12S_ASV_sum_by_taxonomy_70.txt   |   2 +-
.../12S/12S_taxonomy_tables/Summary_by_percent_confidence/80/12S_ASV_raw_taxonomy_80.txt      |   4 +-
.../12S/12S_taxonomy_tables/Summary_by_percent_confidence/80/12S_ASV_sum_by_taxonomy_80.txt   |   2 +-
.../12S/12S_taxonomy_tables/Summary_by_percent_confidence/90/12S_ASV_raw_taxonomy_90.txt      |   4 +-
.../12S/12S_taxonomy_tables/Summary_by_percent_confidence/90/12S_ASV_sum_by_taxonomy_90.txt   |   2 +-
.../12S/12S_taxonomy_tables/Summary_by_percent_confidence/95/12S_ASV_raw_taxonomy_95.txt      |   8 +-
.../12S/12S_taxonomy_tables/Summary_by_percent_confidence/95/12S_ASV_sum_by_taxonomy_95.txt   |   7 +-
.../12S/12Sbowtie2_out/12S_bowtie2_all.sam                                                    | 702 ++++++++++++++++++++++++++-----------------------
.../12S/12Sbowtie2_out/12S_bowtie2_all.sam.blca.out                                           |  32 +--
.../12S/12Sbowtie2_out/single_read_forward_12S_end_to_end.sam                                 |  33 ++-
.../12S/12Sbowtie2_out/single_read_forward_12S_local.sam                                      | 118 ++++-----
.../12S/12Sbowtie2_out/single_read_merged_12S_end_to_end.sam                                  | 351 ++++++++++++++-----------
.../12S/12Sbowtie2_out/single_read_merged_12S_local.sam                                       | 200 +++++++-------
24 files changed, 828 insertions(+), 737 deletions(-)
```
If I may ask: should I want to run this on COI data, would I have to tweak your script and the scripts that come with the Anacapa pipeline, i.e. the paths and the metabarcode?
Glad it worked! I can vouch for DADA2's ease of use and denoising capacities (it's also part of Anacapa, actually), but I've never used it for COI taxonomy assignment - there is no default training set to use with RDP. You might be able to get some inspiration here if you do want to go down that path: https://github.com/benjjneb/dada2/issues/922.
If you want to run Anacapa on COI data, you would have to download the COI taxonomy database (or generate your own using CRUX) and then tweak some of the paths and scripts. I haven't used Anacapa since running the example, though, so I can't help you with that. I think the tutorial can help you figure out what needs to go where. All the best!
Thank you very much. I will surely do that.
Hi all, I was trying to get the latest version of Anacapa to run through the Singularity container and ran into a few lines of code that need fixing to get Anacapa to run. The first is an erroneous (and duplicated) print statement on line 327 of `blca_from_bowtie.py`, which results in termination of the script and therefore failure of the pipeline. The second is a problem in local mode in the `run_*_blca.sh` scripts, which pass `-p ${DB}/muscle` as the muscle path to `blca_from_bowtie.py`, where `${DB}` points to the `Anacapa_db` directory. The result is failure on line 369. This should instead point to the muscle path as specified in `anacapa_config.sh` (related to issue #40?). As I'm not sure whether this pipeline is still being maintained, I've attached all the code required to get Anacapa running through Singularity in local mode. NOTE: I get slightly different taxonomy annotations in the 12S example; see issue #60.
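The muscle-path fix described above could be applied with a one-line `sed` edit over the `run_*_blca.sh` scripts. A sketch, assuming `${MUSCLE}` is the variable name used in `anacapa_config.sh` (that name, the `demo_scripts` directory, and the file contents below are illustrative placeholders, not verified against the repository):

```shell
# Create a minimal stand-in for a run_*_blca.sh script to demonstrate the edit.
mkdir -p demo_scripts
printf 'python blca_from_bowtie.py -p ${DB}/muscle -i input.sam\n' > demo_scripts/run_12S_blca.sh

# Point -p at the muscle binary from anacapa_config.sh instead of ${DB}/muscle.
# ${MUSCLE} as the config variable name is an assumption for illustration.
for f in demo_scripts/run_*_blca.sh; do
  sed -i.bak 's|-p ${DB}/muscle|-p ${MUSCLE}|' "$f"
done

cat demo_scripts/run_12S_blca.sh
```

The `-i.bak` flag keeps a backup of each original script, which is handy when patching files inside a container you may want to reset.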
Download and modify files
Run the 12S example
Edit: added reference to issue #60.