Closed ShaneMota closed 2 years ago
Hi Shane,
Thanks for your message. Indeed, the align step is currently set up for MP2. Since MP3 switched to Python 3.x, supporting it will require some major adjustments. I will post more documentation to the repo soon.
Since align is just a wrapper around kneaddata and MetaPhlAn, in the meantime please follow their guidelines at https://github.com/biobakery/kneaddata and https://github.com/biobakery/MetaPhlAn to install and run their software and generate the MP3 alignments on your own.
Based on your post, the commands could look something like this, where $ID is the sample identifier:
Kneaddata:
kneaddata \
-i ${ID}.R1.fastq.gz \
-i ${ID}.R2.fastq.gz \
-db /ref_dbs/Homo_sapiens_Bowtie2_v0.1/Homo_sapiens \
-p 2 \
-t 15 \
--max-memory $RAM \
--output-prefix ${ID} \
--cat-final-output \
--remove-intermediate-output \
-o data/
MetaPhlAn3:
metaphlan data/${ID}.fastq.gz \
--bowtie2db /dataone/common/ref_dbs/metaphlan3/ \
--input_type fastq \
--nproc 30 \
--legacy-output \
-t rel_ab \
--bowtie2out $OUTDIR/${ID}.mp.bowtie2out \
--samout out_align/${ID}.mp.sam.bz2 \
-o out_align/${ID}.mp.profile.txt
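To run the two commands above over many samples, $ID has to come from somewhere. A minimal sketch, assuming the raw reads follow the <ID>.R1.fastq.gz / <ID>.R2.fastq.gz naming used above; the helper names sample_id and list_sample_ids are hypothetical and not part of SameStr, kneaddata, or MetaPhlAn:

```shell
# Derive the sample identifier from a forward-read file name.
# Assumes the <ID>.R1.fastq.gz naming scheme (an assumption, adjust to your data).
sample_id() {
  local name
  name=$(basename "$1")
  printf '%s\n' "${name%.R1.fastq.gz}"
}

# Print one sample ID per forward-read file found in a directory.
list_sample_ids() {
  local dir=$1 f
  for f in "$dir"/*.R1.fastq.gz; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    sample_id "$f"
  done
}
```

A loop such as `for ID in $(list_sample_ids .); do ...; done` could then wrap the kneaddata and metaphlan calls, provided the sample IDs contain no whitespace.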
For MetaPhlAn3, make sure to include the --legacy-output and --samout flags, and convert the database format for Python 2.x/3.x compatibility. You can then proceed with SameStr's convert step using the MP3 alignments (.sam.bz2).
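Before moving on to the convert step, it may be worth checking that every sample actually produced an alignment file. A rough sketch, assuming the out_align/<ID>.mp.sam.bz2 naming from the MetaPhlAn command; missing_alignments is a hypothetical helper, not a SameStr command:

```shell
# Print the IDs of samples whose alignment file is missing or empty.
# Usage: missing_alignments <alignment_dir> <ID> [<ID> ...]
# Assumes files are named <ID>.mp.sam.bz2 (an assumption taken from the command above).
missing_alignments() {
  local dir=$1 id
  shift
  for id in "$@"; do
    # -s is true only if the file exists and has a size greater than zero
    [ -s "$dir/$id.mp.sam.bz2" ] || printf '%s\n' "$id"
  done
}
```

For example, `missing_alignments out_align sampleA sampleB` prints nothing when both files are present and non-empty. This only checks for presence and size, not that the bz2 stream or the SAM content is valid.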
Hope this helps, Daniel
samestr align will be deprecated for future MetaPhlAn versions (>v2) due to incompatibilities between python versions. The README contains information on how to work around these changes to successfully run SameStr on MetaPhlAn v3 and higher.
Hello, I'm excited to use your tool for my data, but it looks like I have issues with MetaPhlAn versions, and I was wondering if you could help.
Here is my align command:
samestr align \
--input-files data/*fastq.gz \
--input-sequence-type paired \
--kneaddata-exe /opt/software/miniconda3/envs/samestr/bin/kneaddata \
--fastq-stats-exe /opt/software/miniconda3/envs/samestr/bin/fastq-stats \
--host-bowtie2db /ref_dbs/Homo_sapiens_Bowtie2_v0.1/Homo_sapiens \
--metaphlan2-exe /opt/software/miniconda3/envs/metaphlan3/bin/metaphlan \
--mpa /dataone/common/ref_dbs/metaphlan3/mpa_v30_CHOCOPhlAn_201901 \
--mpa-pkl /dataone/common/ref_dbs/metaphlan3/mpa_v30_CHOCOPhlAn_201901.pkl \
--nprocs 30 \
--output-dir out_align/
It looks like MetaPhlAn3 no longer has the '--mpa-pkl' option, as I get the error below:
metaphlan: error: unrecognized arguments: --mpa_pkl /opt/metaphlan2/db_v20/mpa_v20_m200.pkl
I also tried Metaphlan2-exe, but I think MetaPhlAn2 is no longer available: I can't find their Bitbucket page or their database. I also tried to install metaphlan2 via conda, but I get an error while it is trying to download:
Downloading https://bitbucket.org/biobakery/metaphlan2/downloads/mpa_latest
...Best, Shane