Closed DenaEnnis closed 1 year ago
Hi @DenaEnnis,
Thanks for your interest in SameStr!
Which MetaPhlAn db version are you using? Could you also provide the commands that you used, including for samestr db?
Hi, I am using MetaPhlAn 3. I changed the MetaPhlAn text output to have a one-line header and two columns. I did not change the SAM output, since I didn't see any indication that it matters (does it?).
My commands for samestr db:
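import bz2
import pickle

mpa_pkl_file = 'mpa_v296_CHOCOPhlAn_201901.pkl'
# Load the bz2-compressed MetaPhlAn pickle
with bz2.BZ2File(mpa_pkl_file) as f:
    mpa_pkl = pickle.load(f)
# Re-save with pickle protocol 0; the with-block closes and flushes the file
with bz2.BZ2File(mpa_pkl_file.replace('.pkl', '.py2.pkl'), 'wb') as f:
    pickle.dump(mpa_pkl, f, protocol=0)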
My commands for samestr convert:
samestr convert \
--input-files /sci/labs/morani/morani/icore-data/lab/Projects/Dena/Sequencing_results/Exp22_27/mpa_sams/*sam.bz2 \
--marker-dir /sci/labs/morani/morani/icore-data/lab/Projects/Dena/Sequencing_results/Exp22_27/samestr/ \
--output-dir out_convert/ \
--mp-profiles-dir /sci/labs/morani/morani/icore-data/lab/Projects/Dena/Sequencing_results/Exp22_27/samestr/mpa/ \
--nproc 10
Thank you
Your commands look good; you are just missing one step. So far, you have only followed the additional compatibility notes for using samestr db with MetaPhlAn ≥3.
You now need to actually run the samestr db command to set up the database:
samestr db \
--mpa-pkl mpa_v30_CHOCOPhlAn_201901.py2.pkl \
--mpa-markers mpa_v30_CHOCOPhlAn_201901.fna \
--output-dir marker_db/
This will create and format the gene_files that you were missing and were getting stuck on initially. After that, you can continue with samestr convert as you have, using --marker-dir marker_db/ as an option.
Note that --output-dir marker_db/ is used as an example - you can name the output directory as you like, but make sure to specify it in the next steps accordingly.
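Before rerunning samestr convert, it may help to confirm that the marker directory is not empty, since an empty directory would be consistent with the "0 gene_files" message. The sketch below is only a generic pre-flight check: count_marker_files is a hypothetical helper, not part of SameStr, and because the file layout that samestr db produces is not described in this thread, it simply counts non-empty files anywhere under the directory.

```python
from pathlib import Path
import tempfile


def count_marker_files(marker_dir):
    """Count non-empty regular files anywhere under marker_dir.

    Hypothetical helper (not part of SameStr). It assumes nothing about
    file names or extensions, only that a usable database has some files.
    """
    root = Path(marker_dir)
    if not root.is_dir():
        return 0
    return sum(1 for p in root.rglob("*") if p.is_file() and p.stat().st_size > 0)


# Demo on a throwaway directory; point the function at your own
# --marker-dir (e.g. "marker_db/") instead.
with tempfile.TemporaryDirectory() as tmp:
    empty_count = count_marker_files(tmp)
    # Made-up file name, used only to populate the demo directory
    (Path(tmp) / "example_marker.fa").write_text(">m1\nACGT\n")
    filled_count = count_marker_files(tmp)
```

If the count for your real marker directory is 0, the database step has most likely not been run, or it wrote to a different directory than the one passed to --marker-dir.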
Let me know if it works.
It worked, thank you! Sorry I missed that part.
Hi, thanks for this great tool.
I am trying to run convert but am encountering a problem. For each sample, I get the message 'Running: cat for 0 gene_files' while it is running, and then the run gets stuck there, no matter how many threads and how much time I give it. I converted my MetaPhlAn database as instructed and also checked that it looks as it should.
Any suggestions? Thank you