Closed volkansevim closed 3 years ago
Thank you for your interest in our work. mbin was initially developed for PacBio RS II data and does not yet support BAM files from the Sequel system. We do plan to support Sequel data down the road, and will make sure to update this page as soon as an update is available. Thank you.
Thanks for the quick reply! It would be helpful to mention in the documentation that mbin currently doesn't support Sequel data.
Thank you for your suggestion. We have added this clarification to the front page, and will update it as soon as a new version supporting Sequel data is available.
I'm using the current version of mbin on Linux with Python 2.7.17 (SUSE Linux version 4.12.14-150.63-default).
I want to use mbin on a metagenome to assign plasmids to genomes and improve binning.
I installed mbin as described in the documentation. I obtained a WGA dataset from PacBio to use as the IPD control. This is a mock metagenome containing bacteria as well as two yeast species. I mapped the reads to a concatenated reference of only the bacterial species using the pbmm2 aligner v1.2.0 from PacBio.
I ran buildcontrols on the aligned BAM file generated by pbmm2:
buildcontrols --procs=10 --ref=bacterial_refs_concat.fa aligned.bam
buildcontrols fails with this output:
`2021-02-26 14:20:43 [INFO] Initiating dictionary of all possible motifs...
2021-02-26 14:20:43 [INFO] - Adding 256 4-mer motifs...
2021-02-26 14:20:43 [INFO] Done: 256 possible contiguous motifs
2021-02-26 14:20:43 [INFO] - Adding 1024 5-mer motifs...
2021-02-26 14:20:43 [INFO] Done: 1536 possible contiguous motifs
2021-02-26 14:20:43 [INFO] - Adding 4096 6-mer motifs...
2021-02-26 14:20:43 [INFO] Done: 7680 possible contiguous motifs
2021-02-26 14:20:43 [INFO] - Adding bipartite motifs to search space...
2021-02-26 14:20:44 [INFO] Done: 194560 possible bipartite motifs
2021-02-26 14:20:44 [INFO]
2021-02-26 14:20:44 [INFO] Preparing to create new control data in ctrl_tmp
Traceback (most recent call last):
File "/global/cscratch1/sd/vsevim/software/my_p27/bin/buildcontrols", line 8, in <module>
sys.exit(launch())
File "/global/cscratch1/sd/vsevim/software/my_p27/lib/python2.7/site-packages/mbin/controls.py", line 20, in launch
extract_controls(opts, control_aln_fn)
File "/global/cscratch1/sd/vsevim/software/my_p27/lib/python2.7/site-packages/mbin/controls.py", line 40, in extract_controls
opts = controls.scan_WGA_aligns()
File "/global/cscratch1/sd/vsevim/software/my_p27/lib/python2.7/site-packages/mbin/controls.py", line 352, in scan_WGA_aligns
reader = openIndexedAlignmentFile(self.control_aln_fn)
File "/global/cscratch1/sd/vsevim/software/my_p27/lib/python2.7/site-packages/pbcore/io/opener.py", line 54, in openIndexedAlignmentFile
return IndexedBamReader(fname, referenceFastaFname=referenceFastaFname, sharedIndex=sharedIndex)
File "/global/cscratch1/sd/vsevim/software/my_p27/lib/python2.7/site-packages/pbcore/io/align/BamIO.py", line 385, in __init__
super(IndexedBamReader, self).__init__(fname, referenceFastaFname)
File "/global/cscratch1/sd/vsevim/software/my_p27/lib/python2.7/site-packages/pbcore/io/align/BamIO.py", line 198, in __init__
self._loadReferenceInfo()
File "/global/cscratch1/sd/vsevim/software/my_p27/lib/python2.7/site-packages/pbcore/io/align/BamIO.py", line 73, in _loadReferenceInfo
refMD5s = [r["M5"] for r in refRecords]
KeyError: 'M5'`
It seems the BAM reader is looking for an 'M5' field in the file, but I can confirm that there is no such field in the BAM header.
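For anyone who wants to reproduce that check, here is a minimal sketch. It assumes the header has been dumped as text (for example with `samtools view -H aligned.bam`) and simply lists the `@SQ` records that carry no `M5` tag. The function name and the example header below are made up for illustration; they are not part of mbin or pbcore.

```python
def sq_records_missing_m5(header_text):
    """Return the SN names of @SQ header lines that carry no M5 tag."""
    missing = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs after the record type.
        fields = dict(
            field.split(":", 1) for field in line.split("\t")[1:] if ":" in field
        )
        if "M5" not in fields:
            missing.append(fields.get("SN", "<unnamed>"))
    return missing


# Hypothetical header text, standing in for the output of `samtools view -H`.
example_header = "\n".join([
    "@HD\tVN:1.5\tSO:coordinate",
    "@SQ\tSN:contig_1\tLN:5000",
    "@SQ\tSN:contig_2\tLN:7000\tM5:0123456789abcdef0123456789abcdef",
])
print(sq_records_missing_m5(example_header))
```

On a real header, a non-empty result would be consistent with the `KeyError: 'M5'` above, since the traceback shows the reader indexing `r["M5"]` for every reference record.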
Do you have any suggestions on how to solve this issue?
Thanks!