Error message from split_bam.py "TypeError: an integer is required"

petitmingchang commented 1 year ago

Hi, I ran the command shown on the website:

split_bam.py --threads=8 -i Input.sorted.bam -o Input &

Then I got an error message as follows:

Traceback (most recent call last):
  File "/home/petitming/miniconda3/envs/python3/bin/split_bam.py", line 231, in <module>
    main()
  File "/home/petitming/miniconda3/envs/python3/bin/split_bam.py", line 228, in main
    divided_bam(bam_file = options.bam_file, outfile = options.out_prefix, q_cut= options.map_qual, chr_prefix= options.chr_prefix, threads = options.n_thread)
  File "/home/petitming/miniconda3/envs/python3/bin/split_bam.py", line 184, in divided_bam
    aligned_read.reference_id = lookup_refid(human_header, read1_chr)
  File "pysam/libcalignedsegment.pyx", line 1252, in pysam.libcalignedsegment.AlignedSegment.reference_id.set
  File "pysam/libcalignmentfile.pyx", line 509, in pysam.libcalignmentfile.AlignmentHeader.is_valid_tid
TypeError: an integer is required

My Python version is 3.6.15. Is it possible to fix the issue? Thank you so much!

liguowang commented 1 year ago

Hello. In your "Input.sorted.bam" file, did you map the reads to a composite reference genome (for example, human + fly)? When you made this composite reference genome, did you add a prefix to the fly chromosome IDs (for example, changing the fly's "chr2L" to "dm6_chr2L")? If you did not, it is impossible to split the BAM file, because we cannot distinguish human chromosomes from fly chromosomes (for example, chrX, chrY, and chr4 exist in both the human and fly genome files).

Please follow this tutorial: https://spiker.readthedocs.io/en/latest/walkthrough.html#map-reads-to-the-composite-reference-genome
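
To illustrate the naming collision described above, here is a minimal Python sketch. It is not taken from spiker; the function names (`ambiguous_names`, `add_prefix`) and the short chromosome lists are hypothetical and chosen only for illustration.

```python
def ambiguous_names(host_names, spikein_names):
    """Return chromosome names that occur in both genomes (sorted)."""
    return sorted(set(host_names) & set(spikein_names))


def add_prefix(names, prefix):
    """Return the names with a prefix prepended, e.g. chr2L -> dm6_chr2L."""
    return [prefix + name for name in names]


# Illustrative subsets of chromosome names, not complete genomes.
human = ["chr1", "chr4", "chrX", "chrY"]
fly = ["chr2L", "chr4", "chrX", "chrY"]

# Without a prefix, these names cannot be assigned to one genome.
clashes_before = ambiguous_names(human, fly)

# After prefixing the spike-in genome, no name is shared any more.
clashes_after = ambiguous_names(human, add_prefix(fly, "dm6_"))

print(clashes_before)
print(clashes_after)
```

Once every spike-in chromosome carries the prefix, each read can be assigned to a genome from its reference name alone.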

petitmingchang commented 1 year ago

@liguowang

Thank you for your prompt reply! I followed the protocol described in the paper published in STAR PROTOCOLS, https://star-protocols.cell.com/protocols/829. I used a composite ref genome (mouse + yeast). Besides, I did add a prefix to the yeast chromosome IDs (sacCer3_) in the composite genome sequence file. Did I miss any step to run the process?

Yao-Ming

liguowang commented 1 year ago

Then you need to pass "-p sacCer3_" to "split_bam.py".

-p CHR_PREFIX, --exo-prefix=CHR_PREFIX Prefix added to the exogenous chromosome IDs. For example, ‘chr2L’ -> ‘dm6_chr2L’. default=dm6_
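
Before running split_bam.py, it may help to confirm that the prefix you plan to pass with -p actually matches some reference names in the BAM header. The sketch below is a hypothetical pre-flight check, not part of spiker: `check_exo_prefix` and the example names are assumptions for illustration. With pysam installed, the real names could be read from `pysam.AlignmentFile("Input.sorted.bam", "rb").references`.

```python
def check_exo_prefix(reference_names, exo_prefix):
    """Return the reference names that start with exo_prefix.

    Raises ValueError if no name matches, which suggests that the
    wrong prefix is being used for this composite genome.
    """
    matches = [name for name in reference_names if name.startswith(exo_prefix)]
    if not matches:
        raise ValueError(
            "no reference name starts with %r; check the -p value" % exo_prefix
        )
    return matches


# Illustrative mouse + yeast composite (names are assumed, not from the issue).
composite = ["chr1", "chr2", "chrX", "sacCer3_chrI", "sacCer3_chrII"]

# A fly-style prefix matches nothing in a mouse + yeast composite.
try:
    check_exo_prefix(composite, "dm6_")
    fly_prefix_error = None
except ValueError as err:
    fly_prefix_error = str(err)

# The prefix actually used when building the composite genome matches.
yeast = check_exo_prefix(composite, "sacCer3_")

print(fly_prefix_error)
print(yeast)
```

If the check fails for the prefix you intended to use, compare it against the chromosome IDs in the composite FASTA file used for mapping.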

petitmingchang commented 1 year ago

@liguowang

Thank you for your help. It works fine now!

liguowang / spiker

Error message from split_bam.py "TypeError: an integer is required" #2