elzbth / jitterbug

Jitterbug is a bioinformatic software that predicts insertion sites of transposable elements in a sample sequenced by short paired-end reads with respect to an assembled reference.

StopIteration Error #21

Open CGNAT opened 4 years ago

CGNAT commented 4 years ago

Hi, I am trying to run Jitterbug on mapped reads. I have sorted and indexed the BAM file using samtools, but I get the following error with the following input.

jitterbug.py -o test --numCPUs 8 --bin_size 50000000 --pre_filter Alabama_coyote.bam_sorted.bam /tigress/VONHOLDT/CGL/canine_TE/canfam_rmask.gff3

Returns:

Traceback (most recent call last):
  File "/tigress/VONHOLDT/BIN/jitterbug/jitterbug.py", line 140, in <module>
    main(sys.argv[1:])
  File "/tigress/VONHOLDT/BIN/jitterbug/jitterbug.py", line 130, in main
    args.numCPUs, args.bin_size, args.minMAPQ, generate_test_bam, args.pre_filter, args.conf_lib_stats, mem, args.min_cluster_size,args.step_one_only,args.step_two_only)
  File "/projects/VONHOLDT/BIN/jitterbug/Run_TE_ID_reseq_streaming.py", line 151, in run_jitterbug_streaming
    (isize_mean, isize_sdev, rlen_mean, rlen_sdev) = psorted_bam_reader.calculate_mean_sdev_isize(iterations)
  File "/projects/VONHOLDT/BIN/jitterbug/BamReader.py", line 33, in calculate_mean_sdev_isize
    read = bam_file.next()
  File "pysam/libcalignmentfile.pyx", line 1862, in pysam.libcalignmentfile.AlignmentFile.next
StopIteration

MaximilianStammnitz commented 3 years ago

Hi @CGNAT, I recently hit the same issue when testing the tool on a downsampled BAM file with relatively few reads. The error comes up while Jitterbug tries to estimate library parameters (insert size and read length) from your sequencing file. By default it does this using the first 1 million reads, and it breaks if your file holds fewer than that: the reader runs out of reads, and the bam_file.next() call shown in your traceback raises StopIteration.
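The failure mode can be reproduced without a BAM file. The sketch below is not Jitterbug's code: `sample_fixed` is a hypothetical stand-in for a loop that calls `next()` a fixed number of times, which is what the traceback suggests `calculate_mean_sdev_isize` does, and a plain `range` plays the role of the reads.

```python
def sample_fixed(reads, iterations):
    """Hypothetical stand-in: pull exactly `iterations` items using next()."""
    it = iter(reads)
    sampled = []
    for _ in range(iterations):
        # next() raises StopIteration as soon as the iterator is exhausted.
        sampled.append(next(it))
    return sampled


def fails_on_short_input(n_reads, iterations):
    """Return True if sampling `iterations` items from `n_reads` items fails."""
    try:
        sample_fixed(range(n_reads), iterations)
    except StopIteration:
        return True
    return False
```

With 5 "reads" and 10 requested iterations the call fails. With 10 of each it succeeds, which matches the observation that only small files trigger the error.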

To get around this, edit line 96 of Run_TE_ID_reseq.py (and line 150 of Run_TE_ID_reseq_streaming.py), replacing iterations = 1000000 with iterations = 10000 so that only the first 10,000 reads are used (feel free to increase or decrease this value in line with your sample expectations).
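An alternative to hard-coding a smaller value would be to cap the sample at however many reads the file actually holds. This is a minimal sketch and not a patch for Jitterbug: `estimate_library_stats` is a hypothetical name, the input is assumed to be an iterable of (insert size, read length) pairs, and the statistics are a plain population mean and standard deviation, which may differ from what Jitterbug computes.

```python
from itertools import islice
from statistics import mean, pstdev


def estimate_library_stats(pairs, max_reads=1_000_000):
    """Estimate (isize_mean, isize_sdev, rlen_mean, rlen_sdev).

    `pairs` is any iterable of (insert_size, read_length) tuples.
    islice stops quietly at the end of the input, so a file with fewer
    than `max_reads` reads does not raise StopIteration.
    """
    sample = list(islice(pairs, max_reads))
    if not sample:
        raise ValueError("no reads available to estimate library parameters")
    # Assumption: insert sizes may be signed (as TLEN is in SAM), so use abs().
    isizes = [abs(isize) for isize, _ in sample]
    rlens = [rlen for _, rlen in sample]
    return mean(isizes), pstdev(isizes), mean(rlens), pstdev(rlens)
```

With pysam, the pairs could plausibly come from a generator such as `((r.template_length, r.query_length) for r in bam if r.is_proper_pair)`, where `bam` is an open `pysam.AlignmentFile`. Which reads Jitterbug itself includes in the estimate is not stated in this thread, so that filter is an assumption.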

Hope this helps, Max