ratschlab / spladder

Tool for the detection and quantification of alternative splicing events from RNA-Seq data.

Index Error during build step #171

Open kethselly opened 2 years ago

kethselly commented 2 years ago

Description

First thanks so much for your work on this great tool!

I've been attempting to identify alternative splicing events in an RNA-Seq dataset to compare two Drosophila genotypes (wild-type and null). I was testing things out first, so I ran only the controls (I realize now that I should probably run the control and experimental conditions in the same build run). I provided a list of the .bam files in the .txt file passed to the --bams option in the command below. The .bam files were generated with RNA STAR and mapped to the Drosophila genome using the same .gtf annotation file that is given to SplAdder with the --annotation option.

I've been running this on a cluster (although after reading one of the other comments I'm wondering whether I needed to add a --parallel option) with the command below:

spladder build --bams $HOME/spladder_bam_alignment_list_all.txt --annotation data/bam_files/annotations.BDGP6.22.97.gtf --outdir spladder_out/test3_all
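
Before a multi-hour build, it can help to confirm that every entry in the list file resolves to a real BAM. A minimal sketch, assuming the file passed to --bams holds one path per line and that an index (.bai) is expected next to each BAM; check_bam_list is a hypothetical helper, not part of SplAdder:

```python
from pathlib import Path


def check_bam_list(list_path):
    """Return (missing_bams, missing_indexes) for a one-path-per-line BAM list.

    Assumption: one BAM path per line; blank lines are ignored.
    """
    missing_bams, missing_indexes = [], []
    for line in Path(list_path).read_text().splitlines():
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        bam = Path(entry).expanduser()
        if not bam.is_file():
            missing_bams.append(entry)
            continue
        # Accept either sample.bam.bai or sample.bai as the index name.
        candidates = [Path(str(bam) + ".bai"), bam.with_suffix(".bai")]
        if not any(c.is_file() for c in candidates):
            missing_indexes.append(entry)
    return missing_bams, missing_indexes
```

Calling check_bam_list on the list file returns two lists of entries to fix; two empty lists mean every path and index was found.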

The command ran for about six hours and I thought things looked pretty good, but the log file ended with the error below. Everything before this (alt 3' and alt 5' splice sites, retained introns, exon skipping) seemed to work OK. Judging by the log, the error came after SplAdder had moved past the multi-exon skipping events, while it was writing the mutually exclusive exon (mutex_exons) events.

Reporting complete mult_exon_skip events:

Reporting confirmed mult_exon_skip events:
writing mult_exon_skip events in gff3 format to spladder_out/test2/merge_graphs_mult_exon_skip_C3.confirmed.gff3
writing mult_exon_skip events in flat txt format to spladder_out/test2/merge_graphs_mult_exon_skip_C3.confirmed.txt.gz
analyzing events with confidence 3

Reporting complete mutex_exons events:

Reporting confirmed mutex_exons events:
writing mutex_exons events in gff3 format to spladder_out/test2/merge_graphs_mutex_exons_C3.confirmed.gff3
writing mutex_exons events in flat txt format to spladder_out/test2/merge_graphs_mutex_exons_C3.confirmed.txt.gz
Traceback (most recent call last):
  File "/users/conda/envs/local2/bin/spladder", line 8, in <module>
    sys.exit(main())
  File "/users/.conda/envs/local2/lib/python3.10/site-packages/spladder/spladder.py", line 229, in main
    options.func(options)
  File "/users/.conda/envs/local2/lib/python3.10/site-packages/spladder/spladder_build.py", line 163, in spladder
    analyze_events(event_type, options.bam_fnames, options)
  File "/users/.conda/envs/local2/lib/python3.10/site-packages/spladder/alt_splice/analyze.py", line 191, in analyze_events
    write_events_txt(fn_out_conf_txt, options.samples[sample_idx], events_all, fn_out_count, event_idx=confirmed_idx)
  File "/users/.conda/envs/local2/lib/python3.10/site-packages/spladder/alt_splice/write.py", line 89, in write_events_txt
    counts = event_counts_chunk[:, :, i - chunk_idx_event[0]]
IndexError: index 662 is out of bounds for axis 2 with size 629
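
The failing line computes a position inside the chunk as `i - chunk_idx_event[0]`. A minimal sketch of one way an error of this shape can arise, under the assumption (not confirmed against SplAdder's code) that the chunk holds one column per listed event id and that the ids are not contiguous; the function names and toy data are hypothetical:

```python
def offset_by_subtraction(chunk_event_ids, event_id):
    """Mirror the `i - chunk_idx_event[0]` arithmetic from the traceback."""
    return event_id - chunk_event_ids[0]


def offset_by_position(chunk_event_ids, event_id):
    """Look up where the id actually sits inside the chunk."""
    return chunk_event_ids.index(event_id)


# Toy chunk: four columns for four non-contiguous event ids.
chunk_event_ids = [10, 11, 15, 40]
chunk_columns = ["c10", "c11", "c15", "c40"]

try:
    # Subtraction assumes ids 10..13; id 15 maps past the end of the chunk.
    chunk_columns[offset_by_subtraction(chunk_event_ids, 15)]
except IndexError as err:
    print("subtraction offset failed:", err)

# Positional lookup stays inside the chunk regardless of gaps.
print(chunk_columns[offset_by_position(chunk_event_ids, 15)])
```

If this assumption holds, the subtraction only works while the event ids in a chunk are consecutive, which a filtered subset such as confirmed events need not be.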

It seems like this is pretty similar to at least one of the other issues raised with Spladder.

I was able to find a partial file named merge_graphs_mutex_exons_C3.confirmed. When I opened it, it had many rows and columns of data, but the partial row where it stopped was for a Drosophila gene called Dscam (FBgn0033159), which contains a very large number of exons and many alternative splicing events (see https://pubmed.ncbi.nlm.nih.gov/11606537/); some estimates put the number of Dscam isoforms at close to 40,000. So I'm curious whether there are too many exons here for SplAdder to work through. Just a thought, but is there a default maximum number of events that SplAdder can recognize that I could perhaps increase?
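
To see whether one gene such as Dscam accounts for an unusual share of the rows, the event table can be tallied per gene. A rough sketch, assuming the flat txt output is tab-delimited, gzip-compressed, and has a header with a gene column; the column name `gene_name` and the helper events_per_gene are assumptions, so adjust them to the actual header:

```python
import csv
import gzip
from collections import Counter


def events_per_gene(path, gene_column="gene_name"):
    """Count rows per gene in a tab-delimited, gzip-compressed event table."""
    counts = Counter()
    try:
        with gzip.open(path, "rt", newline="") as handle:
            reader = csv.reader(handle, delimiter="\t")
            header = next(reader)
            gene_idx = header.index(gene_column)
            for row in reader:
                # Rows cut off before the gene column are skipped.
                if len(row) > gene_idx:
                    counts[row[gene_idx]] += 1
    except EOFError:
        # A file cut off mid-write lacks the gzip end marker; keep what was read.
        pass
    return counts
```

Calling `events_per_gene(path).most_common(5)` then lists the genes with the most reported events.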

From what I can tell, the other output files seem more complete.

I've attached a screenshot of this file opened in Excel, in case it's helpful.

[Image: Screen Shot 2022-08-15 at 6 54 45 AM]

Any suggestions for new code to try would be very helpful. Thanks in advance!

riasc commented 1 year ago

Hi, were you able to resolve this?

ramikheireddine commented 9 months ago

I am running into the same problem.