ratschlab / spladder

Tool for the detection and quantification of alternative splicing events from RNA-Seq data.

Index error for spladder build --event-types intron_retention #153

Closed by wir963 2 years ago

wir963 commented 2 years ago

Description

Hey Andre,

One more for you. This may be related to #152. I am trying to call intron_retention events. The same calls work for exon_skip, alt3prime, and alt5prime events.

What I Did

spladder build --bams {input.alignments} --annotation {GTF_FILE} --outdir output/{wildcards.cohort} --event-types intron_retention 

Here's the output

Traceback (most recent call last):
  File "/gpfs/gsfs9/users/Robinson-SB/SplAdder/.snakemake/conda/909405aaa48489b42af5752a614b75fe/bin/spladder", line 8, in <module>
    sys.exit(main())
  File "/gpfs/gsfs9/users/Robinson-SB/SplAdder/.snakemake/conda/909405aaa48489b42af5752a614b75fe/lib/python3.10/site-packages/spladder/spladder.py", line 229, in main
    options.func(options)
  File "/gpfs/gsfs9/users/Robinson-SB/SplAdder/.snakemake/conda/909405aaa48489b42af5752a614b75fe/lib/python3.10/site-packages/spladder/spladder_build.py", line 163, in spladder
    analyze_events(event_type, options.bam_fnames, options)
  File "/gpfs/gsfs9/users/Robinson-SB/SplAdder/.snakemake/conda/909405aaa48489b42af5752a614b75fe/lib/python3.10/site-packages/spladder/alt_splice/analyze.py", line 191, in analyze_events
    write_events_txt(fn_out_conf_txt, options.samples[sample_idx], events_all, fn_out_count, event_idx=confirmed_idx)
  File "/gpfs/gsfs9/users/Robinson-SB/SplAdder/.snakemake/conda/909405aaa48489b42af5752a614b75fe/lib/python3.10/site-packages/spladder/alt_splice/write.py", line 90, in write_events_txt
    psi = psi_chunk[:, i - chunk_idx_psi[0]]
IndexError: index 3487 is out of bounds for axis 1 with size 2061

Best, Welles

akahles commented 2 years ago

Hi Welles,

Thanks for reporting. First, a general note: it looks like you are using Snakemake. In this setting, it is important that the output directory for the individual SplAdder tasks stays the same, as some of the information is reused.

From the error message, it looks like the HDF5 file containing the quantification values has different dimensions than expected. Could you share the statistics of that file (via h5ls -r merge_graphs_intron_retention_C3.counts.hdf5)?

Thanks and Cheers,

Andre
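
Editor's note: the failing line in the traceback, psi_chunk[:, i - chunk_idx_psi[0]], subtracts the start of a chunk from a global event index to obtain a column within that chunk. The following is a minimal sketch of that kind of index translation with an explicit bounds check. It is not SplAdder's code: the function name chunk_local_index and the chunk start of 0 are assumptions made for illustration, and only the numbers 3487 and 2061 come from the error message.

```python
def chunk_local_index(global_idx, chunk_start, chunk_width):
    """Translate a global event index into a column of the loaded chunk."""
    local_idx = global_idx - chunk_start
    # Fail early with both numbers in the message if the chunk is too narrow.
    if not 0 <= local_idx < chunk_width:
        raise IndexError(
            f"index {local_idx} is out of bounds for a chunk of width {chunk_width}"
        )
    return local_idx


# Hypothetical values: chunk_start=0 is assumed; 3487 and 2061 echo the traceback.
print(chunk_local_index(100, chunk_start=0, chunk_width=2061))
try:
    chunk_local_index(3487, chunk_start=0, chunk_width=2061)
except IndexError as err:
    print(err)
```

A check like this only makes the mismatch easier to read; it does not explain why the chunk on disk would be narrower than the index list expects.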

wir963 commented 2 years ago

Hey Andre,

Yep, I found out about the importance of keeping the output directories the same via some trial and error...

Here's the output

/                        Group
/conf_idx                Dataset {5392}
/confirmed               Dataset {32969}
/event_counts            Dataset {5, 6, 32969}
/event_features          Dataset {6}
/event_pos               Dataset {32969, 4}
/gene_chr                Dataset {62069}
/gene_idx                Dataset {32969}
/gene_names              Dataset {62069}
/gene_pos                Dataset {62069, 2}
/gene_strand             Dataset {62069}
/iso1                    Dataset {5, 32969}
/iso2                    Dataset {5, 32969}
/num_verified            Dataset {2, 32969}
/psi                     Dataset {5, 32969}
/samples                 Dataset {5}
/strains                 Soft Link {/samples}
/verified                Dataset {5, 2, 32969}

Let me know if you need me to run any more commands.

Best, Welles
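
Editor's note: a listing like the one above can be checked mechanically. The sketch below parses h5ls -r text output into dataset shapes and reports any per-event dataset whose last axis differs from the length of /confirmed. The helper names and the PER_EVENT list are assumptions inferred from the listing in this thread, not something documented by SplAdder.

```python
import re

# Matches lines such as "/psi   Dataset {5, 32969}" from `h5ls -r`.
H5LS_DATASET = re.compile(r"^(/\S*)\s+Dataset\s+\{([^}]*)\}")

# Assumption: these datasets hold one entry per event on their last axis.
PER_EVENT = ("/confirmed", "/event_counts", "/gene_idx", "/iso1", "/iso2",
             "/num_verified", "/psi", "/verified")


def parse_h5ls(text):
    """Return {dataset_path: shape_tuple} parsed from `h5ls -r` output."""
    shapes = {}
    for line in text.splitlines():
        match = H5LS_DATASET.match(line.strip())
        if not match:
            continue  # groups, soft links, blank lines
        dims = []
        for field in match.group(2).split(","):
            current = field.strip().split("/")[0]  # "5/Inf" -> "5"
            if current.isdigit():
                dims.append(int(current))
        shapes[match.group(1)] = tuple(dims)
    return shapes


def event_axis_mismatches(shapes, per_event=PER_EVENT):
    """Return per-event datasets whose last axis differs from /confirmed."""
    n_events = shapes["/confirmed"][-1]
    return {name: shapes[name] for name in per_event
            if name in shapes and shapes[name][-1] != n_events}


# A few lines copied from the listing above.
SAMPLE = """\
/                        Group
/conf_idx                Dataset {5392}
/confirmed               Dataset {32969}
/event_counts            Dataset {5, 6, 32969}
/psi                     Dataset {5, 32969}
/verified                Dataset {5, 2, 32969}
"""

shapes = parse_h5ls(SAMPLE)
mismatches = event_axis_mismatches(shapes)
print(shapes["/psi"], mismatches)
```

For the listing in this thread the per-event datasets all agree on 32969 events, so a check of this kind reports no mismatch.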

akahles commented 2 years ago

Dear Welles,

Thanks for sharing. I am having some difficulty reproducing the issue. Did you use the exact same build command as provided above, or do you use a more elaborate workflow with Snakemake?

Thanks and Cheers,

Andre

wir963 commented 2 years ago

Hey Andre,

Sorry for the delay - I've been out skiing and am just catching up with everything. I re-ran the exact command above and it now completes without error, so I'll close this issue. Sorry for any confusion.

Best, Welles