mortazavilab / lapa

Alternative polyadenylation detection from diverse data sources such as 3'-seq, long-read and short-reads.
https://www.biorxiv.org/content/10.1101/2022.11.08.515683v1

pyrange error "ValueError: all elements of `new_shape` must be non-negative" #20

Open ArthurDondi opened 1 year ago

ArthurDondi commented 1 year ago

Hi Muhammed,

I got the following pyranges error:

Traceback (most recent call last):
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/bin/lapa", line 8, in <module>
    sys.exit(cli_lapa())
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/lapa/main.py", line 112, in cli_lapa
    lapa(alignment, fasta, annotation, chrom_sizes, output_dir,
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/lapa/lapa.py", line 497, in lapa
    _lapa(alignment)
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/lapa/lapa.py", line 288, in __call__
    df_all_count, sample_counts = self.counting(alignment)
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/lapa/lapa.py", line 142, in counting
    df_all_count, sample_counts = counter.to_df()
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/lapa/count.py", line 583, in to_df
    df = pd.concat([
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/lapa/count.py", line 584, in <listcomp>
    self.build_counter(row['path'])
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/lapa/count.py", line 142, in to_df
    return self.to_gr().df.astype({'Chromosome': 'str', 'Strand': 'str'})
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/lapa/count.py", line 136, in to_gr
    return pr.PyRanges(df).count_overlaps(
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/pyranges/pyranges_main.py", line 1385, in count_overlaps
    counts = pyrange_apply(_number_overlapping, self, other, **kwargs)
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/pyranges/multithreaded.py", line 231, in pyrange_apply
    result = call_f(function, nparams, df, odf, kwargs)
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/pyranges/multithreaded.py", line 21, in call_f
    return f.remote(df, odf, **kwargs)
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/pyranges/methods/coverage.py", line 26, in _number_overlapping
    _self_indexes, _other_indexes = oncls.all_overlaps_both(starts, ends, indexes)
  File "ncls/src/ncls.pyx", line 74, in ncls.src.ncls.NCLS64.all_overlaps_both
  File "ncls/src/ncls.pyx", line 115, in ncls.src.ncls.NCLS64.all_overlaps_both
  File "<__array_function__ internals>", line 5, in resize
  File "/cluster/work/bewi/members/dondia/Anaconda3/envs/snakemake/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 1423, in resize
    raise ValueError('all elements of `new_shape` must be non-negative')
ValueError: all elements of `new_shape` must be non-negative

In case it helps, here is one read from the BAM file:

molecule/4051_GGCAATACTCGTGACC_B900_Tum_B900_Tum 16 chr1 14424 12 406M140N69M757N108M1I44M659N159M92N198M177N56M GATTGGTGTGCCGTTTTCTCTGGAAGCCTCTTAAGAACACTGTGGCGCAGGCTGGGTGGAGCCGTCCCCCCATGGAGCACAGGCAGACAGAAGTCCCCGCCCCAGCTGTGTGGCCTCAAGCCAGCCTTCCGCTCCTTGAAGCTGGTCTCCACACAGTGCTGGTTCCGTCACCCCCTCCCAAGGAAGTAGGTCTGAGCAGCTTGTCCTGGCTGTGTCCATGTCAGAGCAACGGCCCAAGTCTGGGTCTGGGGGGGAAGGTGTCATGGAGCCCCCTACGATTCCCAGTCGTCCTCGTCCTCCTCTGCCTGTGGCTGCTGCGGTGGCGGCAGAGGAGGGATGGAGTCTGACACGCGGGCAAAGGCTCCTCCGGGCCCCTCACCAGCCCCAGGTCCTTTCCCAGAGATGCCCTTGCGCCTCATGACCAGCTTGTTGAAGAGATCCGACATCAAGTGCCCACCTTGGCTCGTGGCTCTCACTTGCTCCTGCTCCTTCTGCTGCTTCTTCTCCAGCTTTCGCTCCTTCATGCTGCGCAGCTTGGCCTTGCCGATGCCCCCAGCTTGGCGGATGGACTCTAGCAGAGTGGCCCAGCCACCGGAGGGGTCAACCACTTCCCTGGGAGCTCCCTGGACTGAAGGAGACGCGCTGCTGCTGCTGTCGTCCTGCCTGGCGCCTTGGCCTACAGGGGCCGCGGTTGAGGGTGGGAGTGGGGGTGCACTGGCCAGCACCTCAGGAGCTGGGGGTGGTGGTGGGGGCGGTGGGGGTGGTGTTAGTACCCCATCTTGTAGGTCTTGAGAGGCTCGGCTACCTCAGTGTGGAAGGTGGGCAGTTCTGGAATGGTGCCAGGGGCAGAGGGGGCAATGCCGGGGCCCAGGTCGGCAATGTACATGAGGTCGTTGGCAATGCCGGGCAGGTCAGGCAGGTAGGATGGAACATCAATCTCAGGCACCTGGCCCAGGTCTGGCACATAGAAGTAGTTCTCTGGGACCTGCTGTTCCAGCTGCTCTCTCTTGCTGATGGACAAGGGGGCATCAAACAGCTTCT * NM:i:3 ms:i:1031 AS:i:87nn:i:0 ts:A:+ tp:A:P cm:i:307 s1:i:987 s2:i:975 de:f:0.0029 rl:i:0

Let me know if you need any further details.

Thanks for the help.

defendant602 commented 1 year ago

I got the same error:

Traceback (most recent call last):
  File "/biosoft/bin/lapa", line 8, in <module>
    sys.exit(cli_lapa())
  File "/biosoft/lib/python3.7/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/biosoft/lib/python3.7/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/biosoft/lib/python3.7/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/biosoft/lib/python3.7/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/biosoft/lib/python3.7/site-packages/lapa/main.py", line 122, in cli_lapa
    non_replicates_read_threhold=non_replicates_read_threhold)
  File "/biosoft/lib/python3.7/site-packages/lapa/lapa.py", line 497, in lapa
    _lapa(alignment)
  File "/biosoft/lib/python3.7/site-packages/lapa/lapa.py", line 288, in __call__
    df_all_count, sample_counts = self.counting(alignment)
  File "/biosoft/lib/python3.7/site-packages/lapa/lapa.py", line 142, in counting
    df_all_count, sample_counts = counter.to_df()
  File "/biosoft/lib/python3.7/site-packages/lapa/count.py", line 587, in to_df
    for _, row in self.df_alignment.iterrows()
  File "/biosoft/lib/python3.7/site-packages/lapa/count.py", line 587, in <listcomp>
    for _, row in self.df_alignment.iterrows()
  File "/biosoft/lib/python3.7/site-packages/lapa/count.py", line 142, in to_df
    return self.to_gr().df.astype({'Chromosome': 'str', 'Strand': 'str'})
  File "/biosoft/lib/python3.7/site-packages/lapa/count.py", line 139, in to_gr
    strandedness='same')
  File "/biosoft/lib/python3.7/site-packages/pyranges/pyranges_main.py", line 1385, in count_overlaps
    counts = pyrange_apply(_number_overlapping, self, other, **kwargs)
  File "/biosoft/lib/python3.7/site-packages/pyranges/multithreaded.py", line 231, in pyrange_apply
    result = call_f(function, nparams, df, odf, kwargs)
  File "/biosoft/lib/python3.7/site-packages/pyranges/multithreaded.py", line 21, in call_f
    return f.remote(df, odf, **kwargs)
  File "/biosoft/lib/python3.7/site-packages/pyranges/methods/coverage.py", line 26, in _number_overlapping
    _self_indexes, _other_indexes = oncls.all_overlaps_both(starts, ends, indexes)
  File "ncls/src/ncls.pyx", line 74, in ncls.src.ncls.NCLS64.all_overlaps_both
  File "ncls/src/ncls.pyx", line 115, in ncls.src.ncls.NCLS64.all_overlaps_both
  File "<__array_function__ internals>", line 6, in resize
  File "/biosoft/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 1425, in resize
    raise ValueError('all elements of `new_shape` must be non-negative')
ValueError: all elements of `new_shape` must be non-negative

Command used:

lapa --alignment samples.csv --fasta hg38.fa --annotation utr_added.gtf --chrom_sizes chr_sizes --output_dir lapa_output

$cat samples.csv
sample,dataset,path
T1,T,T1.bam
T2,T,T2.bam
T3,T,T3.bam
T4,T,T4.bam
C1,C,C1.bam
C2,C,C2.bam
C3,C,C3.bam

mnsmar commented 11 months ago

Any update on this? I'm getting the same error.

mnsmar commented 11 months ago

Hello @MuhammedHasan, could you please provide some feedback on this issue? Several different runs fail with this error message.

MuhammedHasan commented 11 months ago

Dear @mnsmar,

Can you please share your package versions (the output of pip freeze)?
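
If a full pip freeze is too noisy, a minimal sketch like the one below prints only the versions most likely to matter here. The package list is an assumption based on the modules named in the traceback (the distribution name "lapa" in particular is assumed), and `installed_versions` is a hypothetical helper, not part of lapa. It needs Python 3.8+ for `importlib.metadata`.

```python
from importlib import metadata  # standard library from Python 3.8 onwards


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed list: packages that appear in the traceback above.
    relevant = ["lapa", "pyranges", "ncls", "numpy", "pandas"]
    for name, version in installed_versions(relevant).items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```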

mnsmar commented 11 months ago

@MuhammedHasan I attach the output of pip freeze > requirements.txt

requirements.txt

mnsmar commented 10 months ago

Hi @MuhammedHasan, any update on this?

baishengjun commented 7 months ago

I'm getting the same error. Any update on this?

leetaiyi commented 7 months ago

It's an overflow error that occurs when trying to find the overlaps between too many regions. I sent a PR with a workaround. I also raised an issue in the PyRanges repo; if it is fixed there, this workaround will no longer be needed.
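
To illustrate the idea, here is a minimal, standard-library-only sketch. It is not the code from the PR and does not call pyranges or ncls. It assumes the negative `new_shape` comes from a signed 32-bit length wrapping around when a result buffer is grown, and it shows the general shape of a chunking workaround: count overlaps for a bounded batch of query intervals at a time so that no single call has to hold all overlap pairs. The helper names (`as_int32`, `count_overlaps_chunked`) are hypothetical.

```python
import ctypes


def as_int32(n):
    """Reinterpret an integer as a signed 32-bit value, wrapping on overflow."""
    return ctypes.c_int32(n).value


def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) overlap."""
    return a[0] < b[1] and b[0] < a[1]


def count_overlaps_chunked(queries, targets, chunk_size):
    """Count overlapping targets per query, one chunk of queries at a time.

    Returns the per-query counts and the largest number of overlap pairs
    that had to be held in memory for any single chunk.
    """
    counts = []
    max_pairs = 0
    for i in range(0, len(queries), chunk_size):
        chunk = queries[i:i + chunk_size]
        # Brute-force pair list for this chunk only (illustration, not fast).
        pairs = [
            (qi, ti)
            for qi, q in enumerate(chunk)
            for ti, t in enumerate(targets)
            if overlaps(q, t)
        ]
        max_pairs = max(max_pairs, len(pairs))
        chunk_counts = [0] * len(chunk)
        for qi, _ in pairs:
            chunk_counts[qi] += 1
        counts.extend(chunk_counts)
    return counts, max_pairs


# Doubling a 32-bit length of 2**30 wraps to a negative number.
wrapped = as_int32(2 * 2**30)

targets = [(0, 10), (5, 15), (20, 30)]
queries = [(0, 6), (9, 21), (30, 40)]
counts_small, peak_small = count_overlaps_chunked(queries, targets, chunk_size=1)
counts_all, peak_all = count_overlaps_chunked(queries, targets, chunk_size=3)
```

The per-query counts are the same for any chunk size; only the peak number of pairs held at once changes, which is what keeps each call below the overflow threshold.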

MustafaElshani commented 4 months ago

Hi @leetaiyi, what exactly was the workaround? I've tried almost every pyranges version to no avail.

Mustafa

leetaiyi commented 4 months ago

Check pull request #23, which I submitted.

MustafaElshani commented 4 months ago

Works perfectly, thank you.