JetBrains-Research / span

SPAN Semi-supervised Peak Analyzer
https://doi.org/10.1093/bioinformatics/btab376
MIT License

Wrong correlation between pvalues #34

Closed (olegs closed 2 years ago)

olegs commented 2 years ago
[Dec 14, 2021 03:06:35] SPAN 0.14.build built on December 09, 2021
[Dec 14, 2021 03:06:35] COMMAND: analyze --islands -t bams/GSM646452_K562_Input_rep1.bam --chrom.sizes hg19.chrom.sizes --peaks span-islands/GSM646452_K562_Input_rep1_300_1e-10_5.islands --model span-islands/fit/GSM646452_K562_Input_rep1_300.span --workdir span-islands --noclip --threads 4 --bin 300 --fdr 1e-10 --gap 5
[Dec 14, 2021 03:06:36] LOG: /mnt/stripe/shpynov/2020_GSE26320/span-islands/logs/GSM646452_K562_Input_rep1_300_1e-10_5.log
[Dec 14, 2021 03:06:36] MODEL: /mnt/stripe/shpynov/2020_GSE26320/span-islands/fit/GSM646452_K562_Input_rep1_300.span
[Dec 14, 2021 03:06:36] Loading model: /mnt/stripe/shpynov/2020_GSE26320/span-islands/fit/GSM646452_K562_Input_rep1_300.span
[Dec 14, 2021 03:06:42] Completed loading model: /mnt/stripe/shpynov/2020_GSE26320/span-islands/fit/GSM646452_K562_Input_rep1_300.span
[Dec 14, 2021 03:06:42] WORKING DIR: /mnt/stripe/shpynov/2020_GSE26320/span-islands
[Dec 14, 2021 03:06:42] TREATMENT: /mnt/stripe/shpynov/2020_GSE26320/bams/GSM646452_K562_Input_rep1.bam
[Dec 14, 2021 03:06:42] CONTROL: none
[Dec 14, 2021 03:06:42] CHROM.SIZES: /mnt/stripe/shpynov/2020_GSE26320/hg19.chrom.sizes
[Dec 14, 2021 03:06:42] BIN: 300
[Dec 14, 2021 03:06:42] FRAGMENT: auto
[Dec 14, 2021 03:06:42] KEEP DUPLICATES: false
[Dec 14, 2021 03:06:42] FDR: 1.0E-10
[Dec 14, 2021 03:06:42] GAP: 5
[Dec 14, 2021 03:06:42] PEAKS: /mnt/stripe/shpynov/2020_GSE26320/span-islands/GSM646452_K562_Input_rep1_300_1e-10_5.islands
[Dec 14, 2021 03:06:42] THREADS: 4
[Dec 14, 2021 03:06:43] Computing islands: 1% (1/89), Elapsed time: 850 ms
[Dec 14, 2021 03:06:54] Computing islands: 10% (9/89), Elapsed time: 11 s, Throughput: 1 s/item, ETA: 1 min 45 s
com.google.common.util.concurrent.UncheckedExecutionException: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalStateException: Wrong correlation between pvalues
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:593)
        at java.util.concurrent.ForkJoinTask.get(ForkJoinTask.java:1005)
        at org.jetbrains.bio.util.ExecutorExtensionsKt.awaitAll(ExecutorExtensions.kt:38)
        at org.jetbrains.bio.util.ExecutorExtensionsKt.await(ExecutorExtensions.kt:49)
        at org.jetbrains.bio.genome.containers.GenomeMap.<init>(GenomeMaps.kt:56)
        at org.jetbrains.bio.genome.containers.GenomeMapsKt.genomeMap(GenomeMaps.kt:21)
        at org.jetbrains.bio.span.IslandsKt.getIslands(Islands.kt:48)
        at org.jetbrains.bio.span.IslandsKt.getIslands$default(Islands.kt:38)
        at org.jetbrains.bio.span.SpanCLAAnalyze$analyze$1$1.invoke(SpanCLAAnalyze.kt:164)
        at org.jetbrains.bio.span.SpanCLAAnalyze$analyze$1$1.invoke(SpanCLAAnalyze.kt:83)
        at org.jetbrains.bio.util.OptionParserExtensionsKt.parse(OptionParserExtensions.kt:73)
        at org.jetbrains.bio.util.OptionParserExtensionsKt.parse$default(OptionParserExtensions.kt:24)
        at org.jetbrains.bio.span.SpanCLAAnalyze.analyze$span(SpanCLAAnalyze.kt:83)
        at org.jetbrains.bio.span.SpanCLA.main(SpanCLA.kt:79)
Caused by: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalStateException: Wrong correlation between pvalues
        at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2051)
        at com.google.common.cache.LocalCache.get(LocalCache.java:3962)
../2020_GSE26320/logs/span-islands/GSM646452_K562_Input_rep1_300_1e-10_5.log
olegs commented 2 years ago

The problem is a NaN arising during the Pearson correlation computation for small vectors, e.g. [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 4.497594881153837E-287, 1.1353207610468687E-67, 0.0, 0.0, 1.9951734631124216E-285, 2.318155812884084E-170, 0.0, 0.0, 7.193450315545617E-248, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.2146466920885486E-164, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] and [0.0, 0.0, 1.9951734631124216E-285, 2.318155812884084E-170, 0.0, 0.0, 7.193450315545617E-248, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.2146466920885486E-164, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0].
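
A minimal sketch of how such vectors can yield an undefined correlation, written in Python rather than the project's Kotlin and not taken from the actual SPAN or bioinf-commons code. It assumes a textbook Pearson formula: squaring deviations as small as 1e-164 underflows to 0.0 in double precision, so the variance of one vector becomes exactly zero and the ratio is undefined. The names `pearson` and `safe_pearson` and the 0.0 fallback are hypothetical, and the inputs are eight-element slices of the vectors above.

```python
import math


def pearson(xs, ys):
    """Textbook Pearson correlation; returns NaN when it is undefined."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    # Squared deviations of tiny values underflow silently to 0.0.
    sxx = sum((x - mx) * (x - mx) for x in xs)
    syy = sum((y - my) * (y - my) for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    denom = math.sqrt(sxx) * math.sqrt(syy)
    if denom == 0.0:
        # Zero variance in at least one vector: correlation is undefined.
        return math.nan
    return sxy / denom


def safe_pearson(xs, ys, fallback=0.0):
    """Hypothetical guard: replace a non-finite correlation with a fallback."""
    r = pearson(xs, ys)
    return r if math.isfinite(r) else fallback


# Eight-element slices of the p-value vectors reported in this issue.
xs = [0.0, 0.0, 4.497594881153837e-287, 1.1353207610468687e-67,
      0.0, 0.0, 1.9951734631124216e-285, 2.318155812884084e-170]
ys = [0.0, 0.0, 1.9951734631124216e-285, 2.318155812884084e-170,
      0.0, 0.0, 7.193450315545617e-248, 0.0]

print(pearson(xs, ys))       # undefined correlation
print(safe_pearson(xs, ys))  # guarded value
```

Whether a fallback of 0.0 is appropriate depends on how the caller uses the correlation. Working with log-transformed p-values would be another way to avoid the underflow.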

olegs commented 2 years ago

Fixed in https://github.com/JetBrains-Research/bioinf-commons/commit/e21585c2ecc559eb93a0afb776ac2edf9aeddf15