opencb / opencga

An Open Computational Genomics Analysis platform for big data genomics analysis. OpenCGA is maintained and developed by its parent company Zetta Genomics. Please contact support@zettagenomics.com for bug reports and feature requests.
Apache License 2.0

Unable to calculate statistics #1450

Open Mohammedhusen opened 4 years ago

Mohammedhusen commented 4 years ago

Hi Team,

I am getting the error below while running stats. Could you please help?

2019-12-16 15:28:43 [main] INFO  MongoDBVariantStatisticsManager:79 - ReaderQueryOptions: {"exclude":["ANNOTATION","STUDIES_STATS"],"sort":true}
2019-12-16 15:28:48 [main] INFO  DefaultVariantStatisticsManager:270 - will write stats to /mongodb/mongodbweb1/temporal_statistics/stats_qgp_cohort_all.variants.stats.json.gz
2019-12-16 15:28:48 [main] INFO  MongoDBVariantStatisticsManager:118 - Starting stats creation for cohorts [qgp_cohort_all]
2019-12-16 15:28:48 [pool-2-thread-2] ERROR ParallelTaskRunner:629 - Error processing batch 1
java.lang.NullPointerException
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatsCalculator.calculateStats(MongoDBVariantStatsCalculator.java:83)
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatsCalculator.apply(MongoDBVariantStatsCalculator.java:70)
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatsCalculator.apply(MongoDBVariantStatsCalculator.java:57)
    at org.opencb.commons.run.Task$1.apply(Task.java:48)
    at org.opencb.commons.run.Task$1.apply(Task.java:48)
    at org.opencb.commons.run.ParallelTaskRunner$TaskRunnable.call(ParallelTaskRunner.java:627)
    at org.opencb.commons.run.ParallelTaskRunner$TaskRunnable.call(ParallelTaskRunner.java:594)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
2019-12-16 15:28:48 [pool-2-thread-1] ERROR ParallelTaskRunner:629 - Error processing batch 0
java.lang.NullPointerException
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatsCalculator.calculateStats(MongoDBVariantStatsCalculator.java:83)
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatsCalculator.apply(MongoDBVariantStatsCalculator.java:70)
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatsCalculator.apply(MongoDBVariantStatsCalculator.java:57)
    at org.opencb.commons.run.Task$1.apply(Task.java:48)
    at org.opencb.commons.run.Task$1.apply(Task.java:48)
    at org.opencb.commons.run.ParallelTaskRunner$TaskRunnable.call(ParallelTaskRunner.java:627)
    at org.opencb.commons.run.ParallelTaskRunner$TaskRunnable.call(ParallelTaskRunner.java:594)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
2019-12-16 15:28:48 [pool-2-thread-2] WARN  ParallelTaskRunner:641 - Abort task thread on fail
2019-12-16 15:28:48 [pool-2-thread-1] WARN  ParallelTaskRunner:641 - Abort task thread on fail
2019-12-16 15:28:48 [main] WARN  ParallelTaskRunner:538 - Abort read thread on fail
2019-12-16 15:28:48 [pool-2-thread-3] ERROR ParallelTaskRunner:629 - Error processing batch 2
java.lang.NullPointerException
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatsCalculator.calculateStats(MongoDBVariantStatsCalculator.java:83)
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatsCalculator.apply(MongoDBVariantStatsCalculator.java:70)
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatsCalculator.apply(MongoDBVariantStatsCalculator.java:57)
    at org.opencb.commons.run.Task$1.apply(Task.java:48)
    at org.opencb.commons.run.Task$1.apply(Task.java:48)
    at org.opencb.commons.run.ParallelTaskRunner$TaskRunnable.call(ParallelTaskRunner.java:627)
    at org.opencb.commons.run.ParallelTaskRunner$TaskRunnable.call(ParallelTaskRunner.java:594)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
2019-12-16 15:28:48 [pool-2-thread-3] WARN  ParallelTaskRunner:641 - Abort task thread on fail
2019-12-16 15:28:48 [main] INFO  ParallelTaskRunner:418 - read:  timeReading                  = 0.151s
2019-12-16 15:28:48 [main] INFO  ParallelTaskRunner:419 - read:  timeBlockedAtPutRead         = 0.000s
2019-12-16 15:28:48 [main] INFO  ParallelTaskRunner:420 - task;  timeBlockedAtTakeRead        = 0.702s(total)   ~0.117s/thread
2019-12-16 15:28:48 [main] INFO  ParallelTaskRunner:424 - task;  timeTaskApply                = 0.037s(total)   ~0.006s/thread
2019-12-16 15:28:48 [main] INFO  ParallelTaskRunner:428 - task;  timeBlockedAtPutWrite        = 0.000s(total)   ~0.000s/thread
2019-12-16 15:28:48 [main] INFO  ParallelTaskRunner:430 - write: timeBlockedWatingDataToWrite = 0.152s
2019-12-16 15:28:48 [main] INFO  ParallelTaskRunner:431 - write: timeWriting                  = 0.009s
2019-12-16 15:28:48 [main] INFO  ParallelTaskRunner:434 - total:                              = 0.182s
2019-12-16 15:28:48 [main] ERROR VariantStatsStorageOperation:154 - Error executing stats. Set cohorts status to INVALID
org.opencb.opencga.storage.core.exceptions.StorageEngineException: Unable to calculate statistics.
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatisticsManager.createStats(MongoDBVariantStatisticsManager.java:123)
    at org.opencb.opencga.storage.core.variant.stats.DefaultVariantStatisticsManager.createStats(DefaultVariantStatisticsManager.java:156)
    at org.opencb.opencga.storage.core.variant.stats.DefaultVariantStatisticsManager.calculateStatistics(DefaultVariantStatisticsManager.java:104)
    at org.opencb.opencga.storage.core.variant.VariantStorageEngine.calculateStats(VariantStorageEngine.java:511)
    at org.opencb.opencga.storage.core.manager.variant.operations.VariantStatsStorageOperation.calculateStats(VariantStatsStorageOperation.java:123)
    at org.opencb.opencga.storage.core.manager.variant.VariantStorageManager.stats(VariantStorageManager.java:313)
    at org.opencb.opencga.app.cli.analysis.executors.VariantCommandExecutor.stats(VariantCommandExecutor.java:368)
    at org.opencb.opencga.app.cli.analysis.executors.VariantCommandExecutor.execute(VariantCommandExecutor.java:125)
    at org.opencb.opencga.app.cli.analysis.AnalysisMain.privateMain(AnalysisMain.java:102)
    at org.opencb.opencga.app.cli.analysis.AnalysisMain.main(AnalysisMain.java:35)
Caused by: java.util.concurrent.ExecutionException: Error while running ParallelTaskRunner. Found 3 exceptions.
    at org.opencb.commons.run.ParallelTaskRunner.run(ParallelTaskRunner.java:438)
    at org.opencb.commons.run.ParallelTaskRunner.run(ParallelTaskRunner.java:297)
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatisticsManager.createStats(MongoDBVariantStatisticsManager.java:120)
    ... 9 more
Caused by: java.lang.NullPointerException
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatsCalculator.calculateStats(MongoDBVariantStatsCalculator.java:83)
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatsCalculator.apply(MongoDBVariantStatsCalculator.java:70)
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatsCalculator.apply(MongoDBVariantStatsCalculator.java:57)
    at org.opencb.commons.run.Task$1.apply(Task.java:48)
    at org.opencb.commons.run.Task$1.apply(Task.java:48)
    at org.opencb.commons.run.ParallelTaskRunner$TaskRunnable.call(ParallelTaskRunner.java:627)
    at org.opencb.commons.run.ParallelTaskRunner$TaskRunnable.call(ParallelTaskRunner.java:594)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
org.opencb.opencga.storage.core.exceptions.StorageEngineException: Error calculating statistics.
    at org.opencb.opencga.storage.core.manager.variant.operations.VariantStatsStorageOperation.calculateStats(VariantStatsStorageOperation.java:158)
    at org.opencb.opencga.storage.core.manager.variant.VariantStorageManager.stats(VariantStorageManager.java:313)
    at org.opencb.opencga.app.cli.analysis.executors.VariantCommandExecutor.stats(VariantCommandExecutor.java:368)
    at org.opencb.opencga.app.cli.analysis.executors.VariantCommandExecutor.execute(VariantCommandExecutor.java:125)
    at org.opencb.opencga.app.cli.analysis.AnalysisMain.privateMain(AnalysisMain.java:102)
    at org.opencb.opencga.app.cli.analysis.AnalysisMain.main(AnalysisMain.java:35)
Caused by: org.opencb.opencga.storage.core.exceptions.StorageEngineException: Unable to calculate statistics.
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatisticsManager.createStats(MongoDBVariantStatisticsManager.java:123)
    at org.opencb.opencga.storage.core.variant.stats.DefaultVariantStatisticsManager.createStats(DefaultVariantStatisticsManager.java:156)
    at org.opencb.opencga.storage.core.variant.stats.DefaultVariantStatisticsManager.calculateStatistics(DefaultVariantStatisticsManager.java:104)
    at org.opencb.opencga.storage.core.variant.VariantStorageEngine.calculateStats(VariantStorageEngine.java:511)
    at org.opencb.opencga.storage.core.manager.variant.operations.VariantStatsStorageOperation.calculateStats(VariantStatsStorageOperation.java:123)
    ... 5 more
Caused by: java.util.concurrent.ExecutionException: Error while running ParallelTaskRunner. Found 3 exceptions.
    at org.opencb.commons.run.ParallelTaskRunner.run(ParallelTaskRunner.java:438)
    at org.opencb.commons.run.ParallelTaskRunner.run(ParallelTaskRunner.java:297)
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatisticsManager.createStats(MongoDBVariantStatisticsManager.java:120)
    ... 9 more
Caused by: java.lang.NullPointerException
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatsCalculator.calculateStats(MongoDBVariantStatsCalculator.java:83)
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatsCalculator.apply(MongoDBVariantStatsCalculator.java:70)
    at org.opencb.opencga.storage.mongodb.variant.stats.MongoDBVariantStatsCalculator.apply(MongoDBVariantStatsCalculator.java:57)
    at org.opencb.commons.run.Task$1.apply(Task.java:48)
    at org.opencb.commons.run.Task$1.apply(Task.java:48)
    at org.opencb.commons.run.ParallelTaskRunner$TaskRunnable.call(ParallelTaskRunner.java:627)
    at org.opencb.commons.run.ParallelTaskRunner$TaskRunnable.call(ParallelTaskRunner.java:594)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Mohammedhusen commented 4 years ago

Hi Team,

Could you please provide an update on this? We would really appreciate your help.

j-coll commented 4 years ago

Hi @Mohammedhusen, thanks for the report. I'll take a look.

Could you please confirm the OpenCGA version by running opencga.sh --version?

Mohammedhusen commented 4 years ago

Version 1.4.1 Git version: master dad91554c874e14023d7038644d1257d8d5c04f4

j-coll commented 4 years ago

Hi @Mohammedhusen,

Which params did you use to index the VCF files? Can you share the command line used for it? How many files and samples did you load in the study?

Mohammedhusen commented 4 years ago

Hi j-coll,

Please find my comments below.

Which params did you use to index the VCF files?
./opencga.sh variant index --file filename.gz --transform -o Output -s studyname

Can you share the command line used for it?
./opencga.sh variant index --file filename.gz --load -o Output -s studyname

How many files and samples did you load in the study?
6K+

j-coll commented 4 years ago

Do any of your files lack the GT field for the samples?
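
One rough way to check this is to scan the FORMAT column of each record. The sketch below is only an illustration: the helper names `open_vcf` and `records_without_gt` are hypothetical and not part of OpenCGA, and `filename.gz` is the placeholder file name used in this thread. Whether a missing GT field is actually what triggers the NullPointerException is not established here.

```python
import gzip
from typing import Iterator, Tuple


def open_vcf(path: str):
    """Open a plain-text or gzip-compressed VCF for reading as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def records_without_gt(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (CHROM, POS) for every data line whose FORMAT column lacks GT.

    Hypothetical helper for illustration; it only inspects the FORMAT
    column (9th field) and does not validate the per-sample values.
    """
    with open_vcf(path) as handle:
        for line in handle:
            # Skip blank lines, meta-information lines and the header line.
            if not line.strip() or line.startswith("#"):
                continue
            fields = line.rstrip("\n").split("\t")
            # Fixed VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
            format_keys = fields[8].split(":") if len(fields) > 8 else []
            if "GT" not in format_keys:
                yield fields[0], fields[1]


if __name__ == "__main__":
    import sys

    # Usage: python check_gt.py filename.gz
    for chrom, pos in records_without_gt(sys.argv[1]):
        print(f"{chrom}:{pos} has no GT in FORMAT")
```

If the script prints nothing, every record declares GT in its FORMAT column, which would point away from missing genotypes as the cause.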

Mohammedhusen commented 4 years ago

It is a multi-sample VCF file containing 6K samples, and it has the GT field for all of them.