Closed GACGAMA closed 11 months ago
Hi! I'm trying to run a multisample (trio) VCF from WGS in genomizer. The files are aligned to the hg38 reference from Cavatica (alias to broad-references/Homo_sapiens_assembly38.fasta in kfdrc-harmonization/kf_reference) - https://cavatica.sbgenomics.com/u/kfdrc-harmonization/kf-references/files/60639014357c3a53540ca7a3/
When trying to run genomizer, I get the following error:
2023-10-17 10:50:58.838 INFO 1454718 --- [ main] org.monarchinitiative.exomiser.cli.Main : Starting Main using Java 19.0.1 on login03 with PID 1454718 (/scratch4/nsobrei2/programs/genomizer/exomiser-cli-13.2.0/exomiser-cli-13.2.0.jar started by ggama1 in /scratch4/nsobrei2/programs/genomizer/exomiser-cli-13.2.0)
2023-10-17 10:50:58.841 INFO 1454718 --- [ main] org.monarchinitiative.exomiser.cli.Main : No active profile set, falling back to 1 default profile: "default"
2023-10-17 10:51:01.622 INFO 1454718 --- [ main] o.m.exomiser.cli.config.MainConfig : Exomiser home: /scratch4/nsobrei2/programs/genomizer/exomiser-cli-13.2.0
2023-10-17 10:51:01.629 INFO 1454718 --- [ main] o.m.exomiser.cli.config.MainConfig : Root data source directory set to: /scratch4/nsobrei2/programs/genomizer/exomiser-cli-13.2.0/data
2023-10-17 10:51:01.710 INFO 1454718 --- [ main] o.m.e.c.g.j.JannovarDataProtoSerialiser : Deserialising Jannovar data from /scratch4/nsobrei2/programs/genomizer/exomiser-cli-13.2.0/data/2302_hg19/2302_hg19_transcripts_ensembl.ser
2023-10-17 10:51:06.030 INFO 1454718 --- [ main] o.m.e.c.g.j.JannovarDataProtoSerialiser : Deserialisation took 4.32 sec.
2023-10-17 10:51:10.517 INFO 1454718 --- [ main] o.m.e.c.g.dao.VariantWhiteListLoader : Loading variant whitelist from: /scratch4/nsobrei2/programs/genomizer/exomiser-cli-13.2.0/data/2302_hg19/2302_hg19_clinvar_whitelist.tsv.gz
2023-10-17 10:51:11.630 INFO 1454718 --- [ main] o.m.e.c.g.dao.VariantWhiteListLoader : Loaded 180928 variants into whitelist
2023-10-17 10:51:11.832 INFO 1454718 --- [ main] o.m.e.a.genome.GenomeDataSourceLoader : Opening CADD snv data from source: /scratch4/nsobrei2/references/CADD_v1-6_HG19/whole_genome_SNVs.tsv.gz
2023-10-17 10:51:12.126 INFO 1454718 --- [ main] o.m.e.a.genome.GenomeDataSourceLoader : Opening CADD InDel data from source: /scratch4/nsobrei2/references/CADD_v1-6_HG19/InDels.tsv.gz
2023-10-17 10:51:12.415 INFO 1454718 --- [ main] o.m.e.a.genome.GenomeDataSourceLoader : Opening REMM data from source: /scratch4/nsobrei2/references/REMM/v0-4/ReMM.v0.4.hg19.tsv.gz
2023-10-17 10:51:15.916 INFO 1454718 --- [ main] o.m.e.c.g.j.JannovarDataProtoSerialiser : Deserialising Jannovar data from /scratch4/nsobrei2/programs/genomizer/exomiser-cli-13.2.0/data/2302_hg38/2302_hg38_transcripts_ensembl.ser
2023-10-17 10:51:17.222 INFO 1454718 --- [ main] o.m.e.c.g.j.JannovarDataProtoSerialiser : Deserialisation took 1.305 sec.
2023-10-17 10:51:17.830 INFO 1454718 --- [ main] o.m.e.c.g.dao.VariantWhiteListLoader : Loading variant whitelist from: /scratch4/nsobrei2/programs/genomizer/exomiser-cli-13.2.0/data/2302_hg38/2302_hg38_clinvar_whitelist.tsv.gz
2023-10-17 10:51:19.714 INFO 1454718 --- [ main] o.m.e.c.g.dao.VariantWhiteListLoader : Loaded 180991 variants into whitelist
2023-10-17 10:51:19.814 INFO 1454718 --- [ main] o.m.e.a.genome.GenomeDataSourceLoader : Opening CADD snv data from source: /scratch4/nsobrei2/references/CADD_vep_110/version_1_6/whole_genome_SNVs.tsv.gz
2023-10-17 10:51:19.911 INFO 1454718 --- [ main] o.m.e.a.genome.GenomeDataSourceLoader : Opening CADD InDel data from source: /scratch4/nsobrei2/references/CADD_vep_110/version_1_6/gnomad.genomes.r3.0.indel.tsv.gz
2023-10-17 10:51:19.945 INFO 1454718 --- [ main] o.m.e.a.genome.GenomeDataSourceLoader : Opening REMM data from source: /scratch4/nsobrei2/references/REMM/v0-4/ReMM.v0.4.hg38.tsv.gz
2023-10-17 10:51:20.524 INFO 1454718 --- [ main] g.GenomeAnalysisServiceAutoConfiguration : Configured hg19 genome analysis service
2023-10-17 10:51:20.524 INFO 1454718 --- [ main] g.GenomeAnalysisServiceAutoConfiguration : Configured hg38 genome analysis service
2023-10-17 10:51:25.309 INFO 1454718 --- [ main] o.m.exomiser.cli.config.MainConfig : Default results directory set to: /scratch4/nsobrei2/programs/genomizer/exomiser-cli-13.2.0/results
2023-10-17 10:51:25.321 INFO 1454718 --- [ main] o.m.e.a.ExomiserConfigReporter : exomiser.data-directory: /scratch4/nsobrei2/programs/genomizer/exomiser-cli-13.2.0/data
2023-10-17 10:51:25.321 INFO 1454718 --- [ main] o.m.e.a.ExomiserConfigReporter : exomiser.hg19.data-version: 2302
2023-10-17 10:51:25.321 INFO 1454718 --- [ main] o.m.e.a.ExomiserConfigReporter : exomiser.hg38.data-version: 2302
2023-10-17 10:51:25.321 INFO 1454718 --- [ main] o.m.e.a.ExomiserConfigReporter : exomiser.phenotype.data-version: 2302
2023-10-17 10:51:25.819 INFO 1454718 --- [ main] org.monarchinitiative.exomiser.cli.Main : Started Main in 28.307 seconds (JVM running for 29.296)
2023-10-17 10:51:27.913 INFO 1454718 --- [ main] o.m.e.cli.ExomiserCommandLineRunner : Exomiser running...
2023-10-17 10:51:27.929 INFO 1454718 --- [ main] o.m.exomiser.core.Exomiser : Running analysis using hg38 assembly with mode: PASS_ONLY
2023-10-17 10:51:27.931 INFO 1454718 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : Validating sample input data
2023-10-17 10:51:28.110 INFO 1454718 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : Running analysis for proband BH6074_1 (sample 1 in VCF) from samples: [BH6074_1, BH6074_2, BH6074_3]. Using coordinates for genome assembly hg38.
2023-10-17 10:51:28.711 INFO 1454718 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : Filtering variants with:
2023-10-17 10:51:28.711 INFO 1454718 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : FailedVariantFilter{}
2023-10-17 10:51:28.711 INFO 1454718 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : VariantEffectFilter{offTargetVariantTypes=[CODING_TRANSCRIPT_INTRON_VARIANT, FIVE_PRIME_UTR_EXON_VARIANT, THREE_PRIME_UTR_EXON_VARIANT, FIVE_PRIME_UTR_INTRON_VARIANT, THREE_PRIME_UTR_INTRON_VARIANT, NON_CODING_TRANSCRIPT_EXON_VARIANT, NON_CODING_TRANSCRIPT_INTRON_VARIANT, UPSTREAM_GENE_VARIANT, DOWNSTREAM_GENE_VARIANT, INTERGENIC_VARIANT, REGULATORY_REGION_VARIANT]}
2023-10-17 10:51:28.712 INFO 1454718 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : FrequencyFilter{maxFreq=2.0}
2023-10-17 10:51:28.712 INFO 1454718 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : Wrapping FrequencyFilter{maxFreq=2.0} with VariantDataProvider for sources [THOUSAND_GENOMES, TOPMED, UK10K, ESP_AFRICAN_AMERICAN, ESP_EUROPEAN_AMERICAN, ESP_ALL, EXAC_AFRICAN_INC_AFRICAN_AMERICAN, EXAC_AMERICAN, EXAC_EAST_ASIAN, EXAC_FINNISH, EXAC_NON_FINNISH_EUROPEAN, EXAC_OTHER, EXAC_SOUTH_ASIAN, GNOMAD_E_AFR, GNOMAD_E_AMR, GNOMAD_E_EAS, GNOMAD_E_FIN, GNOMAD_E_NFE, GNOMAD_E_OTH, GNOMAD_E_SAS, GNOMAD_G_AFR, GNOMAD_G_AMR, GNOMAD_G_EAS, GNOMAD_G_FIN, GNOMAD_G_NFE, GNOMAD_G_OTH, GNOMAD_G_SAS]
2023-10-17 10:51:28.713 INFO 1454718 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : PathogenicityFilter{keepNonPathogenic=true}
2023-10-17 10:51:28.713 INFO 1454718 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : Wrapping PathogenicityFilter{keepNonPathogenic=true} with VariantDataProvider for sources [REVEL, MVP]
2023-10-17 10:51:28.714 INFO 1454718 --- [ main] o.m.e.core.genome.VariantFactoryImpl : Annotating variant records, trimming sequences and normalising positions...
2023-10-17 10:52:10.126 INFO 1454718 --- [ main] o.m.e.core.genome.VariantFactoryImpl : Processed 1155618 variant records into 71736 single allele variants (including 0 structural variants)
2023-10-17 10:52:10.127 INFO 1454718 --- [ main] o.m.e.core.genome.VariantFactoryImpl : Variant annotation finished in 0m 41s 412ms (41412 ms)
2023-10-17 10:52:10.129 INFO 1454718 --- [ main] ConditionEvaluationReportLoggingListener : Error starting ApplicationContext. To display the conditions report re-run your application with 'debug' enabled.
2023-10-17 10:52:10.214 ERROR 1454718 --- [ main] o.s.boot.SpringApplication : Application run failed
java.lang.IllegalStateException: Failed to execute CommandLineRunner
    at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:771)
    at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:752)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:314)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1303)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1292)
    at org.monarchinitiative.exomiser.cli.Main.main(Main.java:53)
Caused by: org.monarchinitiative.svart.CoordinatesOutOfBoundsException: Fully-closed coordinates 1:249064688-249064690 out of contig bounds [1,248956422]
    at org.monarchinitiative.svart.Coordinates.validateCoordinates(Coordinates.java:197)
    at org.monarchinitiative.svart.BaseGenomicRegion.<init>(BaseGenomicRegion.java:23)
    at org.monarchinitiative.svart.BaseVariant.<init>(BaseVariant.java:20)
    at org.monarchinitiative.svart.impl.DefaultVariant.<init>(DefaultVariant.java:8)
    at org.monarchinitiative.svart.impl.DefaultVariant.of(DefaultVariant.java:23)
    at org.monarchinitiative.svart.Variant.of(Variant.java:85)
    at org.monarchinitiative.svart.util.VcfConverter.convert(VcfConverter.java:60)
    at org.monarchinitiative.exomiser.core.genome.VariantContextConverter.convertToVariant(VariantContextConverter.java:106)
    at org.monarchinitiative.exomiser.core.genome.VariantFactoryImpl.buildVariantEvaluations(VariantFactoryImpl.java:161)
    at org.monarchinitiative.exomiser.core.genome.VariantFactoryImpl.lambda$buildAlleleVariantEvaluations$1(VariantFactoryImpl.java:110)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
    at java.base/java.util.ArrayList$SubList$2.forEachRemaining(ArrayList.java:1481)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
    at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
    at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:276)
    at java.base/java.util.stream.ReferencePipeline$15$1.accept(ReferencePipeline.java:541)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1921)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
    at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
    at org.monarchinitiative.exomiser.core.analysis.AbstractAnalysisRunner.loadAndFilterVariants(AbstractAnalysisRunner.java:208)
    at org.monarchinitiative.exomiser.core.analysis.AbstractAnalysisRunner.run(AbstractAnalysisRunner.java:113)
    at org.monarchinitiative.exomiser.core.Exomiser.run(Exomiser.java:83)
    at org.monarchinitiative.exomiser.core.Exomiser.run(Exomiser.java:69)
    at org.monarchinitiative.exomiser.cli.ExomiserCommandLineRunner.runJob(ExomiserCommandLineRunner.java:79)
    at org.monarchinitiative.exomiser.cli.ExomiserCommandLineRunner.runJobs(ExomiserCommandLineRunner.java:62)
    at org.monarchinitiative.exomiser.cli.ExomiserCommandLineRunner.run(ExomiserCommandLineRunner.java:57)
    at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:768)
    ... 5 common frames omitted
I'm running genomizer on an HPC with Oracle Java 19. I'm using version 2302 of the Exomiser hg38 data.
I was loading mixed VCF files (hg19 + hg38) in a single multisample VCF file.
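For anyone hitting the same CoordinatesOutOfBoundsException, here is a minimal sketch of a pre-flight check that flags VCF records whose REF allele runs past the end of an hg38 contig. The function names and the `HG38_CONTIG_LENGTHS` table are hypothetical, not part of Exomiser. The only length filled in is chromosome 1, taken from the `[1,248956422]` bounds in the error message. I'm assuming a full table would be loaded from the reference's .fai index.

```python
import gzip
from typing import Dict, Iterable, Iterator, Tuple

# hg38 length of chromosome 1, copied from "contig bounds [1,248956422]" in the error.
# Assumption: a real check would fill this table for every contig from the .fai index.
HG38_CONTIG_LENGTHS: Dict[str, int] = {"1": 248_956_422}


def normalise_contig(name: str) -> str:
    """Strip a leading 'chr' so that '1' and 'chr1' are treated as the same contig."""
    return name[3:] if name.lower().startswith("chr") else name


def open_vcf(path: str):
    """Open a plain or gzip/bgzip-compressed VCF as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def out_of_bounds_records(
    lines: Iterable[str], contig_lengths: Dict[str, int]
) -> Iterator[Tuple[str, int, int]]:
    """Yield (contig, start, end) for records that extend past the contig end.

    Coordinates are 1-based and fully closed; end = POS + len(REF) - 1.
    Contigs missing from contig_lengths are skipped, not reported.
    """
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # not a usable data line
        contig = normalise_contig(fields[0])
        start = int(fields[1])
        end = start + len(fields[3]) - 1
        limit = contig_lengths.get(contig)
        if limit is not None and end > limit:
            yield contig, start, end
```

It could be run as `for hit in out_of_bounds_records(open_vcf("trio.vcf.gz"), HG38_CONTIG_LENGTHS): print(hit)`, where `trio.vcf.gz` is a placeholder path. Any hit means the file cannot be purely hg38. An empty result does not prove the assembly is correct, since most hg19 positions also fit inside the hg38 contig lengths.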