WansonChoi / CookHLA

An accurate and efficient HLA imputation method.

Cannot get allele results, but no error #17

Open xingejun opened 2 years ago

xingejun commented 2 years ago

Hi @WansonChoi ,

I am running CookHLA with a target dataset (N larger than 50,000) and the 1000G reference data bundled with your software (N=504). Everything ran without an error, but no results were produced. So I am wondering whether this strange issue came up because my sample is larger than the example in your GitHub, from which I took the parameters "mem" (2g) and "window" (5). The imputation log is as follows:

respri.hg19.hla.MHC.QC.exon2.0.5.raw_imputation_out.log

I will be very grateful if you can reply!

Thanks, Guo

xingejun commented 2 years ago

Hi @WansonChoi ,

I changed my environment, and now the error can be seen.

The error message always occurs as follows:

ERROR: java.lang.OutOfMemoryError: GC overhead limit exceeded

And my command is as follows:

python CookHLA.py -i ./va_2.hla.bed -hg 19 -o ./va_2.hg19.hla -ref 1000G_REF/1000G_REF.EUR.chr6.hg18.29mb-34mb.inT1DGC -gm ./MyAGM/hla.hg19.mach_step.avg.clpsB -ae ./MyAGM/hla.hg19.aver.erate -nth 24

Do you have any suggestions for me?

I would very much appreciate a reply! Guo

xingejun commented 2 years ago

Hi @WansonChoi ,

In order to solve the problem above, I set -nth to 1, but the following message occurred. Could you give me some advice? Thank you very much!

Command line: java -Xmx1917m -jar beagle.24Aug19.3e8.jar gt=./va_1_20_60.hg19.hla.MHC.QC.vcf ref=./va_1_20_60/1000G_REF.EUR.chr6.hg18.29mb-34mb.inT1DGC.exon2.phased.vcf out=./va_1_20_60/va_1_20_60.hg19.hla.MHC.QC.exon2.0.5.raw_imputation_out impute=true gp=true overlap=0.5 err=0.0033310878243513 map=./vaccine_1_20_60/hla.hg19.mach_step.avg.clpsB.exon2.txt window=5 ne=10000 nthreads=1

Reference samples: 503
Study samples: 55,712

Window 1 (6:29602876-31520492)

Reference markers: 3,397
Study markers: 1,564

Burnin iteration 1: 13 minutes 4 seconds
Burnin iteration 2: 12 minutes 9 seconds
Burnin iteration 3: 11 minutes 46 seconds
Burnin iteration 4: 11 minutes 28 seconds
Burnin iteration 5: 11 minutes 19 seconds
Burnin iteration 6: 12 minutes 18 seconds

Phasing iteration 1: 11 minutes 50 seconds
Phasing iteration 2: 10 minutes 37 seconds
Phasing iteration 3: 10 minutes 4 seconds
Phasing iteration 4: 9 minutes 48 seconds
Phasing iteration 5: 9 minutes 27 seconds
Phasing iteration 6: 8 minutes 43 seconds
Phasing iteration 7: 7 minutes 56 seconds
Phasing iteration 8: 7 minutes 17 seconds
Phasing iteration 9: 6 minutes 16 seconds
Phasing iteration 10: 19 minutes 51 seconds
Phasing iteration 11: 3 minutes 54 seconds
Phasing iteration 12: 2 minutes 34 seconds

Exception in thread "main" java.lang.NullPointerException
    at imp.RefHapHash.i2hap(RefHapHash.java:156)
    at imp.RefHapHash.<init>(RefHapHash.java:82)
    at imp.ImputedVcfWriter.appendRecords(ImputedVcfWriter.java:110)
    at main.WindowWriter.toByteArray(WindowWriter.java:147)
    at main.WindowWriter.lambda$printImputed$0(WindowWriter.java:136)
    at java.util.stream.IntPipeline$4$1.accept(IntPipeline.java:250)
    at java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:110)
    at java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:693)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.Nodes$SizedCollectorTask.compute(Nodes.java:1878)
    at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
    at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:401)
    at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:734)
    at java.util.stream.Nodes.collect(Nodes.java:325)
    at java.util.stream.ReferencePipeline.evaluateToNode(ReferencePipeline.java:109)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:540)
    at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
    at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:438)
    at main.WindowWriter.printImputed(WindowWriter.java:137)
    at main.Main.printOutput(Main.java:193)
    at main.Main.phaseData(Main.java:163)
    at main.Main.main(Main.java:114)

Guo

WansonChoi commented 2 years ago

@xingejun

Hi, xingejun. Thank you for your interest in CookHLA.

It seems you did not allocate enough memory to each imputation. Could you try the last command again with the '-mem' argument?

Because you said the N of your target data is >50k, which is quite large, please try running CookHLA serially first, i.e. without multiprocessing (the '-mp' argument).
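As a sketch, the rerun might look like the command you posted earlier, with a larger heap per imputation via '-mem' and no '-mp' argument. The '8g' value below is only an illustrative guess; tune it to the memory your machine actually has, and keep your own paths and arguments.

```shell
# Hypothetical rerun: same inputs as before, but with a bigger per-imputation
# Java heap ('-mem 8g' is a guess, not a recommendation) and run serially
# (no '-mp'), so each Beagle process gets the full allocation.
python CookHLA.py \
    -i ./va_2.hla.bed \
    -hg 19 \
    -o ./va_2.hg19.hla \
    -ref 1000G_REF/1000G_REF.EUR.chr6.hg18.29mb-34mb.inT1DGC \
    -gm ./MyAGM/hla.hg19.mach_step.avg.clpsB \
    -ae ./MyAGM/hla.hg19.aver.erate \
    -mem 8g
```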

I think this issue (https://github.com/WansonChoi/CookHLA/issues/14#issuecomment-1132378985) is similar to your case and might be helpful to you.

xingejun commented 2 years ago

Hi @WansonChoi ,

This problem has been solved with the suggestions you gave. Thank you very much.

Xin