Running out of Java heap space in one of two cohorts (browning-lab/flare issue #16)

sinnweja commented 2 months ago

I have a cohort of 4200 subjects that I was able to run successfully with 100G of heap space per chromosome, in under 30 minutes each. The ancestry percentages match the expected full-genome ancestry of about 60% EUR, 30% AMR, and 10% mixed from 2 other ancestries. I have a second cohort of 10K subjects with a different ancestry (a mix of EUR+AFR) that runs out of 400G of heap space on the smaller chromosomes. I used the gt-samples option to run flare on only 3000 subjects, and it still ran out of 400G of heap space. I then pre-subset the sample VCF file to 1000 samples, so gt-samples was no longer needed, and it ran out of 250G of heap space. Any idea what is causing the heap usage to blow up on the second cohort?

browning-lab commented 2 months ago

Based on the information in your email, I don't know why you are seeing this difference in memory use. I assume that the number of ancestries in the second analysis is less than or equal to the number of ancestries in the first analysis.

If you partition the target samples and analyze them in separate analyses, you can obtain the same results that you would have obtained by analyzing all samples in a single analysis if you specify em=false and use the same model, seed, and nthreads parameters for each analysis (see https://github.com/browning-lab/flare?tab=readme-ov-file#running-flare-with-small-or-large-sample-sizes ).
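As a rough illustration of the advice above, the following Python sketch builds one flare command per partition, with em=false and identical model, seed, and nthreads values in every command. The parameter names em, model, seed, nthreads, and gt-samples come from this thread; the ref=, ref-panel=, gt=, map=, and out= argument names follow my reading of the flare README and should be checked against it. Every file name, the heap size, and the helper name `flare_commands` are hypothetical placeholders.

```python
from pathlib import Path


def flare_commands(sample_files, *, jar="flare.jar", heap_gb=100,
                   ref="ref.vcf.gz", ref_panel="ref.panel",
                   gt="target.vcf.gz", gmap="chr.map",
                   model="all.model", seed=12345, nthreads=8):
    """Return one flare command (as an argument list) per sample-list file.

    All file names and defaults are placeholders. Only gt-samples= and out=
    differ between commands; em, model, seed, and nthreads are shared.
    """
    commands = []
    for sample_file in sample_files:
        out_prefix = f"flare.{Path(sample_file).stem}"
        commands.append([
            "java", f"-Xmx{heap_gb}g", "-jar", jar,
            f"ref={ref}",
            f"ref-panel={ref_panel}",
            f"gt={gt}",
            f"map={gmap}",
            f"gt-samples={sample_file}",  # this partition's sample IDs
            f"model={model}",             # same model file for every partition
            "em=false",                   # do not re-estimate model parameters
            f"seed={seed}",               # same seed for every partition
            f"nthreads={nthreads}",       # same thread count for every partition
            f"out={out_prefix}",
        ])
    return commands
```

Each argument list can be printed with `" ".join(cmd)` for inspection or passed to `subprocess.run(cmd, check=True)` once the paths point at real files.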

(In reply to Jason Sinnwell's message of Thu, Jun 27, 2024, 6:48 AM.)

sinnweja commented 2 months ago

Thank you for your reply. For the second analysis I have tried 2, 3, and 5 ancestries, and all of those runs gave heap space errors. We are familiar with running with em=false and a shared model file, seed, etc., with great results. We had hoped to let flare estimate the ancestries independently on each chromosome, but we are fine using the model file with the ancestries provided from the overall estimates. I will be sure to use sufficiently large random partitions of the cohort, because some small subsets may not match the overall cohort ancestral mixture that we specify in the model file.
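One way to form random partitions of roughly equal size is sketched below in Python. It assumes the sample IDs have already been read into a list; the helper name `random_partitions` is made up for illustration, and its `seed` argument controls only the Python shuffle, not flare's own seed parameter.

```python
import random


def random_partitions(sample_ids, n_parts, seed=1):
    """Shuffle the sample IDs reproducibly and deal them into n_parts groups.

    Group sizes differ by at most one, and every ID lands in exactly one group.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # local RNG, so global state is untouched
    return [ids[i::n_parts] for i in range(n_parts)]
```

Each group could then be written to its own text file, one sample ID per line, for use with gt-samples. That file layout is my understanding of what the option expects and is worth confirming in the flare README.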

-Jason

sinnweja commented 2 months ago

It turns out my problems were caused by an incorrectly specified ref-panel file. I was effectively having flare estimate many more than 5 ancestries. It is now running fine on all 10K subjects in cohort 2, with 5 ancestries. I apologize for the confusion!
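A quick sanity check before launching a run can catch this kind of mistake. The Python sketch below assumes the ref-panel file has two whitespace-delimited fields per line (a reference sample ID followed by a panel name), which is my understanding of the flare README. It counts samples per panel so that an unexpectedly large number of distinct panels stands out. The helper name `panel_counts` is hypothetical, and the thread does not say exactly what was wrong with the original file.

```python
from collections import Counter


def panel_counts(lines):
    """Count reference samples per panel name in ref-panel style lines.

    Assumes each non-blank line is '<sample-id> <panel-name>' separated by
    whitespace; raises ValueError on any other layout.
    """
    counts = Counter()
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 fields, found {len(fields)}")
        counts[fields[1]] += 1
    return counts
```

For example, `with open("ref.panel") as fh: counts = panel_counts(fh)` followed by `print(len(counts), dict(counts))` shows how many panels the file defines; the path here is a placeholder. If that number is far larger than the intended number of ancestries, the file is worth re-checking.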

browning-lab commented 2 months ago

No problem. I'm glad the problem is solved. Thank you for letting me know.

(In reply to Jason Sinnwell's message of Fri, Jun 28, 2024, 8:44 AM.)
