rwdavies / STITCH

STITCH - Sequencing To Imputation Through Constructing Haplotypes
http://www.nature.com/ng/journal/v48/n8/abs/ng.3594.html
GNU General Public License v3.0

nCore causes problem #64

Closed Deeeeen closed 2 years ago

Deeeeen commented 2 years ago

Hi Robbie,

I actually have 2 questions:

1. nCores > 1 causing problems

Something kind of strange happens when I increase nCores for STITCH. If I keep nCores=1, everything seems to work just fine. However, when I increase nCores, it gives me an error (I'm sure I requested sufficient processors from the cluster when I increased nCores). The error messages are below:

Running with nCores=12

[2022-02-08 00:40:43] Making output piece 2 / 5
Error in [<-.data.frame(*tmp*, , 9 + sampleRange[1]:sampleRange[2], :
  new columns would leave holes after existing columns
Calls: STITCH ... make_and_write_output_file -> [<- -> [<-.data.frame
In addition: Warning message:
In mclapply(sampleRanges, mc.cores = nCores, B_bit_prob = B_bit_prob, :
  scheduled cores 1, 2, 3, 4, 5, 6, 7, 8, 11 did not deliver results, all values of the jobs will be affected
Execution halted

Running with nCores=2

[2022-02-08 13:40:21] Making output piece 2 / 5
Error in infoCount[, 1] : incorrect number of dimensions
In addition: Warning message:
In mclapply(sampleRanges, mc.cores = nCores, B_bit_prob = B_bit_prob, :
  scheduled core 1 did not deliver a result, all values of the job will be affected

Do you have any thoughts here?

2. Memory issue with large sample size.

I now run STITCH on small chromosome chunks (7Mb minimum bin size and a minimum of 10k SNPs per chunk). It looks like it still runs out of memory. Do you have any suggestions here?

How I run STITCH 1.6.6 (other parameters kept at their defaults):
Sample size: ~6000 rats
Number of biallelic SNPs in the region: ~46000

STITCH(
    regionStart = 747,
    regionEnd = 7069776,
    buffer = 1e+06,
    method = "diploid",
    outputdir = outputdir,
    chr = "chr1",
    posfile = posfile,
    bamlist = bamlist,
    sampleNames_file = sampleName,
    reference_haplotype_file = "",
    reference_legend_file = "",
    K = 8,
    niterations = 2,
    shuffleHaplotypeIterations = NA,
    refillIterations = NA,
    tempdir = tempdir,
    nCores = 12,
    nGen = 100,
    output_haplotype_dosages = TRUE
)
rwdavies commented 2 years ago

On the first one, something has gone wrong above where the error is printed, in the function per_core_get_results, which gathers the final imputed results chunk by chunk (i.e. it writes your final VCF in blocks with about the same number of SNPs in each). Both of your runs seem to be hitting the same problem. Are you sure it's not memory? One thing you can do is decrease outputSNPBlockSize, which reduces the number of SNPs in each block as it is written to disk, e.g. set outputSNPBlockSize=2000 or smaller.
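For example, something like this, an untested sketch that just wraps the arguments from your call above in a list (so single settings are easy to tweak later) and adds outputSNPBlockSize, where 2000 is only a starting point to experiment with:

stitch_args <- list(
    regionStart = 747, regionEnd = 7069776, buffer = 1e+06,
    method = "diploid", outputdir = outputdir, chr = "chr1",
    posfile = posfile, bamlist = bamlist, sampleNames_file = sampleName,
    reference_haplotype_file = "", reference_legend_file = "",
    K = 8, niterations = 2,
    shuffleHaplotypeIterations = NA, refillIterations = NA,
    tempdir = tempdir, nCores = 12, nGen = 100,
    output_haplotype_dosages = TRUE,
    outputSNPBlockSize = 2000  ## write the final VCF in smaller SNP blocks
)
do.call(STITCH, stitch_args)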

For the second one, how much memory do you have? That doesn't sound like an especially tough thing to impute. I would also recommend using a higher niterations if you're not using reference haplotypes.
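Concretely, that is just another one-line tweak on top of the stitch_args list sketched above (40 is only an illustrative value, not something I've tested on your data):

do.call(STITCH, modifyList(stitch_args, list(niterations = 40)))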

One other thing: I wonder if output_haplotype_dosages is using more RAM than I would have thought. You could test running it with and without that option to see what the RAM impact is. Perhaps that's the sort of thing where I can bring the RAM down on my end.
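If you want to put rough numbers on that, here is an untested sketch reusing the stitch_args list from above. One caveat: with nCores > 1, memory used inside the forked mclapply workers won't show up in the parent R session's gc() numbers, so your cluster's job accounting (peak RSS) is a better guide to the total; running each configuration in a fresh R session also gives a cleaner comparison.

run_with_peak_ram <- function(args) {
    gc(reset = TRUE)   ## reset the "max used" counters before the run
    do.call(STITCH, args)
    print(gc())        ## the "max used" (Mb) column shows the peak since the reset
}
run_with_peak_ram(modifyList(stitch_args, list(output_haplotype_dosages = TRUE)))
run_with_peak_ram(modifyList(stitch_args, list(output_haplotype_dosages = FALSE)))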