Closed Deeeeen closed 2 years ago
On the first one, something has gone wrong above where the error is printed, in the function `per_core_get_results`, which gathers the final imputed results chunk-by-chunk (e.g. it writes your final VCF in blocks, each with about the same number of SNPs). Are you sure it's not memory? Both of the results seem to be afflicted by the same problem. One thing you can do is decrease `outputSNPBlockSize`, which reduces the number of SNPs in each block as it is written to disk; so e.g. set `outputSNPBlockSize = 2000` or smaller.
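A minimal sketch of what that change might look like in the R call, assuming the usual `STITCH()` entry point; the paths, region, and model parameters below are placeholders standing in for whatever you already use:

```r
library("STITCH")

STITCH(
    chr = "chr1",                  # placeholder region
    bamlist = "bamlist.txt",       # placeholder input paths
    posfile = "pos.txt",
    outputdir = "./stitch_out/",
    K = 4,                         # placeholder model parameters
    nGen = 100,
    outputSNPBlockSize = 2000      # smaller blocks -> less RAM while writing the VCF
)
```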
For the second one, how much memory do you have? That doesn't sound like such a tough thing to impute. I would recommend using a higher `niterations` if you're not using reference haplotypes.
One more thing: I wonder if `output_haplotype_dosages` is using more RAM than I would have thought. You could test running it with and without that option to see what the RAM influence is. Perhaps that's the sort of thing where I can bring the RAM down on my end.
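One way to compare peak memory between the two runs is GNU time's verbose mode, which reports maximum resident set size; `run_stitch.R` here is a hypothetical wrapper script of yours that calls STITCH with or without `output_haplotype_dosages`:

```shell
# Peak RSS with haplotype dosages on (run_stitch.R is a hypothetical wrapper)
/usr/bin/time -v Rscript run_stitch.R TRUE  2> with_dosages.time
# ...and with it off
/usr/bin/time -v Rscript run_stitch.R FALSE 2> without_dosages.time
# Compare the "Maximum resident set size" lines from the two runs
grep "Maximum resident set size" with_dosages.time without_dosages.time
```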
Hi Robbie,
I actually have 2 questions:
1. `nCore` > 1 causing problems

Something kind of strange happened when I increased `nCore` for STITCH. If I keep `nCore=1`, everything seems to work just fine. However, when I increase the `nCore` parameter, it gives me an error (I'm sure I have requested sufficient processors from the cluster when I increase `nCore`). The error messages are below:

Running with nCore=12

Running with nCore=2

Do you have any thoughts here?
2. Memory issue with large sample size

Now I run STITCH on small chromosome chunks (7 Mb minimum bin size and a minimum of 10k SNPs), but it still looks like it runs out of memory. Do you have any suggestions here?

How I run STITCH 1.6.6 (I kept other parameters at their defaults): sample size: ~6,000 rats; number of biallelic SNPs in the region: ~46,000.
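For reference, chunked runs like the ones described above can be expressed with STITCH's region arguments; this is a sketch assuming the standard `regionStart`/`regionEnd`/`buffer` parameters, with placeholder coordinates, paths, and model settings (in practice you would loop over windows along the chromosome):

```r
library("STITCH")

# One placeholder 7 Mb window on chr1
STITCH(
    chr = "chr1",
    regionStart = 1,
    regionEnd = 7000000,
    buffer = 100000,               # flanking sequence to stabilise SNPs at window edges
    bamlist = "bamlist.txt",       # placeholder input paths
    posfile = "pos.txt",
    outputdir = "./stitch_chr1_0_7Mb/",
    K = 4,                         # placeholder model parameters
    nGen = 100
)
```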