rwdavies / STITCH

STITCH - Sequencing To Imputation Through Constructing Haplotypes
http://www.nature.com/ng/journal/v48/n8/abs/ng.3594.html
GNU General Public License v3.0

Crash when many samples have no informative reads #102

Open TedBrookings opened 3 weeks ago

TedBrookings commented 3 weeks ago

I'm working in a genome with a large number of small scaffolds. In a cohort with 767 samples, on one of the scaffolds 106 of the samples have no informative reads. This results in a crash (log below). I'm wondering if it would be possible to add an option to exclude and assign NOCALL GTs to samples with this problem. The alternative seems to be detecting this myself, making a VCF of all no-calls (e.g. with pysam) and merging them.
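The workaround described above (writing an all-no-call VCF for a region that cannot be imputed) could be sketched roughly as follows. This is only an illustration in plain Python rather than pysam; it assumes the posfile is tab-separated with columns chromosome, position, ref, alt, and the helper names `read_posfile` and `nocall_vcf_lines` are hypothetical. The header is minimal and would likely need extra fields to match STITCH's own output before combining files.

```python
def read_posfile(lines):
    """Parse posfile lines, ASSUMED to be tab-separated: chrom, pos, ref, alt."""
    sites = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        chrom, pos, ref, alt = line.split("\t")[:4]
        sites.append((chrom, int(pos), ref, alt))
    return sites


def nocall_vcf_lines(sites, samples):
    """Return VCF text lines in which every genotype is the missing value ./."""
    header = [
        "##fileformat=VCFv4.2",
        '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
    ]
    # One contig line per chromosome, in order of first appearance.
    for chrom in dict.fromkeys(site[0] for site in sites):
        header.append(f"##contig=<ID={chrom}>")
    columns = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT"]
    header.append("\t".join(columns + list(samples)))
    body = []
    for chrom, pos, ref, alt in sites:
        fixed = [chrom, str(pos), ".", ref, alt, ".", ".", ".", "GT"]
        body.append("\t".join(fixed + ["./."] * len(samples)))
    return header + body


# Illustrative input only: two made-up sites and two made-up sample names.
demo_sites = read_posfile(["Scaffold05208\t64\tA\tG\n", "Scaffold05208\t762\tC\tT\n"])
demo_vcf = nocall_vcf_lines(demo_sites, ["sampleA", "sampleB"])
print("\n".join(demo_vcf))
```

The resulting text would still need to be compressed, indexed, and combined with the per-region STITCH output using whatever tooling the workflow already relies on.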

Run log
```
+ STITCH.R --chr=Scaffold05208 --regionStart=64 --regionEnd=762 --buffer=1000000 --cramlist crams.list \
    --posfile=tmp_region_positions.pos --K=10 --nGen=20 --nCores=64 --refillIterations=NA --downsampleToCov=50 \
    --outputdir=. --splitReadIterations=NA --reference ref.dna.fa --tempdir . --keepSampleReadsInRAM=TRUE \
    --output_filename stitch.region-03721.vcf.gz
[2024-09-16 23:12:02] Running STITCH(chr = Scaffold05208, nGen = 20, posfile = tmp_region_positions.pos, K = 10, S = 1, outputdir = ., nStarts = , tempdir = ., bamlist = , cramlist = crams.list, sampleNames_file = , reference = ref.dna.fa, genfile = , method = diploid, output_format = bgvcf, B_bit_prob = 16, outputInputInVCFFormat = FALSE, downsampleToCov = 50, downsampleFraction = 1, readAware = TRUE, chrStart = NA, chrEnd = NA, regionStart = 64, regionEnd = 762, buffer = 1000000, maxDifferenceBetweenReads = 1000, maxEmissionMatrixDifference = 1e+10, alphaMatThreshold = 1e-04, emissionThreshold = 1e-04, iSizeUpperLimit = 600, bqFilter = 17, niterations = 40, shuffleHaplotypeIterations = c(4, 8, 12, 16), splitReadIterations = NA, nCores = 64, expRate = 0.5, maxRate = 100, minRate = 0.1, Jmax = 1000, regenerateInput = TRUE, originalRegionName = NA, keepInterimFiles = FALSE, keepTempDir = FALSE, outputHaplotypeProbabilities = FALSE, switchModelIteration = NA, generateInputOnly = FALSE, restartIterations = NA, refillIterations = NA, downsampleSamples = 1, downsampleSamplesKeepList = NA, subsetSNPsfile = NA, useSoftClippedBases = FALSE, outputBlockSize = 1000, outputSNPBlockSize = 10000, inputBundleBlockSize = NA, genetic_map_file = , reference_haplotype_file = , reference_legend_file = , reference_sample_file = , reference_populations = NA, reference_phred = 20, reference_iterations = 40, reference_shuffleHaplotypeIterations = c(4, 8, 12, 16), output_filename = stitch.region-03721.vcf.gz, initial_min_hapProb = 0.2, initial_max_hapProb = 0.8, regenerateInputWithDefaultValues = FALSE, 
plotHapSumDuringIterations = FALSE, plot_shuffle_haplotype_attempts = FALSE, plotAfterImputation = TRUE, save_sampleReadsInfo = FALSE, gridWindowSize = NA, shuffle_bin_nSNPs = NULL, shuffle_bin_radius = 5000, keepSampleReadsInRAM = TRUE, useTempdirWhileWriting = FALSE, output_haplotype_dosages = FALSE, use_bx_tag = TRUE, bxTagUpperLimit = 50000) [2024-09-16 23:12:02] Program start [2024-09-16 23:12:02] Get and validate pos and gen [2024-09-16 23:12:02] Done get and validate pos and gen [2024-09-16 23:12:02] There are 0 variants in the left buffer region -999936 <= position < 64 [2024-09-16 23:12:02] There are 106 variants in the central region 64 <= position <= 762 [2024-09-16 23:12:02] There are 0 variants in the right buffer region 762 < position <= 1000762 [2024-09-16 23:12:02] Get CRAM sample names [2024-09-16 23:12:06] Done getting CRAM sample names [2024-09-16 23:12:06] Generate inputs [2024-09-16 23:12:06] Load and convert CRAM 600 of 767 [2024-09-16 23:12:06] WARNING - sample S.566812105_S.566812105 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:06] WARNING - sample S.567712412_S.567712412 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:06] WARNING - sample S.567512357_S.567512357 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:07] WARNING - sample S.567312269_S.567312269 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:07] WARNING - sample S.567412325_S.567412325 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:07] WARNING - sample S.568312628_S.568312628 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:07] WARNING - sample S.567212244_S.567212244 has no informative reads. It is being given random reads. 
Consider removing from analysis [2024-09-16 23:12:08] WARNING - sample S.566612041_S.566612041 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:08] WARNING - sample S.568412643_S.568412643 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:08] WARNING - sample S.568312619_S.568312619 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:08] WARNING - sample S.568812799_S.568812799 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:09] WARNING - sample S.566712069_S.566712069 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:09] WARNING - sample S.567012167_S.567012167 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:09] WARNING - sample S.567312273_S.567312273 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:10] WARNING - sample S.566712078_S.566712078 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:10] WARNING - sample S.567912501_S.567912501 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:10] WARNING - sample S.568112542_S.568112542 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:10] WARNING - sample S.568412654_S.568412654 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:10] WARNING - sample S.567712427_S.567712427 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:11] WARNING - sample S.568212572_S.568212572 has no informative reads. It is being given random reads. 
Consider removing from analysis [2024-09-16 23:12:11] WARNING - sample S.566812120_S.566812120 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:11] WARNING - sample S.568512679_S.568512679 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:11] Load and convert CRAM 100 of 767 [2024-09-16 23:12:11] WARNING - sample S.566612045_S.566612045 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:11] WARNING - sample S.568312621_S.568312621 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:12] WARNING - sample S.568612727_S.568612727 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:12] WARNING - sample S.568812791_S.568812791 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:14] WARNING - sample S.566912136_S.566912136 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:15] WARNING - sample S.568912824_S.568912824 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:15] WARNING - sample S.567012184_S.567012184 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:15] WARNING - sample S.568712754_S.568712754 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:16] WARNING - sample S.568412661_S.568412661 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:16] WARNING - sample S.568112552_S.568112552 has no informative reads. It is being given random reads. 
Consider removing from analysis [2024-09-16 23:12:17] WARNING - sample S.568812802_S.568812802 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:17] WARNING - sample S.567912496_S.567912496 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:17] WARNING - sample S.567712416_S.567712416 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:17] WARNING - sample S.566712073_S.566712073 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:17] Load and convert CRAM 400 of 767 [2024-09-16 23:12:18] WARNING - sample S.567012185_S.567012185 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:18] WARNING - sample S.568812770_S.568812770 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:18] WARNING - sample S.567012189_S.567012189 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:19] WARNING - sample S.566712072_S.566712072 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:19] WARNING - sample S.567612390_S.567612390 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:20] WARNING - sample S.567512356_S.567512356 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:20] WARNING - sample S.568512667_S.568512667 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:21] WARNING - sample S.568712744_S.568712744 has no informative reads. It is being given random reads. 
Consider removing from analysis [2024-09-16 23:12:21] WARNING - sample S.569012836_S.569012836 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:21] WARNING - sample S.566912135_S.566912135 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:21] Load and convert CRAM 700 of 767 [2024-09-16 23:12:21] WARNING - sample S.568312611_S.568312611 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:21] WARNING - sample S.567612380_S.567612380 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:21] WARNING - sample S.567312290_S.567312290 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:22] WARNING - sample S.568212587_S.568212587 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:22] WARNING - sample S.567712431_S.567712431 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:22] WARNING - sample S.568912804_S.568912804 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:22] WARNING - sample S.567312286_S.567312286 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:23] WARNING - sample S.567312278_S.567312278 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:23] WARNING - sample S.567912473_S.567912473 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:23] WARNING - sample S.567012193_S.567012193 has no informative reads. It is being given random reads. 
Consider removing from analysis [2024-09-16 23:12:23] WARNING - sample S.566612053_S.566612053 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:24] WARNING - sample S.568312633_S.568312633 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:24] WARNING - sample S.568012520_S.568012520 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:24] WARNING - sample S.568212581_S.568212581 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:24] WARNING - sample S.567112207_S.567112207 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:25] WARNING - sample S.568112547_S.568112547 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:25] WARNING - sample S.567012199_S.567012199 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:25] WARNING - sample S.568412647_S.568412647 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:25] WARNING - sample S.569012834_S.569012834 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:25] WARNING - sample S.568112557_S.568112557 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:25] Load and convert CRAM 200 of 767 [2024-09-16 23:12:26] WARNING - sample S.568012517_S.568012517 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:27] WARNING - sample S.568912828_S.568912828 has no informative reads. It is being given random reads. 
Consider removing from analysis [2024-09-16 23:12:27] WARNING - sample S.568912825_S.568912825 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:27] WARNING - sample S.568212577_S.568212577 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:27] WARNING - sample S.568912805_S.568912805 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:27] WARNING - sample S.567112208_S.567112208 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:28] WARNING - sample S.567912495_S.567912495 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:28] WARNING - sample S.566411977_S.566411977 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:28] Load and convert CRAM 500 of 767 [2024-09-16 23:12:28] WARNING - sample S.567712423_S.567712423 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:28] WARNING - sample S.567012169_S.567012169 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:28] WARNING - sample S.568512687_S.568512687 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:28] WARNING - sample S.566411964_S.566411964 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:29] WARNING - sample S.568112554_S.568112554 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:29] WARNING - sample S.567112209_S.567112209 has no informative reads. It is being given random reads. 
Consider removing from analysis [2024-09-16 23:12:29] WARNING - sample S.567312285_S.567312285 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:30] WARNING - sample S.567012194_S.567012194 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:30] WARNING - sample S.566712076_S.566712076 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:30] WARNING - sample S.567912464_S.567912464 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:30] WARNING - sample S.566812123_S.566812123 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:31] WARNING - sample S.567612382_S.567612382 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:31] WARNING - sample S.567112223_S.567112223 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:31] WARNING - sample S.567012165_S.567012165 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:31] WARNING - sample S.568312614_S.568312614 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:32] WARNING - sample S.568212575_S.568212575 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:32] WARNING - sample S.567312293_S.567312293 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:33] WARNING - sample S.568812795_S.568812795 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:33] WARNING - sample S.569012835_S.569012835 has no informative reads. It is being given random reads. 
Consider removing from analysis [2024-09-16 23:12:35] WARNING - sample S.568512664_S.568512664 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:35] WARNING - sample S.567612375_S.567612375 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:35] WARNING - sample S.568812779_S.568812779 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:37] WARNING - sample S.569012844_S.569012844 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:37] WARNING - sample S.568512684_S.568512684 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:38] WARNING - sample S.567712403_S.567712403 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:38] Load and convert CRAM 300 of 767 [2024-09-16 23:12:38] WARNING - sample S.568012527_S.568012527 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:38] WARNING - sample S.568512695_S.568512695 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:38] WARNING - sample S.566912160_S.566912160 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:38] WARNING - sample S.567312287_S.567312287 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:39] WARNING - sample S.568612729_S.568612729 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-16 23:12:39] WARNING - sample S.568212582_S.568212582 has no informative reads. It is being given random reads. 
Consider removing from analysis
[2024-09-16 23:12:39] WARNING - sample S.568912832_S.568912832 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-16 23:12:41] Done generating inputs
[2024-09-16 23:12:41] Copying files onto tempdir
[2024-09-16 23:15:35] Done copying files onto tempdir
[2024-09-16 23:15:35] Begin loading all sample reads into memory
[2024-09-16 23:15:36] Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
Error in check_mclapply_OK(out) : An error occured during STITCH. The first such error is above
Calls: STITCH ... load_all_sampleReads_into_memory -> check_mclapply_OK
In addition: Warning messages:
1: In mclapply(1:length(sampleRanges), mc.cores = nCores, FUN = loadBamAndConvert_across_a_range, : scheduled core 10 encountered error in user code, all values of the job will be affected
2: In mclapply(sampleRanges, mc.cores = nCores, FUN = function(sampleRange) { : scheduled core 10 encountered error in user code, all values of the job will be affected
Execution halted
NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
```

Zilong-Li commented 3 weeks ago

Hey,

Is this scaffold only about 700 bases long? Given the command --chr=Scaffold05208 --regionStart=64 --regionEnd=762, I don't think STITCH can work on a region that short.

TedBrookings commented 3 weeks ago

The scaffold is very short, but this command is running in a workflow that chops up regions by where there are SNPs, so the scaffold is slightly longer (867 bases). Is there some rule of thumb I can use to understand when/why STITCH will not be able to work on small scaffolds or short regions?

Zilong-Li commented 3 weeks ago

STITCH should be able to generate non-called GTs. I suppose this error may be due to the parallel function. Try to reduce the number of cores to 1.

rwdavies commented 3 weeks ago

So my solution for this was never very elegant. The code that makes fake sample reads when a sample has no reads intersecting SNPs in the region of interest is here: https://github.com/rwdavies/STITCH/blob/f579464cff8380afcfb62804f0cb9a4ff5dabe67/STITCH/R/functions.R#L2039C1-L2039C22

It's not actually random noise that these samples are given; it's just a very weak signal for the first three SNPs only. In any decently sized region, that should amount to almost no information. I suppose I could instead give the sample a read that intersects the first SNP only and shows equal and opposite evidence for the reference and alternate alleles. That would be smarter.

Are there perhaps 3 or fewer SNPs in this scaffold?

Otherwise, I would suggest a simple rule: only run regions (e.g. scaffolds) with more than 5,000 bases. I don't have a strong basis for this number; it's just that, to make imputation efficient with low-coverage data, the region should be long enough to leverage information across different samples.

If you reproducibly hit this error with more than 3 SNPs and more than 5,000 bases, let us know; maybe something else is going on.
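As a rough sketch, the rule of thumb above (more than 3 SNPs and more than 5,000 bases) could be applied as a pre-filter in a workflow like this. The `Region` type and `is_runnable` helper are hypothetical names for illustration; the thresholds come straight from the suggestion above and are a heuristic, not a guarantee that a region will run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    chrom: str
    start: int   # 1-based, inclusive (as in --regionStart)
    end: int     # 1-based, inclusive (as in --regionEnd)
    n_snps: int  # number of posfile sites inside [start, end]


def is_runnable(region, min_bases=5000, min_snps=3):
    """True when the region has MORE than min_bases bases and MORE than min_snps SNPs."""
    length = region.end - region.start + 1
    return length > min_bases and region.n_snps > min_snps


# The region from the original report: 106 SNPs between positions 64 and 762.
short_region = Region("Scaffold05208", 64, 762, 106)
# A made-up longer region for contrast.
long_region = Region("exampleScaffold", 1, 20000, 250)

runnable = [r for r in (short_region, long_region) if is_runnable(r)]
print([r.chrom for r in runnable])
```

Regions that fail the check could then be routed to a fallback, such as emitting no-call genotypes, instead of being passed to STITCH.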

TedBrookings commented 3 weeks ago

There are 106 SNPs, although many fewer than 5000 bases. I'll restrict to scaffolds with >5000 bases and see if that eliminates this kind of crash. If that works, I guess I could also try reducing to 1 core for those regions and see if they are able to run successfully.

rwdavies commented 3 weeks ago

There are 106 SNPs in (762 - 64) bases? That doesn't obviously explain any error, but it is a very high heterozygosity if these are true-positive SNPs.

Are any of them at the start or end of the region exactly? I think I've written tests against those edge cases but wonder if something is still going on

Another thing: you could set tempdir to somewhere you have access to and check, after a run fails, whether any of the "input" files are missing. STITCH works by first processing the BAMs into an internal representation of the data, stored in the "input" folder, in files whose names should correspond in an obvious way to the samples being imputed.
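A check along these lines could be scripted. The sketch below is only an illustration: the file-name template `sample.{i}.input.{region}.RData` and the region name used in the demo are assumptions, not confirmed in this thread, so list the real "input" folder first and adjust the template to whatever STITCH actually wrote there. The helper names are hypothetical.

```python
import os
import tempfile


def expected_input_names(n_samples, region_name,
                         template="sample.{i}.input.{region}.RData"):
    """Build the file names we expect to find; the template is an ASSUMPTION."""
    return [template.format(i=i, region=region_name) for i in range(1, n_samples + 1)]


def missing_input_files(input_dir, expected_names):
    """Return the expected names that are not present in input_dir."""
    present = set(os.listdir(input_dir))
    return [name for name in expected_names if name not in present]


# Demo on a throwaway directory: create 2 of 3 expected files, then look for gaps.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_names = expected_input_names(3, "Scaffold05208.64.762")
    for name in demo_names[:2]:
        open(os.path.join(demo_dir, name), "w").close()
    demo_missing = missing_input_files(demo_dir, demo_names)

print(demo_missing)
```

Any names reported as missing would point at the samples whose input generation failed or whose files were removed before loading.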

TedBrookings commented 3 weeks ago

It's a plant genome, so I wouldn't be shocked if some of this was just bad reference. The regions are chosen by a program that breaks them up by SNPs, so yes, first and last base have a SNP in them. I can put a buffer in though so that isn't the case in the future. (I am running with a large value for --buffer).

TedBrookings commented 2 weeks ago

A quick follow up: I restricted to regions > 5000bp and more than 6 SNPs, but I am still seeing crashes when many of the CRAMs have no informative reads in an area. The log below shows a crash in a region with 241 variants in 7025 bp:
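To quantify how many samples are affected in a given region, the warning lines in a STITCH log can be counted. The sketch below matches the warning text exactly as it appears in the logs in this thread; the function names are hypothetical, and the regular expression would need updating if the message wording differs in other STITCH versions.

```python
import re

# Matches the warning wording seen in the logs in this thread.
WARNING_RE = re.compile(r"WARNING - sample (\S+) has no informative reads")


def samples_without_informative_reads(log_text):
    """Return the sorted, de-duplicated sample names flagged by the warning."""
    return sorted(set(WARNING_RE.findall(log_text)))


def uninformative_fraction(log_text, n_samples):
    """Fraction of the cohort flagged as having no informative reads."""
    return len(samples_without_informative_reads(log_text)) / n_samples


# A three-line excerpt in the same format as the logs in this thread.
demo_log = (
    "[2024-09-21 02:43:20] Generate inputs\n"
    "[2024-09-21 02:43:21] WARNING - sample S.566812105_S.566812105 has no informative reads. "
    "It is being given random reads. Consider removing from analysis\n"
    "[2024-09-21 02:43:21] WARNING - sample S.568012523_S.568012523 has no informative reads. "
    "It is being given random reads. Consider removing from analysis\n"
)
demo_samples = samples_without_informative_reads(demo_log)
print(demo_samples, uninformative_fraction(demo_log, 767))
```

Because the pattern does not depend on line breaks, it also works on logs that have been flattened onto a single line.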

log
+ STITCH.R --chr=Scaffold01615 --regionStart=69 --regionEnd=7094 --buffer=1000000 --cramlist tmp_cramlist --posfile=tmp_region_positions.pos --K=10 --nGen=20 --nCores=64 --refillIterations=NA --downsampleToCov=50 --outputdir=. --splitReadIterations=NA --reference ref.fa --tempdir . --keepSampleReadsInRAM=FALSE --outputSNPBlockSize 1000 --gridWindowSize 1000 --output_filename stitch.region-871.vcf.gz
[2024-09-21 02:43:15] Running STITCH(chr = Scaffold01615, nGen = 20, posfile = tmp_region_positions.pos, K = 10, S = 1, outputdir = ., nStarts = , tempdir = ., bamlist = , cramlist = tmp_cramlist, sampleNames_file = , reference = Brassica_oleracea.BOL.dna.toplevel.fa, genfile = , method = diploid, output_format = bgvcf, B_bit_prob = 16, outputInputInVCFFormat = FALSE, downsampleToCov = 50, downsampleFraction = 1, readAware = TRUE, chrStart = NA, chrEnd = NA, regionStart = 69, regionEnd = 7094, buffer = 1000000, maxDifferenceBetweenReads = 1000, maxEmissionMatrixDifference = 1e+10, alphaMatThreshold = 1e-04, emissionThreshold = 1e-04, iSizeUpperLimit = 600, bqFilter = 17, niterations = 40, shuffleHaplotypeIterations = c(4, 8, 12, 16), splitReadIterations = NA, nCores = 64, expRate = 0.5, maxRate = 100, minRate = 0.1, Jmax = 1000, regenerateInput = TRUE, originalRegionName = NA, keepInterimFiles = FALSE, keepTempDir = FALSE, outputHaplotypeProbabilities = FALSE, switchModelIteration = NA, generateInputOnly = FALSE, restartIterations = NA, refillIterations = NA, downsampleSamples = 1, downsampleSamplesKeepList = NA, subsetSNPsfile = NA, useSoftClippedBases = FALSE, outputBlockSize = 1000, outputSNPBlockSize = 1000, inputBundleBlockSize = NA, genetic_map_file = , reference_haplotype_file = , reference_legend_file = , reference_sample_file = , reference_populations = NA, reference_phred = 20, reference_iterations = 40, reference_shuffleHaplotypeIterations = c(4, 8, 12, 16), output_filename = stitch.region-871.vcf.gz, initial_min_hapProb = 0.2, initial_max_hapProb = 0.8, 
regenerateInputWithDefaultValues = FALSE, plotHapSumDuringIterations = FALSE, plot_shuffle_haplotype_attempts = FALSE, plotAfterImputation = TRUE, save_sampleReadsInfo = FALSE, gridWindowSize = 1000, shuffle_bin_nSNPs = NULL, shuffle_bin_radius = 5000, keepSampleReadsInRAM = FALSE, useTempdirWhileWriting = FALSE, output_haplotype_dosages = FALSE, use_bx_tag = TRUE, bxTagUpperLimit = 50000) [2024-09-21 02:43:15] Program start [2024-09-21 02:43:15] Get and validate pos and gen [2024-09-21 02:43:15] Done get and validate pos and gen [2024-09-21 02:43:15] There are 0 variants in the left buffer region -999931 <= position < 69 [2024-09-21 02:43:15] There are 241 variants in the central region 69 <= position <= 7094 [2024-09-21 02:43:15] There are 0 variants in the right buffer region 7094 < position <= 1007094 [2024-09-21 02:43:15] Get CRAM sample names [2024-09-21 02:43:20] Done getting CRAM sample names [2024-09-21 02:43:20] Generate inputs [2024-09-21 02:43:21] Load and convert CRAM 600 of 767 [2024-09-21 02:43:21] WARNING - sample S.566812105_S.566812105 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-21 02:43:21] WARNING - sample S.568012523_S.568012523 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-21 02:43:21] WARNING - sample S.567512357_S.567512357 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-21 02:43:21] WARNING - sample S.567412325_S.567412325 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-21 02:43:22] WARNING - sample S.568312628_S.568312628 has no informative reads. It is being given random reads. Consider removing from analysis [2024-09-21 02:43:22] WARNING - sample S.568412654_S.568412654 has no informative reads. It is being given random reads. 
Consider removing from analysis
[2024-09-21 02:43:23] WARNING - sample S.566712069_S.566712069 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:24] WARNING - sample S.567312273_S.567312273 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:24] WARNING - sample S.568112542_S.568112542 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:24] WARNING - sample S.567312269_S.567312269 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:24] WARNING - sample S.567912497_S.567912497 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:25] WARNING - sample S.568812791_S.568812791 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:25] WARNING - sample S.566712064_S.566712064 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:25] WARNING - sample S.568612727_S.568612727 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:26] WARNING - sample S.567712427_S.567712427 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:26] WARNING - sample S.566411986_S.566411986 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:27] WARNING - sample S.568312621_S.568312621 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:27] WARNING - sample S.568312619_S.568312619 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:28] WARNING - sample S.568712746_S.568712746 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:30] Load and convert CRAM 100 of 767
[2024-09-21 02:43:31] WARNING - sample S.566612045_S.566612045 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:32] WARNING - sample S.567912494_S.567912494 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:33] Load and convert CRAM 700 of 767
[2024-09-21 02:43:33] WARNING - sample S.568912824_S.568912824 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:33] WARNING - sample S.566712073_S.566712073 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:33] WARNING - sample S.568612711_S.568612711 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:33] WARNING - sample S.568412661_S.568412661 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:34] WARNING - sample S.568812770_S.568812770 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:34] WARNING - sample S.568212601_S.568212601 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:34] Load and convert CRAM 400 of 767
[2024-09-21 02:43:35] WARNING - sample S.566712072_S.566712072 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:36] WARNING - sample S.567512356_S.567512356 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:36] WARNING - sample S.568512667_S.568512667 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:36] WARNING - sample S.568012520_S.568012520 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:37] WARNING - sample S.567712431_S.567712431 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:37] WARNING - sample S.567612390_S.567612390 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:37] WARNING - sample S.567312290_S.567312290 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:39] WARNING - sample S.568312633_S.568312633 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:39] WARNING - sample S.567912473_S.567912473 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:40] WARNING - sample S.568112557_S.568112557 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:40] WARNING - sample S.568012529_S.568012529 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:40] Load and convert CRAM 200 of 767
[2024-09-21 02:43:40] WARNING - sample S.566411977_S.566411977 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:41] WARNING - sample S.567312278_S.567312278 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:41] WARNING - sample S.567312286_S.567312286 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:41] WARNING - sample S.569012834_S.569012834 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:41] WARNING - sample S.568012517_S.568012517 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:43] Load and convert CRAM 500 of 767
[2024-09-21 02:43:44] WARNING - sample S.567112207_S.567112207 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:45] WARNING - sample S.567912465_S.567912465 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:46] WARNING - sample S.567512347_S.567512347 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:46] WARNING - sample S.567112223_S.567112223 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:46] WARNING - sample S.568812779_S.568812779 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:48] WARNING - sample S.567612375_S.567612375 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:48] WARNING - sample S.568912805_S.568912805 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:50] WARNING - sample S.569012835_S.569012835 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:51] WARNING - sample S.567312293_S.567312293 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:52] WARNING - sample S.568512684_S.568512684 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:52] WARNING - sample S.567312285_S.567312285 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:52] WARNING - sample S.567712403_S.567712403 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:53] WARNING - sample S.568512695_S.568512695 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:53] WARNING - sample S.566912160_S.566912160 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:54] WARNING - sample S.568412634_S.568412634 has no informative reads. It is being given random reads. Consider removing from analysis
[2024-09-21 02:43:55] Load and convert CRAM 300 of 767
[2024-09-21 02:43:58] Done generating inputs
[2024-09-21 02:43:58] Copying files onto tempdir
[2024-09-21 02:47:05] Done copying files onto tempdir
[2024-09-21 02:47:05] Generate allele count
[2024-09-21 02:47:05] Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
Error in check_mclapply_OK(out2) : An error occured during STITCH. The first such error is above
Calls: STITCH -> buildAlleleCount -> check_mclapply_OK
In addition: Warning messages:
1: In mclapply(1:length(sampleRanges), mc.cores = nCores, FUN = loadBamAndConvert_across_a_range, : scheduled core 45 encountered error in user code, all values of the job will be affected
2: In mclapply(sampleRanges, mc.cores = nCores, FUN = buildAlleleCount_subfunction, : scheduled core 45 encountered error in user code, all values of the job will be affected
Execution halted
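
The log repeatedly suggests removing the flagged samples from the analysis. A minimal sketch of how those sample names could be collected from a saved log, assuming the warning wording stays exactly as printed above; the function name `samples_without_reads` is hypothetical and not part of STITCH:

```python
import re

# Matches the warning format seen in the log above (assumed to be stable).
NO_READS = re.compile(r"WARNING - sample (\S+) has no informative reads")


def samples_without_reads(log_text):
    """Return flagged sample names in order of first appearance, without duplicates."""
    return list(dict.fromkeys(NO_READS.findall(log_text)))


if __name__ == "__main__":
    # Two entries copied from the log; works whether or not entries are on separate lines.
    log = (
        "[2024-09-21 02:43:53] WARNING - sample S.568512695_S.568512695 has no "
        "informative reads. It is being given random reads. Consider removing "
        "from analysis [2024-09-21 02:43:55] Load and convert CRAM 300 of 767"
    )
    for name in samples_without_reads(log):
        print(name)
```

The resulting names could then be used to drop the matching entries from the CRAM list before rerunning, or to decide which samples should receive no-call genotypes afterwards.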
Zilong-Li commented 2 weeks ago

Errors like this arise from read or write operations failing inside some of the parallel processes. To debug:

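  1. Make sure I/O is okay, e.g. that the temporary and output directories are readable and writable.
  2. Set `nCores=1` so the run is serial and the first error is easier to trace.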
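
The two steps above can be sketched as a small pre-flight helper. This is only an illustration under stated assumptions: the helper names `check_dirs` and `force_single_core` are hypothetical, and the only STITCH options referenced (`--nCores`, `--tempdir`, `--outputdir`) are the ones visible in the command from the original report.

```python
import os


def check_dirs(paths):
    """Map each path to True if it is an existing directory we can read, write and enter."""
    return {
        p: os.path.isdir(p) and os.access(p, os.R_OK | os.W_OK | os.X_OK)
        for p in paths
    }


def force_single_core(argv):
    """Return a copy of argv with any --nCores setting replaced by --nCores=1."""
    out = []
    skip_next = False
    for arg in argv:
        if skip_next:                     # value belonging to a bare "--nCores"
            skip_next = False
            continue
        if arg == "--nCores":             # space-separated form: drop flag and value
            skip_next = True
            continue
        if arg.startswith("--nCores="):   # "=" form: drop it
            continue
        out.append(arg)
    out.append("--nCores=1")
    return out


if __name__ == "__main__":
    # Shortened version of the command from the report (other options omitted).
    cmd = ["STITCH.R", "--chr=Scaffold05208", "--nCores=64", "--tempdir", ".", "--outputdir=."]
    print(check_dirs(["."]))              # step 1: are tempdir/outputdir usable?
    print(" ".join(force_single_core(cmd)))  # step 2: same command, run serially
```

A writable directory does not rule out every I/O problem (a full disk or a file-handle limit would pass this check), so this is a first filter rather than a diagnosis.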