rwdavies / STITCH

STITCH - Sequencing To Imputation Through Constructing Haplotypes
http://www.nature.com/ng/journal/v48/n8/abs/ng.3594.html
GNU General Public License v3.0

A question about readChar #79

Closed. Dengzhaoo closed this issue 1 year ago.

Dengzhaoo commented 1 year ago

Hi,

Sorry to bother you again. When I run STITCH I get this message:

Warning messages:
1: package ‘STITCH’ was built under R version 4.2.1
2: package ‘rrbgen’ was built under R version 4.2.1
[2023-02-06 11:59:24] Running STITCH(chr = 25, nGen = 125, posfile = chr25_pos.txt, K = 22, S = 1, outputdir = ./, nStarts = , tempdir = ./, bamlist = bam1.txt, cramlist = , sampleNames_file = , reference = , genfile = , method = diploid, output_format = bgvcf, B_bit_prob = 16, outputInputInVCFFormat = FALSE, downsampleToCov = 50, downsampleFraction = 1, readAware = TRUE, chrStart = NA, chrEnd = NA, regionStart = NA, regionEnd = NA, buffer = NA, maxDifferenceBetweenReads = 1000, maxEmissionMatrixDifference = 1e+10, alphaMatThreshold = 1e-04, emissionThreshold = 1e-04, iSizeUpperLimit = 600, bqFilter = 17, niterations = 40, shuffleHaplotypeIterations = c(4, 8, 12, 16), splitReadIterations = 25, nCores = 18, expRate = 0.5, maxRate = 100, minRate = 0.1, Jmax = 1000, regenerateInput = TRUE, originalRegionName = NA, keepInterimFiles = FALSE, keepTempDir = FALSE, outputHaplotypeProbabilities = FALSE, switchModelIteration = NA, generateInputOnly = FALSE, restartIterations = NA, refillIterations = c(6, 10, 14, 18), downsampleSamples = 1, downsampleSamplesKeepList = NA, subsetSNPsfile = NA, useSoftClippedBases = FALSE, outputBlockSize = 1000, outputSNPBlockSize = 10000, inputBundleBlockSize = NA, genetic_map_file = , reference_haplotype_file = , reference_legend_file = , reference_sample_file = , reference_populations = NA, reference_phred = 20, reference_iterations = 40, reference_shuffleHaplotypeIterations = c(4, 8, 12, 16), output_filename = NULL, initial_min_hapProb = 0.2, initial_max_hapProb = 0.8, regenerateInputWithDefaultValues = FALSE, plotHapSumDuringIterations = FALSE, plot_shuffle_haplotype_attempts = FALSE, plotAfterImputation = TRUE, save_sampleReadsInfo = FALSE, gridWindowSize = NA, shuffle_bin_nSNPs = NULL, shuffle_bin_radius = 5000, keepSampleReadsInRAM = FALSE, useTempdirWhileWriting = FALSE, output_haplotype_dosages = FALSE, use_bx_tag = TRUE, bxTagUpperLimit = 50000)
[2023-02-06 11:59:24] Program start
[2023-02-06 11:59:24] Get and validate pos and gen
[2023-02-06 11:59:26] Done get and validate pos and gen
[2023-02-06 11:59:26] Get BAM sample names
[2023-02-06 11:59:31] Done getting BAM sample names
[2023-02-06 11:59:31] Generate inputs
[2023-02-06 11:59:51] downsample sample 476 - 10 of 153079 reads removed
[2023-02-06 11:59:56] downsample sample 100 - 10 of 294278 reads removed
[2023-02-06 12:00:11] Load and convert BAM 200 of 500
Warning: The index file is older than the data file: /media/zhangyu/955d175c-5651-4fb7-b2b5-be56f76c69ae/dengzhao_data/blackgoat/lc/stitch/markdup/173.sorted.markdup.bam.bai
[2023-02-06 12:00:18] downsample sample 394 - 13 of 189521 reads removed
[2023-02-06 12:00:27] downsample sample 353 - 13 of 225166 reads removed
[2023-02-06 12:00:36] downsample sample 1169 - 14 of 372656 reads removed
[2023-02-06 12:00:39] Load and convert BAM 400 of 500
[2023-02-06 12:00:45] downsample sample 1468 - 12 of 265190 reads removed

caught segfault address 0x37d59000, cause 'memory not mapped'

Traceback:
1: cpp_read_reassign(ord = ord, qnameInteger_ord = qnameInteger_ord, bxtagInteger_ord = bxtagInteger_ord, bxtag_bad_ord = bxtag_bad_ord, qname = qname, bxtag = bxtag, strand = strand, sampleReadsRaw = sampleReadsRaw, readStart_ord = readStart_ord, readEnd_ord = readEnd_ord, readStart = readStart, readEnd = readEnd, iSizeUpperLimit = iSizeUpperLimit, bxTagUpperLimit = bxTagUpperLimit, use_bx_tag = use_bx_tag, save_sampleReadsInfo = save_sampleReadsInfo)
2: merge_reads_from_sampleReadsRaw(sampleReadsRaw = sampleReadsRaw, qname = qname, bxtag = bxtag, strand = strand, readStart = readStart, readEnd = readEnd, iSizeUpperLimit = iSizeUpperLimit, use_bx_tag = use_bx_tag, bxTagUpperLimit = bxTagUpperLimit, save_sampleReadsInfo = save_sampleReadsInfo, qname_all = qname_all, readStart_all = readStart_all, readEnd_all = readEnd_all)
3: loadBamAndConvert(iBam = iBam, L = L, pos = pos, nSNPs = nSNPs, bam_files = bam_files, cram_files = cram_files, reference = reference, iSizeUpperLimit = iSizeUpperLimit, bqFilter = bqFilter, chr = chr, N = N, downsampleToCov = downsampleToCov, sampleNames = sampleNames, inputdir = inputdir, useSoftClippedBases = useSoftClippedBases, regionName = regionName, tempdir = tempdir, chrStart = chrStart, chrEnd = chrEnd, chrLength = chrLength, save_sampleReadsInfo = save_sampleReadsInfo, use_bx_tag = use_bx_tag, bxTagUpperLimit = bxTagUpperLimit)
4: FUN(X[[i]], ...)
5: lapply(X = S, FUN = FUN, ...)
6: doTryCatch(return(expr), name, parentenv, handler)
7: tryCatchOne(expr, names, parentenv, handlers[[1L]])
8: tryCatchList(expr, classes, parentenv, handlers)
9: tryCatch(expr, error = function(e) { call <- conditionCall(e) if (!is.null(call)) { if (identical(call[[1L]], quote(doTryCatch))) call <- sys.call(-4L) dcall <- deparse(call, nlines = 1L) prefix <- paste("Error in", dcall, ": ") LONG <- 75L sm <- strsplit(conditionMessage(e), "\n")[[1L]] w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w") if (is.na(w)) w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L], type = "b") if (w > LONG) prefix <- paste0(prefix, "\n ") } else prefix <- "Error : " msg <- paste0(prefix, conditionMessage(e), "\n") .Internal(seterrmessage(msg[1L])) if (!silent && isTRUE(getOption("show.error.messages"))) { cat(msg, file = outFile) .Internal(printDeferredWarnings()) } invisible(structure(msg, class = "try-error", condition = e))})
10: try(lapply(X = S, FUN = FUN, ...), silent = TRUE)
11: sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE))
12: FUN(X[[i]], ...)
13: lapply(seq_len(cores), inner.do)
14: mclapply(1:length(sampleRanges), mc.cores = nCores, FUN = loadBamAndConvert_across_a_range, sampleRanges = sampleRanges, bundling_info = bundling_info, L = L, pos = pos, nSNPs = nSNPs, bam_files = bam_files, cram_files = cram_files, reference = reference, iSizeUpperLimit = iSizeUpperLimit, bqFilter = bqFilter, chr = chr, N = N, downsampleToCov = downsampleToCov, sampleNames = sampleNames, inputdir = inputdir, useSoftClippedBases = useSoftClippedBases, regionName = regionName, tempdir = tempdir, chrStart = chrStart, chrEnd = chrEnd, chrLength = chrLength, save_sampleReadsInfo = save_sampleReadsInfo, use_bx_tag = use_bx_tag, bxTagUpperLimit = bxTagUpperLimit)
15: generate_input(bundling_info = bundling_info, L = L, pos = pos, nSNPs = nSNPs, bam_files = bam_files, cram_files = cram_files, reference = reference, iSizeUpperLimit = iSizeUpperLimit, bqFilter = bqFilter, chr = chr, N = N, downsampleToCov = downsampleToCov, sampleNames = sampleNames, inputdir = inputdir, useSoftClippedBases = useSoftClippedBases, regionName = regionName, tempdir = tempdir, chrStart = chrStart, chrEnd = chrEnd, nCores = nCores, save_sampleReadsInfo = save_sampleReadsInfo, use_bx_tag = use_bx_tag, bxTagUpperLimit = bxTagUpperLimit)
16: generate_or_refactor_input(regenerateInput = regenerateInput, bundling_info = bundling_info, L = L, pos = pos, nSNPs = nSNPs, bam_files = bam_files, cram_files = cram_files, reference = reference, iSizeUpperLimit = iSizeUpperLimit, bqFilter = bqFilter, chr = chr, outputdir = outputdir, N = N, downsampleToCov = downsampleToCov, sampleNames = sampleNames, inputdir = inputdir, useSoftClippedBases = useSoftClippedBases, regionName = regionName, tempdir = tempdir, chrStart = chrStart, chrEnd = chrEnd, generateInputOnly = generateInputOnly, nCores = nCores, save_sampleReadsInfo = save_sampleReadsInfo, use_bx_tag = use_bx_tag, bxTagUpperLimit = bxTagUpperLimit)
17: STITCH(chr = opt$chr, posfile = opt$posfile, K = opt$K, S = opt$S, nGen = opt$nGen, outputdir = opt$outputdir, tempdir = opt$tempdir, bamlist = opt$bamlist, cramlist = opt$cramlist, sampleNames_file = opt$sampleNames_file, reference = opt$reference, genfile = opt$genfile, method = opt$method, output_format = opt$output_format, B_bit_prob = opt$B_bit_prob, outputInputInVCFFormat = opt$outputInputInVCFFormat, downsampleToCov = opt$downsampleToCov, downsampleFraction = opt$downsampleFraction, readAware = opt$readAware, chrStart = opt$chrStart, chrEnd = opt$chrEnd, regionStart = opt$regionStart, regionEnd = opt$regionEnd, buffer = opt$buffer, maxDifferenceBetweenReads = opt$maxDifferenceBetweenReads, maxEmissionMatrixDifference = opt$maxEmissionMatrixDifference, alphaMatThreshold = opt$alphaMatThreshold, emissionThreshold = opt$emissionThreshold, iSizeUpperLimit = opt$iSizeUpperLimit, bqFilter = opt$bqFilter, niterations = opt$niterations, shuffleHaplotypeIterations = eval(parse(text = opt$shuffleHaplotypeIterations)), splitReadIterations = eval(parse(text = opt$splitReadIterations)), nCores = opt$nCores, expRate = opt$expRate, maxRate = opt$maxRate, minRate = opt$minRate, Jmax = opt$Jmax, regenerateInput = opt$regenerateInput, originalRegionName = opt$originalRegionName, keepInterimFiles = opt$keepInterimFiles, keepTempDir = opt$keepTempDir, switchModelIteration = opt$switchModelIteration, generateInputOnly = opt$generateInputOnly, restartIterations = opt$restartIterations, refillIterations = eval(parse(text = opt$refillIterations)), downsampleSamples = opt$downsampleSamples, downsampleSamplesKeepList = opt$downsampleSamplesKeepList, subsetSNPsfile = opt$subsetSNPsfile, useSoftClippedBases = opt$useSoftClippedBases, outputBlockSize = opt$outputBlockSize, outputSNPBlockSize = opt$outputSNPBlockSize, inputBundleBlockSize = opt$inputBundleBlockSize, genetic_map_file = opt$genetic_map_file, reference_haplotype_file = opt$reference_haplotype_file, reference_legend_file = opt$reference_legend_file, reference_sample_file = opt$reference_sample_file, reference_populations = eval(parse(text = opt$reference_populations)), reference_phred = opt$reference_phred, reference_iterations = opt$reference_iterations, reference_shuffleHaplotypeIterations = eval(parse(text = opt$reference_shuffleHaplotypeIterations)), output_filename = opt$output_filename, initial_min_hapProb = opt$initial_min_hapProb, initial_max_hapProb = opt$initial_max_hapProb, regenerateInputWithDefaultValues = opt$regenerateInputWithDefaultValues, plotHapSumDuringIterations = opt$plotHapSumDuringIterations, plot_shuffle_haplotype_attempts = opt$plot_shuffle_haplotype_attempts, plotAfterImputation = opt$plotAfterImputation, save_sampleReadsInfo = opt$save_sampleReadsInfo, gridWindowSize = opt$gridWindowSize, shuffle_bin_nSNPs = opt$shuffle_bin_nSNPs, shuffle_bin_radius = opt$shuffle_bin_radius, keepSampleReadsInRAM = opt$keepSampleReadsInRAM, useTempdirWhileWriting = opt$useTempdirWhileWriting, output_haplotype_dosages = opt$output_haplotype_dosages, use_bx_tag = opt$use_bx_tag, bxTagUpperLimit = opt$bxTagUpperLimit)
An irrecoverable exception occurred. R is aborting now ...
[2023-02-06 12:04:31] Load and convert BAM 100 of 500
[2023-02-06 12:04:33] Load and convert BAM 300 of 500
[2023-02-06 12:04:35] downsample sample 220 - 8 of 310718 reads removed
[2023-02-06 12:05:25] downsample sample 1196 - 16 of 315137 reads removed
[2023-02-06 12:06:08] downsample sample 1419 - 10 of 402569 reads removed
[2023-02-06 12:06:25] downsample sample 97 - 10 of 236219 reads removed

caught segfault address 0x36d68000, cause 'memory not mapped'

Traceback:
1: cpp_read_reassign(ord = ord, qnameInteger_ord = qnameInteger_ord, bxtagInteger_ord = bxtagInteger_ord, bxtag_bad_ord = bxtag_bad_ord, qname = qname, bxtag = bxtag, strand = strand, sampleReadsRaw = sampleReadsRaw, readStart_ord = readStart_ord, readEnd_ord = readEnd_ord, readStart = readStart, readEnd = readEnd, iSizeUpperLimit = iSizeUpperLimit, bxTagUpperLimit = bxTagUpperLimit, use_bx_tag = use_bx_tag, save_sampleReadsInfo = save_sampleReadsInfo)
2: merge_reads_from_sampleReadsRaw(sampleReadsRaw = sampleReadsRaw, qname = qname, bxtag = bxtag, strand = strand, readStart = readStart, readEnd = readEnd, iSizeUpperLimit = iSizeUpperLimit, use_bx_tag = use_bx_tag, bxTagUpperLimit = bxTagUpperLimit, save_sampleReadsInfo = save_sampleReadsInfo, qname_all = qname_all, readStart_all = readStart_all, readEnd_all = readEnd_all)
3: loadBamAndConvert(iBam = iBam, L = L, pos = pos, nSNPs = nSNPs, bam_files = bam_files, cram_files = cram_files, reference = reference, iSizeUpperLimit = iSizeUpperLimit, bqFilter = bqFilter, chr = chr, N = N, downsampleToCov = downsampleToCov, sampleNames = sampleNames, inputdir = inputdir, useSoftClippedBases = useSoftClippedBases, regionName = regionName, tempdir = tempdir, chrStart = chrStart, chrEnd = chrEnd, chrLength = chrLength, save_sampleReadsInfo = save_sampleReadsInfo, use_bx_tag = use_bx_tag, bxTagUpperLimit = bxTagUpperLimit)
4: FUN(X[[i]], ...)
5: lapply(X = S, FUN = FUN, ...)
6: doTryCatch(return(expr), name, parentenv, handler)
7: tryCatchOne(expr, names, parentenv, handlers[[1L]])
8: tryCatchList(expr, classes, parentenv, handlers)
9: tryCatch(expr, error = function(e) { call <- conditionCall(e) if (!is.null(call)) { if (identical(call[[1L]], quote(doTryCatch))) call <- sys.call(-4L) dcall <- deparse(call, nlines = 1L) prefix <- paste("Error in", dcall, ": ") LONG <- 75L sm <- strsplit(conditionMessage(e), "\n")[[1L]] w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w") if (is.na(w)) w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L], type = "b") if (w > LONG) prefix <- paste0(prefix, "\n ") } else prefix <- "Error : " msg <- paste0(prefix, conditionMessage(e), "\n") .Internal(seterrmessage(msg[1L])) if (!silent && isTRUE(getOption("show.error.messages"))) { cat(msg, file = outFile) .Internal(printDeferredWarnings()) } invisible(structure(msg, class = "try-error", condition = e))})
10: try(lapply(X = S, FUN = FUN, ...), silent = TRUE)
11: sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE))
12: FUN(X[[i]], ...)
13: lapply(seq_len(cores), inner.do)
14: mclapply(1:length(sampleRanges), mc.cores = nCores, FUN = loadBamAndConvert_across_a_range, sampleRanges = sampleRanges, bundling_info = bundling_info, L = L, pos = pos, nSNPs = nSNPs, bam_files = bam_files, cram_files = cram_files, reference = reference, iSizeUpperLimit = iSizeUpperLimit, bqFilter = bqFilter, chr = chr, N = N, downsampleToCov = downsampleToCov, sampleNames = sampleNames, inputdir = inputdir, useSoftClippedBases = useSoftClippedBases, regionName = regionName, tempdir = tempdir, chrStart = chrStart, chrEnd = chrEnd, chrLength = chrLength, save_sampleReadsInfo = save_sampleReadsInfo, use_bx_tag = use_bx_tag, bxTagUpperLimit = bxTagUpperLimit)
15: generate_input(bundling_info = bundling_info, L = L, pos = pos, nSNPs = nSNPs, bam_files = bam_files, cram_files = cram_files, reference = reference, iSizeUpperLimit = iSizeUpperLimit, bqFilter = bqFilter, chr = chr, N = N, downsampleToCov = downsampleToCov, sampleNames = sampleNames, inputdir = inputdir, useSoftClippedBases = useSoftClippedBases, regionName = regionName, tempdir = tempdir, chrStart = chrStart, chrEnd = chrEnd, nCores = nCores, save_sampleReadsInfo = save_sampleReadsInfo, use_bx_tag = use_bx_tag, bxTagUpperLimit = bxTagUpperLimit)
16: generate_or_refactor_input(regenerateInput = regenerateInput, bundling_info = bundling_info, L = L, pos = pos, nSNPs = nSNPs, bam_files = bam_files, cram_files = cram_files, reference = reference, iSizeUpperLimit = iSizeUpperLimit, bqFilter = bqFilter, chr = chr, outputdir = outputdir, N = N, downsampleToCov = downsampleToCov, sampleNames = sampleNames, inputdir = inputdir, useSoftClippedBases = useSoftClippedBases, regionName = regionName, tempdir = tempdir, chrStart = chrStart, chrEnd = chrEnd, generateInputOnly = generateInputOnly, nCores = nCores, save_sampleReadsInfo = save_sampleReadsInfo, use_bx_tag = use_bx_tag, bxTagUpperLimit = bxTagUpperLimit)
17: STITCH(chr = opt$chr, posfile = opt$posfile, K = opt$K, S = opt$S, nGen = opt$nGen, outputdir = opt$outputdir, tempdir = opt$tempdir, bamlist = opt$bamlist, cramlist = opt$cramlist, sampleNames_file = opt$sampleNames_file, reference = opt$reference, genfile = opt$genfile, method = opt$method, output_format = opt$output_format, B_bit_prob = opt$B_bit_prob, outputInputInVCFFormat = opt$outputInputInVCFFormat, downsampleToCov = opt$downsampleToCov, downsampleFraction = opt$downsampleFraction, readAware = opt$readAware, chrStart = opt$chrStart, chrEnd = opt$chrEnd, regionStart = opt$regionStart, regionEnd = opt$regionEnd, buffer = opt$buffer, maxDifferenceBetweenReads = opt$maxDifferenceBetweenReads, maxEmissionMatrixDifference = opt$maxEmissionMatrixDifference, alphaMatThreshold = opt$alphaMatThreshold, emissionThreshold = opt$emissionThreshold, iSizeUpperLimit = opt$iSizeUpperLimit, bqFilter = opt$bqFilter, niterations = opt$niterations, shuffleHaplotypeIterations = eval(parse(text = opt$shuffleHaplotypeIterations)), splitReadIterations = eval(parse(text = opt$splitReadIterations)), nCores = opt$nCores, expRate = opt$expRate, maxRate = opt$maxRate, minRate = opt$minRate, Jmax = opt$Jmax, regenerateInput = opt$regenerateInput, originalRegionName = opt$originalRegionName, keepInterimFiles = opt$keepInterimFiles, keepTempDir = opt$keepTempDir, switchModelIteration = opt$switchModelIteration, generateInputOnly = opt$generateInputOnly, restartIterations = opt$restartIterations, refillIterations = eval(parse(text = opt$refillIterations)), downsampleSamples = opt$downsampleSamples, downsampleSamplesKeepList = opt$downsampleSamplesKeepList, subsetSNPsfile = opt$subsetSNPsfile, useSoftClippedBases = opt$useSoftClippedBases, outputBlockSize = opt$outputBlockSize, outputSNPBlockSize = opt$outputSNPBlockSize, inputBundleBlockSize = opt$inputBundleBlockSize, genetic_map_file = opt$genetic_map_file, reference_haplotype_file = opt$reference_haplotype_file, reference_legend_file = opt$reference_legend_file, reference_sample_file = opt$reference_sample_file, reference_populations = eval(parse(text = opt$reference_populations)), reference_phred = opt$reference_phred, reference_iterations = opt$reference_iterations, reference_shuffleHaplotypeIterations = eval(parse(text = opt$reference_shuffleHaplotypeIterations)), output_filename = opt$output_filename, initial_min_hapProb = opt$initial_min_hapProb, initial_max_hapProb = opt$initial_max_hapProb, regenerateInputWithDefaultValues = opt$regenerateInputWithDefaultValues, plotHapSumDuringIterations = opt$plotHapSumDuringIterations, plot_shuffle_haplotype_attempts = opt$plot_shuffle_haplotype_attempts, plotAfterImputation = opt$plotAfterImputation, save_sampleReadsInfo = opt$save_sampleReadsInfo, gridWindowSize = opt$gridWindowSize, shuffle_bin_nSNPs = opt$shuffle_bin_nSNPs, shuffle_bin_radius = opt$shuffle_bin_radius, keepSampleReadsInRAM = opt$keepSampleReadsInRAM, useTempdirWhileWriting = opt$useTempdirWhileWriting, output_haplotype_dosages = opt$output_haplotype_dosages, use_bx_tag = opt$use_bx_tag, bxTagUpperLimit = opt$bxTagUpperLimit)
An irrecoverable exception occurred. R is aborting now ...
[2023-02-06 12:08:28] Load and convert BAM 500 of 500
[2023-02-06 12:08:28] downsample sample 1218 - 10 of 240619 reads removed
[2023-02-06 12:08:51] downsample sample 1140 - 8 of 325763 reads removed
[2023-02-06 12:11:40] Done generating inputs
[2023-02-06 12:11:40] Copying files onto tempdir
[2023-02-06 12:12:54] Done copying files onto tempdir
[2023-02-06 12:12:54] Generate allele count
[2023-02-06 12:13:56] Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection

Error in check_mclapply_OK(out2) : An error occured during STITCH. The first such error is above
Calls: STITCH -> buildAlleleCount -> check_mclapply_OK
In addition: Warning messages:
1: In mclapply(1:length(sampleRanges), mc.cores = nCores, FUN = loadBamAndConvert_across_a_range, : scheduled cores 11, 13 did not deliver results, all values of the jobs will be affected
2: In mclapply(sampleRanges, mc.cores = nCores, FUN = buildAlleleCount_subfunction, : scheduled cores 13, 11 encountered errors in user code, all values of the jobs will be affected
Execution halted

I would like to ask how to solve this error. I am looking forward to your reply. Thank you.
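For reference, the log above also prints the warning "The index file is older than the data file", which means a `.bai` index no longer matches its BAM and is worth ruling out before chasing the segfault. A minimal sketch to find such files: `find_stale_indexes` is a made-up helper name, and it assumes the list file has one BAM path per line, as `bam1.txt` does; anything it reports can be rebuilt with `samtools index`.

```shell
# Hypothetical helper (not part of STITCH): print every BAM in a list
# file whose .bai index is missing or older than the BAM itself, which
# is what the "index file is older than the data file" warning flags.
find_stale_indexes() {
    while read -r bam; do
        # [ a -ot b ] is true when a is older than b, or when a is missing
        if [ ! -e "${bam}.bai" ] || [ "${bam}.bai" -ot "$bam" ]; then
            echo "$bam"
        fi
    done < "$1"
}

# Rebuild anything it reports, e.g.:
# find_stale_indexes bam1.txt | xargs -n 1 samtools index
```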

rwdavies commented 1 year ago

Is this the latest version of STITCH? I recently fixed a bug that can cause errors like this.

Dengzhaoo commented 1 year ago

No. How can I update my STITCH?

Dengzhaoo commented 1 year ago

I installed STITCH with conda. Can I update STITCH by reinstalling it with conda?

rwdavies commented 1 year ago

Right, it seems I never put that new version up. I'll do that now. It might take a day or two to propagate through. Alternatively, there's a non-conda version on the main site that would work now.
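For later readers, the two update routes would look roughly like this (assumptions on my part: the bioconda package is named r-stitch, your channels include bioconda, and `<version>` is a placeholder for whichever release tarball you download from the STITCH GitHub releases page):

```shell
# Route 1: via conda, once the new build has propagated through bioconda
conda update -c bioconda r-stitch

# Route 2: without conda, download the release tarball linked from
# https://github.com/rwdavies/STITCH/releases and install it into your
# R library directly
R CMD INSTALL STITCH_<version>.tar.gz
```

Either way, restarting R and re-checking `packageVersion("STITCH")` confirms which version you are actually running.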

rwdavies commented 1 year ago

Note I've started updating bioconda r-stitch here:

https://github.com/bioconda/bioconda-recipes/pull/39190

Dengzhaoo commented 1 year ago

Thank you for replying. I updated my STITCH and that solved the previous bug, but now I hit this one:

Warning messages:
1: package ‘STITCH’ was built under R version 4.2.1
2: package ‘rrbgen’ was built under R version 4.2.1
[2023-02-07 13:23:57] Running STITCH(chr = 25, nGen = 125, posfile = chr25_pos.txt, K = 22, S = 1, outputdir = ./, nStarts = , tempdir = ./, bamlist = bam1.txt, cramlist = , sampleNames_file = , reference = , genfile = , method = diploid, output_format = bgvcf, B_bit_prob = 16, outputInputInVCFFormat = FALSE, downsampleToCov = 50, downsampleFraction = 1, readAware = TRUE, chrStart = NA, chrEnd = NA, regionStart = NA, regionEnd = NA, buffer = NA, maxDifferenceBetweenReads = 1000, maxEmissionMatrixDifference = 1e+10, alphaMatThreshold = 1e-04, emissionThreshold = 1e-04, iSizeUpperLimit = 600, bqFilter = 17, niterations = 40, shuffleHaplotypeIterations = c(4, 8, 12, 16), splitReadIterations = 25, nCores = 48, expRate = 0.5, maxRate = 100, minRate = 0.1, Jmax = 1000, regenerateInput = TRUE, originalRegionName = NA, keepInterimFiles = FALSE, keepTempDir = FALSE, outputHaplotypeProbabilities = FALSE, switchModelIteration = NA, generateInputOnly = FALSE, restartIterations = NA, refillIterations = c(6, 10, 14, 18), downsampleSamples = 1, downsampleSamplesKeepList = NA, subsetSNPsfile = NA, useSoftClippedBases = FALSE, outputBlockSize = 1000, outputSNPBlockSize = 10000, inputBundleBlockSize = NA, genetic_map_file = , reference_haplotype_file = , reference_legend_file = , reference_sample_file = , reference_populations = NA, reference_phred = 20, reference_iterations = 40, reference_shuffleHaplotypeIterations = c(4, 8, 12, 16), output_filename = NULL, initial_min_hapProb = 0.2, initial_max_hapProb = 0.8, regenerateInputWithDefaultValues = FALSE, plotHapSumDuringIterations = FALSE, plot_shuffle_haplotype_attempts = FALSE, plotAfterImputation = TRUE, save_sampleReadsInfo = FALSE, gridWindowSize = NA, shuffle_bin_nSNPs = NULL, shuffle_bin_radius = 5000, keepSampleReadsInRAM = FALSE, useTempdirWhileWriting = FALSE, output_haplotype_dosages = FALSE, use_bx_tag = TRUE, bxTagUpperLimit = 50000)
[2023-02-07 13:23:57] Program start
[2023-02-07 13:23:57] Get and validate pos and gen
[2023-02-07 13:23:58] Done get and validate pos and gen
[2023-02-07 13:23:58] Get BAM sample names
[2023-02-07 13:24:00] Done getting BAM sample names
[2023-02-07 13:24:00] Generate inputs
[2023-02-07 13:24:13] Load and convert BAM 200 of 500
[2023-02-07 13:24:15] downsample sample 412 - 14 of 308810 reads removed
[2023-02-07 13:24:22] downsample sample 100 - 10 of 294278 reads removed
[2023-02-07 13:24:23] downsample sample 1468 - 12 of 265190 reads removed
[2023-02-07 13:24:26] Load and convert BAM 400 of 500
[2023-02-07 13:24:31] downsample sample 1218 - 10 of 240619 reads removed
[2023-02-07 13:24:31] downsample sample 220 - 8 of 310718 reads removed
[2023-02-07 13:24:33] downsample sample 1419 - 10 of 402569 reads removed
[2023-02-07 13:24:37] downsample sample 97 - 10 of 236219 reads removed
[2023-02-07 13:24:46] downsample sample 394 - 13 of 189521 reads removed
[2023-02-07 13:24:51] Load and convert BAM 100 of 500
Warning: The index file is older than the data file: /media/zhangyu/955d175c-5651-4fb7-b2b5-be56f76c69ae/dengzhao_data/blackgoat/lc/stitch/markdup/173.sorted.markdup.bam.bai
[2023-02-07 13:24:55] downsample sample 476 - 10 of 153079 reads removed
[2023-02-07 13:24:55] Load and convert BAM 300 of 500
[2023-02-07 13:25:18] downsample sample 353 - 13 of 225166 reads removed
[2023-02-07 13:25:23] Load and convert BAM 500 of 500
[2023-02-07 13:25:25] downsample sample 1196 - 16 of 315137 reads removed
[2023-02-07 13:25:36] downsample sample 1140 - 8 of 325763 reads removed
[2023-02-07 13:25:50] downsample sample 1169 - 14 of 372656 reads removed
[2023-02-07 13:26:06] Done generating inputs
[2023-02-07 13:26:06] Copying files onto tempdir
[2023-02-07 13:26:25] Done copying files onto tempdir
[2023-02-07 13:26:25] Generate allele count
[2023-02-07 13:26:57] Quantiles across SNPs of per-sample depth of coverage
[2023-02-07 13:26:57] 5% 25% 50% 75% 95%
[2023-02-07 13:26:57] 0.785 1.449 1.607 1.714 1.857
[2023-02-07 13:26:57] Done generating allele count
[2023-02-07 13:26:57] Outputting will be done in 51 blocks with on average 9840.2 SNPs in them
[2023-02-07 13:26:57] Begin parameter initialization
[2023-02-07 13:26:58] Done parameter initialization
[2023-02-07 13:26:58] Start EM
[2023-02-07 13:26:58] Number of samples: 500
[2023-02-07 13:26:58] Number of SNPs: 501850
[2023-02-07 13:26:58] Start of iteration 1
Error in priorSum_m[, s] : incorrect number of dimensions
Calls: STITCH -> completeSampleIteration -> calculate_updates
In addition: Warning message:
In mclapply(sampleRanges, mc.cores = nCores, FUN = subset_of_complete_iteration, : scheduled cores 2, 4, 7, 12, 16, 23, 35 did not deliver results, all values of the jobs will be affected
Execution halted

rwdavies commented 1 year ago

That's a bit confusing. So this bit

scheduled cores 2, 4, 7, 12, 16, 23, 35 did not deliver results, all values of the jobs will be affected

and the text above suggest that this function

subset_of_complete_iteration

where the HMM is run, threw the error. But I don't see the error message itself.

Is there any chance the sub-processes ran out of RAM? 48 is a lot of cores for 500,000 SNPs.
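As a back-of-the-envelope check for the RAM theory: STITCH forks one worker per core via mclapply, so peak memory grows roughly with nCores. The helper below is a made-up sketch, not part of STITCH; the per-worker figure is an assumption you would measure on your own data, e.g. by watching one worker's peak in top during the "Generate inputs" step.

```shell
# Hypothetical sizing helper: bound nCores by integer-dividing the RAM
# you can spend by the measured peak memory of a single STITCH worker.
max_cores() {
    avail_gb=$1       # total RAM budget, in GB
    per_worker_gb=$2  # measured peak of one forked worker, in GB
    echo $(( avail_gb / per_worker_gb ))
}

max_cores 128 5   # a 128 GB machine at ~5 GB per worker -> prints 25
```

If the bound comes out well below 48, lowering nCores (or running fewer samples per core range) is the first thing to try.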

rwdavies commented 1 year ago

Note: though this is no longer an issue for you, I am now re-trying this with a newer version. It is unlikely to fix your bug, but it is related to ensuring I have the latest version of the code going through the bioconda build environment:

https://github.com/bioconda/bioconda-recipes/pull/39209

Dengzhaoo commented 1 year ago

Thank you very much. I reduced the number of cores and that solved the problem.