rwdavies / STITCH

STITCH - Sequencing To Imputation Through Constructing Haplotypes
http://www.nature.com/ng/journal/v48/n8/abs/ng.3594.html
GNU General Public License v3.0

Error in rep(" ", y - 3) : invalid 'times' argument #7

Closed Zilong-Li closed 5 years ago

Zilong-Li commented 6 years ago

[2017-08-27 20:50:59] WARNING - sample h8uwbu44ab has no informative reads. It is being given random reads. Consider removing from analysis
Failed to populate reference for id 0
Unable to fetch reference #0 148202215..151502370
Failure to decode slice
[2017-08-27 20:51:00] WARNING - sample hbthn6fk49 has no informative reads. It is being given random reads. Consider removing from analysis
Failed to populate reference for id 0
Unable to fetch reference #0 149260081..153657259
Failure to decode slice
[2017-08-27 20:51:01] WARNING - sample hbzbtrlp6x has no informative reads. It is being given random reads. Consider removing from analysis
[2017-08-27 20:51:39] Done generating inputs
[2017-08-27 20:51:39] Copying files onto tempdir
[2017-08-27 20:52:48] Done copying files onto tempdir
[2017-08-27 20:52:48] Generate allele count
[2017-08-27 20:52:51] Quantiles across SNPs of per-sample depth of coverage
Error in rep(" ", y - 3) : invalid 'times' argument
Calls: STITCH ... print_message -> message -> paste0 -> paste0 -> paste0 -> paste0
Execution halted

Hi, I have installed STITCH version 1.3.6 and ran the example script successfully. However, when running STITCH on my own data, I got the error above. I can't figure out where the invalid 'times' argument comes from. What's more, when I specify a tempdir instead of using the default tempdir(), STITCH only writes out the following messages and nothing more, although the program keeps running:

Loading required package: parallel
[2017-08-27 11:12:43] Program start
[2017-08-27 11:12:43] Get and validate pos and gen
[2017-08-27 11:12:45] Done get and validate pos and gen
[2017-08-27 11:12:48] Get CRAM sample names

This problem doesn't appear with the example data either. I think my company's Linux servers may have problems dealing with large CRAM samples.

I look forward to your reply.

rwdavies commented 6 years ago
  1. The "times" argument belongs to the function rep(x, times), e.g. rep(x = 3, times = 4) gives the number 3 four times. However, the most recent version of the codebase no longer has the rep(" ", y - 3) line; it was replaced with newer, more robust code inside print_allele_count (which has a few tests in tests/testthat/test-basics.R, and I just added one more to make sure I didn't screw something up internally). If you're building a pre-release version using ./scripts/build-and-install.R, can you make sure you're on the latest commit?

  2. Is the lack of reads expected? The warning says that sample hbthn6fk49 has no reads in the region of interest. I don't recognize some of the messages being printed, and I assume they come from htslib. What command line are you running, and are you giving the CRAM reference? The messages in question:

     [2017-08-27 20:51:00] WARNING - sample hbthn6fk49 has no informative reads. It is being given random reads. Consider removing from analysis
     Failed to populate reference for id 0
     Unable to fetch reference #0 149260081..153657259
     Failure to decode slice

  3. The tempdir one is quite interesting. To confirm, is this correct? Are you using a small number of CRAM files?

     example data, default tempdir - OK
     example data, specified tempdir - OK
     your data, default tempdir - OK
     your data, specified tempdir - problem

     The weird thing is that the part where the program is snagging (Get CRAM sample names) doesn't use the tempdir argument.
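
To make point 1 concrete, here is a minimal R sketch of how a call like rep(" ", y - 3) fails once y drops below 3, plus one way to guard against it. The helper pad_label is hypothetical and written only for illustration; it is not the code inside print_allele_count, and the value y = 2 is an assumed example, not a value taken from the failing run.

```r
# rep() repeats its first argument `times` times.
rep(x = 3, times = 4)

# A negative `times` is rejected. With y = 2 (an assumed value),
# y - 3 is -1, so the call below raises the error from the log.
y <- 2
res <- tryCatch(
  rep(" ", y - 3),
  error = function(e) conditionMessage(e)
)
print(res)

# Hypothetical helper (not from the STITCH source): pad a label to a
# fixed width, clamping the number of spaces at zero so that a label
# longer than `width` never yields a negative repeat count.
pad_label <- function(label, width) {
  n_spaces <- max(width - nchar(label), 0)
  paste0(label, strrep(" ", n_spaces))
}

print(pad_label("ab", 5))
print(pad_label("abcdef", 3))
```

Clamping the repeat count is only one possible fix; formatting helpers such as formatC or sprintf with a width could achieve similar padding without computing a count by hand.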