rwdavies / STITCH

STITCH - Sequencing To Imputation Through Constructing Haplotypes
http://www.nature.com/ng/journal/v48/n8/abs/ng.3594.html
GNU General Public License v3.0

get_and_initialize_from_reference, sample_haps_to_use, Negative probability #63

Closed: Deeeeen closed this issue 2 years ago

Deeeeen commented 2 years ago

Hi Robbie,

I am trying to run STITCH 1.6.6 with niterations=1 with a reference panel, but I keep running into this error:

[1] "Error in sample.int(length(x), size, replace, prob) : \n negative probability\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in sample.int(length(x), size, replace, prob): negative probability>

I dug into your code a little bit, and I think the error is coming from this function: https://github.com/rwdavies/STITCH/blob/8f6df131004bf2335701c001cd5a5710f05fde0d/STITCH/R/reference.R#L1062 Specifically, it happens when the function calls sort(sample()): the prob argument passed to sample() seems to contain a negative value. Can you provide some examples or an explanation of how this could happen? (I really can't figure out whether there is anything wrong with the files or parameters I feed into STITCH.)

Thanks!

Deeeeen commented 2 years ago

Figured it out... the reference panel I prepared had some issues.

rwdavies commented 2 years ago

Yay!

rwdavies commented 2 years ago

What kind of issues? STITCH does some checks on the reference panel files, e.g. that the SNPs in the legend file are in sorted order, but if this is something that it could check easily, I could add that in.

Deeeeen commented 2 years ago

I'm not super sure where exactly the problem is. My guess is that the genotypes I used to produce the reference panel files had a lot of missing sites. Hope this is helpful.

rwdavies commented 2 years ago

Ah OK, STITCH doesn't accept missing sites in the reference panel; it needs all of them to be specified.
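
For anyone who hits the same error, here is a rough pre-flight check that can be run on the reference haplotype file before starting STITCH. It is only a sketch and not part of STITCH. It assumes an IMPUTE-style haplotype file (no header, one row per SNP, one whitespace-separated column per haplotype, every entry 0 or 1), either plain text or gzip-compressed. The function names `open_text` and `find_bad_hap_entries` are made up for this illustration. Anything that is not exactly 0 or 1 (for example `NA`, `.`, or `-1`) is reported as a possible missing site.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "r")


def find_bad_hap_entries(hap_path, max_report=10):
    """Return up to max_report (row, column, value) tuples, 1-based,
    for entries in the haplotype file that are not exactly 0 or 1."""
    bad = []
    with open_text(hap_path) as handle:
        for row, line in enumerate(handle, start=1):
            for column, value in enumerate(line.split(), start=1):
                if value not in ("0", "1"):
                    bad.append((row, column, value))
                    if len(bad) >= max_report:
                        return bad
    return bad


# Hypothetical usage (the file name is an assumption):
#   for row, column, value in find_bad_hap_entries("ref.chr1.hap.gz"):
#       print(f"SNP row {row}, haplotype column {column}: unexpected value {value!r}")
```

An empty result means every entry in the file is 0 or 1. If anything is reported, the missing sites would need to be filled in or the affected SNPs dropped (from the matching legend file as well) before passing the panel to STITCH.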