Closed choishingwan closed 5 years ago
Partly completed except that the phenotype + covariate check is still behind the clumping.
Might want to add a flag to the genotype class to indicate whether clumping and sorting have been done. Then we can shift the phenotype + covariate checking up front.
But then, when there are multiple phenotypes, we might want to check all of them up front? That would require more coding.
PRSice now shifts the file reading sequence to
PRSice will terminate if, at any point, an input file is ill-formed. Note that the phenotype and covariate checks come after clumping because we want to accommodate multiple phenotype inputs (which might lead to differences in covariate inclusion due to phenotype NA, etc.). While we could move the file check up front, that would be inefficient and impractical (checking all phenotype and covariate combinations is time-consuming), so we decided against it. Users should try their best to ensure their input is correct. If they are uncertain and would like to use PRSice to test it, a good approach is a "dry run" of PRSice with the --no-clump option, or to run PRSice with the --print-snp option so that, if PRSice fails in the phenotype and covariate check, users can still re-run PRSice without full clumping by using --extract PRSice.snp (assuming --out PRSice is used in the first pass).
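For anyone who wants to try this, here is a rough sketch of the three invocations described above. The flags --no-clump, --print-snp, --extract and --out are the ones mentioned in this thread; the executable name, the input file names, the second output prefix, and the --base/--target/--pheno/--cov flags are assumptions for illustration, so check them against the help output of your PRSice version. The script only prints the commands and does not run PRSice.

```shell
#!/bin/sh
# Assumed executable name and hypothetical input files; replace with your own.
PRSICE_BIN="${PRSICE_BIN:-PRSice}"
# --base/--target/--pheno/--cov are assumed here; verify them for your version.
COMMON="--base base.assoc --target target --pheno pheno.txt --cov cov.txt"

# Dry run: skip clumping so a phenotype/covariate problem surfaces quickly.
dry_run() {
    echo "$PRSICE_BIN $COMMON --no-clump --out PRSice_dry"
}

# First pass: full clumping; --print-snp writes the retained SNPs to
# PRSice.snp because --out PRSice is used.
first_pass() {
    echo "$PRSICE_BIN $COMMON --print-snp --out PRSice"
}

# Re-run after fixing the phenotype/covariate files: restrict the analysis
# to the SNPs kept in the first pass, avoiding another full clumping.
second_pass() {
    echo "$PRSICE_BIN $COMMON --extract PRSice.snp --out PRSice_rerun"
}

dry_run
first_pass
second_pass
```

Copy whichever printed command you need. The second pass only helps if the first pass got far enough to write PRSice.snp before failing.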
Might want to check all the file inputs at the very beginning (especially the covariate file). It is rather annoying that the program errors out after clumping and other procedures.