File check up-front - Githubissues

choishingwan commented 7 years ago

Might want to check all the input files at the very beginning (especially the covariate file). It is rather annoying when the program errors out only after clumping and other procedures have already run.

choishingwan commented 7 years ago

Partly completed, except that the phenotype + covariate check still comes after clumping.

Might want to add a flag to the genotype class to indicate whether clumping and sorting have been done. Then we can shift the phenotype + covariate checking up front.

But then, when there are multiple phenotypes, we might want to check all of them up front? That'd require more coding.
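
To illustrate why checking every phenotype up front takes extra work, here is a small Python sketch. It is not PRSice code (PRSice itself is C++), and the function name, the sample IDs and the "NA" missing-value marker are all assumptions made for the example. It only shows that each phenotype can end up with a different set of usable samples, so each phenotype + covariate combination would need its own check.

```python
def usable_samples(phenotypes, covariates, missing="NA"):
    """Map each phenotype name to the samples usable for that phenotype.

    A sample counts as usable here if its phenotype value is present and
    none of its covariate values are missing. This rule is a simplifying
    assumption for illustration, not PRSice's actual logic.
    """
    result = {}
    for name, values in phenotypes.items():
        result[name] = sorted(
            sample
            for sample, value in values.items()
            if value != missing
            and sample in covariates
            and missing not in covariates[sample]
        )
    return result


# Hypothetical toy data: two phenotypes, three samples, two covariates.
phenotypes = {
    "height": {"s1": "170", "s2": "NA", "s3": "165"},
    "bmi": {"s1": "NA", "s2": "24.1", "s3": "22.0"},
}
covariates = {"s1": ["40", "1"], "s2": ["35", "2"], "s3": ["NA", "1"]}

print(usable_samples(phenotypes, covariates))
# prints {'height': ['s1'], 'bmi': ['s2']}
```

With two phenotypes the usable sample sets already differ, so a single up-front pass over the covariate file would not be enough.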

choishingwan commented 5 years ago

PRSice now reads its input files in the following order:

1. Check header of covariate file
2. Base summary statistics
3. Target sample information
4. Target SNP information
5. Reference sample information
6. Reference SNP information
7. Calculate MAF in target (and do filtering)
8. Calculate MAF in reference (and do filtering)
9. Read in region files (GTF, MSigDB, BED if used)
10. Check header of phenotype file
11. Clumping
12. Process phenotype and covariate file
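
The idea behind this ordering (cheap header checks first, expensive steps such as clumping later) can be sketched in Python. This is an illustration only, not how PRSice is implemented. The helper names `check_header` and `run_in_order`, and the column names FID, IID and Age, are assumptions made for the example.

```python
import io


def check_header(handle, required):
    """Read only the first line of a file and verify the required columns."""
    header = handle.readline().split()  # split on any run of whitespace
    missing = [name for name in required if name not in header]
    if missing:
        raise ValueError("missing column(s): " + ", ".join(missing))
    return header


def run_in_order(steps):
    """Run (name, callable) pairs in order.

    An exception raised by one step propagates immediately, so no later
    (possibly expensive) step is started after a failed check.
    """
    done = []
    for name, step in steps:
        step()
        done.append(name)
    return done


# Hypothetical covariate file content, held in memory for the example.
covariate = io.StringIO("FID IID Age Sex\nfam1 ind1 40 1\n")

steps = [
    ("check covariate header",
     lambda: check_header(covariate, ["FID", "IID", "Age"])),
    ("clumping", lambda: None),  # stand-in for an expensive step
]
print(run_in_order(steps))
```

If the covariate header were missing a required column, `check_header` would raise before the stand-in clumping step is reached.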

PRSice will terminate at whatever point it finds an ill-formed input file. Note that the phenotype and covariate checks come after clumping because we want to accommodate multiple phenotype inputs, which can lead to differences in covariate inclusion (due to phenotype NA etc). While we could move these checks up front, checking every phenotype-covariate combination would be time-consuming and not very practical, so we decided against it.

Users should try their best to ensure their input is correct. If they are uncertain and would like to use PRSice to test it, there are two good options: do a "dry run" of PRSice with the --no-clump option, or run PRSice with the --print-snp option. With the latter, if PRSice fails at the phenotype and covariate check, users can re-run it without repeating the full clumping by using --extract PRSice.snp (assuming --out PRSice was used in the first pass).
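
The two suggestions above could be wrapped in a small Python helper that only assembles the command lines. This is a sketch under assumptions: the executable name `PRSice_linux` and the file names are placeholders, and `--base` / `--target` simply stand in for whatever arguments you normally pass. Only `--no-clump`, `--print-snp`, `--out` and `--extract` are taken from the comment above.

```python
import shlex


def dry_run_command(prsice, common_args):
    """Command for a quick input check that skips clumping."""
    return [prsice, *common_args, "--no-clump"]


def two_pass_commands(prsice, common_args, out_prefix="PRSice"):
    """Commands for a first pass that prints the retained SNPs and a re-run.

    The re-run extracts the SNPs written by the first pass to
    <out_prefix>.snp, so the full clumping does not need to be repeated.
    The output prefix of the re-run is left to the caller.
    """
    first = [prsice, *common_args, "--print-snp", "--out", out_prefix]
    rerun = [prsice, *common_args, "--extract", f"{out_prefix}.snp"]
    return first, rerun


# Placeholder executable and arguments; substitute your own.
common = ["--base", "base.assoc", "--target", "target"]
first, rerun = two_pass_commands("PRSice_linux", common)

for command in (dry_run_command("PRSice_linux", common), first, rerun):
    # shlex.quote keeps the printed line safe to paste into a shell
    print(" ".join(shlex.quote(part) for part in command))
```

The helper prints the commands rather than running them, so they can be inspected first; the same lists could be handed to `subprocess.run` once the paths are real.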

choishingwan / PRSice

File check up-front #14