We've updated the simphenotype and index subcommands to support a new line type in the hap file: "R", which stands for repeats.
Usage in a sorted hap file (tests/data/basic.hap.gz):
# version 0.1.0
H 21 26928472 26941960 chr21.q.3365*1
R 21 26938353 26938400 21_26938353_STR
H 21 26938353 26938989 chr21.q.3365*11
H 21 26938989 26941960 chr21.q.3365*10
R 21 26939000 26939010 21_26938989_STR
R 21 26941880 26941900 21_26941880_STR
V chr21.q.3365*1 26928472 26928472 21_26928472_C_A C
V chr21.q.3365*1 26938353 26938353 21_26938353_T_C T
V chr21.q.3365*1 26940815 26940815 21_26940815_T_C C
V chr21.q.3365*1 26941960 26941960 21_26941960_A_G G
V chr21.q.3365*10 26938989 26938989 21_26938989_G_A A
V chr21.q.3365*10 26940815 26940815 21_26940815_T_C T
V chr21.q.3365*10 26941960 26941960 21_26941960_A_G A
V chr21.q.3365*11 26938353 26938353 21_26938353_T_C T
V chr21.q.3365*11 26938989 26938989 21_26938989_G_A A
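To illustrate how the "R" lines sit alongside the "H" and "V" lines, here is a minimal sketch that groups the lines of a hap file by their leading line-type character. It is not the haptools parser: `group_hap_lines` is a hypothetical helper, and the assumption that the fourth field after the line type is the repeat ID is taken only from the example above.

```python
from collections import defaultdict


def group_hap_lines(text):
    """Group hap-file lines by their leading line-type character.

    Hypothetical helper for illustration; haptools has its own parser.
    """
    groups = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and header/comment lines such as "# version 0.1.0"
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # fields[0] is the line type (H, R, or V); the rest are its columns
        groups[fields[0]].append(fields[1:])
    return groups


# A few lines copied from the example above
EXAMPLE = """\
# version 0.1.0
H 21 26928472 26941960 chr21.q.3365*1
R 21 26938353 26938400 21_26938353_STR
V chr21.q.3365*1 26928472 26928472 21_26928472_C_A C
"""

groups = group_hap_lines(EXAMPLE)
# Assumption: the ID is the fourth column after the line type
repeat_ids = [fields[3] for fields in groups["R"]]
print(repeat_ids)
```

Splitting on any whitespace keeps the sketch working whether the file is tab- or space-delimited.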
These changes come with additional changes to simphenotype's PhenoSimulator class, particularly its run() function. Instead of taking a list of haplotypes, run() now takes the full Haplotypes object along with the IDs of the haplotypes and repeats, which it uses to extract betas and genotypes.
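As a rough picture of what "extracting betas and genotypes by ID" can look like, the sketch below sums beta times genotype over a chosen set of haplotype and repeat IDs. It does not use the PhenoSimulator or Haplotypes classes: `simulate_phenotypes`, the dictionary layout, and all numbers are made up for illustration, and the real run() may do more than this (for example, adding noise).

```python
def simulate_phenotypes(genotypes, betas, effect_ids):
    """Sum beta * genotype over the chosen IDs for each sample.

    Hypothetical helper; genotypes maps an ID to one value per sample,
    and betas maps an ID to its effect size.
    """
    num_samples = len(next(iter(genotypes.values())))
    phenotypes = [0.0] * num_samples
    for effect_id in effect_ids:
        beta = betas[effect_id]
        for sample, dosage in enumerate(genotypes[effect_id]):
            phenotypes[sample] += beta * dosage
    return phenotypes


# Toy values for three samples (not real data)
genotypes = {
    "chr21.q.3365*1": [0, 1, 2],   # a haplotype ID from the hap file
    "21_26938353_STR": [1, 1, 0],  # a repeat ID from the hap file
}
betas = {"chr21.q.3365*1": 0.5, "21_26938353_STR": 0.25}

phenos = simulate_phenotypes(
    genotypes, betas, ["chr21.q.3365*1", "21_26938353_STR"]
)
print(phenos)
```

Passing the IDs separately from the data is what lets the same lookup serve both haplotypes and repeats.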
To use repeats in simphenotype, use the additional --repeats option.
Example:
Note that in the example, SNPs must still be present, so we cannot simulate based on repeats alone.