Open rpauly opened 9 months ago
Thank you for your interest!
Recently we having been updating the software and forgot to update the pip source.
Could you try pip install -U admix-kit
and try the command again? Thanks
Can I use the conda package instead, using below: conda env create --file environment.yaml conda activate admix-kit
Yes this is another option - just making sure that you are cloning the master branch
Thanks another question, does this mean that each simulated individual will have CEU: 40%, YRI: 10% and PEL: 50%?
No, these parameters mean that the simulated individuals will have these proportions on average, but the ancestry proportion for each individual may vary.
thank you...so if admix-prop '{"CEU": 0.2, "YRI": 0.8}'...all the samples would have a higher proportion of YRI proportions correct? I am asking because, I am trying to simulate some data and I am trying to get individuals with the same admix proportions.
Yes the samples would have a overall higher proportion of YRI proportions - although under rare cases you may encounter some individuals with YRI < 50% purely due to the randomness.
You can simulate a bunch of individuals and then filter based on ancestry proportions afterwards
thank you...that is very interesting...is there a command I can use to filter the individuals based on ancestry proportions?
hey unfortunately there is not a command to filter - you have to manually calculate genetic ancestry and then filter by indexing those selected individuals
Thanks...so I was trying to run all the chromosomes through the pipeline and I got the first two steps: Step1:
for CHROM in {9..22}; do
source admix_env/bin/activate
admix subset-hapmap3 --pfile /data/1kg-ref-hg38/pgen/all_hg38 --build hg38 --chrom ${CHROM} --out hm3_chrom${CHROM}.snp
done
Step2:
for CHROM in {10..22}; do
plink2 --pfile /data/1kg-ref-hg38/pgen/all_hg38 --extract hm3_chrom${CHROM}.snp --make-pgen --out 1kg-ref/1kg-ref_chrom${CHROM} --allow-extra-chr
done
Step3:
admix haptools-simu-admix \
--pfile ${OUT_DIR}/1kg-ref \
--admix-prop '{"CEU": 0.2, "YRI": 0.8}' \
--pop-col Population \
--mapdir data/1kg-ref-${BUILD}/metadata/genetic_map/ \
--n-gen 10 \
--n-indiv 10000 \
--out ${OUT_DIR}/CEU-YRI
But Step3 gives me an error in the mapdir step, How do I generate genetic maps? Thanks, Rini
Could you type ls data/1kg-ref-${BUILD}/metadata
to see what is in there? It should be already downloaded using the get-1kg-ref
command.
The metadata is blank
Ok can you try this to download the genetic map
root_dir=data/1kg-ref-hg38
mkdir -p ${root_dir}/metadata/genetic_map/raw && cd ${root_dir}/metadata/genetic_map/raw
wget https://bochet.gcc.biostat.washington.edu/beagle/genetic_maps/plink.GRCh38.map.zip
unzip plink.GRCh38.map.zip
cd ../../../../../
# move raw files to out/metadata/genetic_map
for chrom in {1..22}; do
cp ${root_dir}/metadata/genetic_map/raw/plink.chr${chrom}.GRCh38.map ${root_dir}/metadata/genetic_map/chr${chrom}.map
done
# clean up
rm -rf ${root_dir}/metadata/genetic_map/raw/
Awesome that works, so for:
step3:
admix haptools-simu-admix --pfile 1kg-ref/1kg-ref_chrom10 --admix-prop '{"CEU": 0.2,"YRI": 0.8}' --pop-col Population --mapdir data/1kg-ref/metadata/genetic_map/ --n-gen 10 --n-indiv 11 --out data/example_data/CEU-YRI_n_gene_simulations_n_gen_5
In the above command, I have used only chromsome 10. Is there a way I can combine all the chromosomes pgen/pvar/psam into a single pgen/pvar/psam file and run the above command just once? Thanks!
Unfortunately you have to run jobs seperately for each chromosome. Running in parallel will speed up your analysis if you have access to a computing cluster
Thanks for all your help! It works. One last question, is there a way to calculate the genetic ancestry on the admixed sample? For instance, I saw that " .af_per_anc()" calculates allele frequencies per ancestry backgrounds. How can I use this ?
Apology for the late reply.
.af_per_anc()
is used for calculating the local ancestry-specific alllele frequency.
You can use admix lanc-count --lanc [prefix].lanc --out [prefix].lanc_count
to compute the number of alleles from each genetic ancestry
Hi, I am trying the command: admix haptools-simu-admix \ --pfile data/example_data/1kg-ref \ --admix-prop '{"CEU": 0.4, "YRI": 0.1, "PEL": 0.5}' \ --pop-col Population \ --mapdir data/1kg-ref-${BUILD}/metadata/genetic_map/ \ --n-gen 10 \ --n-indiv 10000 \ --out data/example_data/CEU-YRI-PEL
But, I get an error: ERROR: Cannot find key: haptools-simu-admix Usage: admix <group|command> available groups: fire available commands: assoc | append_snp_info | calc_pgs | calc_partial_pgs | grm | log_params | get_1kg_ref | select_admix_indiv | simulate_admix_pheno | simulate_pheno | lanc | lanc_convert | lanc_count | prune | pca | liftover | pfile_align_snp | pfile_merge_indiv | pfile_freq_within | subset_hapmap3 | subset_pop_indiv | hapgen2 | admix_simu | download_dependency | plot_joint_pca | admix_grm | admix_grm_merge | genet_cor | admix_grm_rho | estimate_genetic_cor | summarize_genet_cor | meta_analyze_genet_cor | cli
For detailed information on this command, run: admix --help