KangchengHou / admix-kit

Toolkit for analyzing genetics data from admixed populations
https://kangchenghou.github.io/admix-kit

Can't Download The Reference #28

Open TinaYibingChen opened 5 months ago

TinaYibingChen commented 5 months ago

Describe the bug

Hi, I downloaded the package and tried to run the simulation example code as written. After the setup step (in the "prepare source data" section), the command admix get-1kg-ref --dir ${REF_DATA_DIR} --build ${BUILD} ran into an error. Do I need to download anything else or provide more information beyond copying and pasting the example? Thank you very much!

To Reproduce

BUILD=hg19
N_INDIV=100000
CHROM=22
N_GEN=8

REF_DATA_DIR=data/1kg-ref-${BUILD}
RAW_DATA_DIR=data/raw
GENO_DATA_DIR=data/geno
mkdir -p ${RAW_DATA_DIR}
mkdir -p ${GENO_DATA_DIR}
admix get-1kg-ref --dir ${REF_DATA_DIR} --build ${BUILD}

Error Message

CalledProcessError: Command 'b'# genome build to use\nBUILD=hg19\n# number of admixed individuals to simulate\nN_INDIV=100000\n# chromosome \nCHROM=22\n# number of generations\nN_GEN=8\n\n# setup\nREF_DATA_DIR=data/1kg-ref-${BUILD}\nRAW_DATA_DIR=data/raw\nGENO_DATA_DIR=data/geno\nmkdir -p ${RAW_DATA_DIR}\nmkdir -p ${GENO_DATA_DIR}\n\n# Create the necessary directories if they do not already exist\nmkdir -p ${RAW_DATA_DIR}\nmkdir -p ${GENO_DATA_DIR}\n\nadmix get-1kg-ref --dir ${REF_DATA_DIR} --build ${BUILD}\n\n'' returned non-zero exit status 1.

KangchengHou commented 5 months ago

Hi, can you confirm where you typed this command: in Python or in a shell? A screenshot would be helpful.

TinaYibingChen commented 5 months ago

Here are the screenshots, thanks! image (1) image (2)

KangchengHou commented 5 months ago

Could you try this within a shell? That should provide more information.
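A minimal sketch of one way to capture the complete output when running a command in a shell, so that nothing is lost between the terminal and the issue report. The helper name `run_logged` and the log file name are illustrative assumptions, not part of admix-kit.

```shell
# Hypothetical helper (not part of admix-kit): run a command, save its
# stdout and stderr to a log file, print the log, and return the
# command's own exit status.
run_logged() {
    _log_file=$1
    shift
    _status=0
    # 2>&1 sends stderr to the same file as stdout, so the log holds
    # the full error text in the order it was produced.
    "$@" >"$_log_file" 2>&1 || _status=$?
    # Show the captured output once the command has finished.
    cat "$_log_file"
    return "$_status"
}
```

With the variables from the setup step defined, this could be used as `run_logged get-1kg-ref.log admix get-1kg-ref --dir "${REF_DATA_DIR}" --build "${BUILD}"`, and the resulting log file pasted into the issue as text.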

TinaYibingChen commented 5 months ago

Sure! I tried that line in a shell.

Screenshot 2024-06-04 17 10 01 Screenshot 2024-06-04 17 10 09
KangchengHou commented 5 months ago

The following line shows that the argument build = hg19 was not received correctly. Could you try again with the correct input arguments?

image
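A quick way to rule out an unset or empty variable before calling admix is to print each one and stop if it is missing. This is only a sketch: the helper name `require_var` is hypothetical, and it checks that the shell variables exist, not that admix accepts their values.

```shell
# Hypothetical helper (not part of admix-kit): print NAME=value for a
# shell variable, or report an error if it is unset or empty.
require_var() {
    _name=$1
    # Indirect lookup: read the value of the variable whose name is in $_name.
    eval "_value=\${$_name:-}"
    if [ -z "$_value" ]; then
        echo "error: $_name is not set" >&2
        return 1
    fi
    echo "$_name=$_value"
}
```

For example, `require_var BUILD && require_var REF_DATA_DIR` run in the same shell session as the setup lines would show exactly what `--build` and `--dir` are about to receive.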
TinaYibingChen commented 5 months ago

I have tried other builds (hg38) in your example code but encountered the same error. Could you please provide more details about what the correct input arguments should be? Thank you very much!

image
KangchengHou commented 5 months ago

The input commands are correct, but could you run these commands in a terminal again so we can see the complete error message?

TinaYibingChen commented 5 months ago

Yes, this is the complete error message in my terminal.

image image
KangchengHou commented 5 months ago

Hi @TinaYibingChen

It looks like it is trying to download plink2 (https://www.cog-genomics.org/plink/2.0/) but could not access the link.

Alternatively, you can download plink2 manually and add the directory containing the binary to your $PATH (you can check that plink2 is in $PATH by typing plink2 on the command line). Feel free to let me know if you have further questions.
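The PATH check described above can be sketched as a small shell function. The function name `check_tool` and the directory in the comment are placeholders chosen for illustration; substitute wherever the plink2 binary was actually unpacked.

```shell
# Hypothetical helper (not part of admix-kit): report whether a program
# can be found through the current PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found in PATH" >&2
        return 1
    fi
}

# To make a manually downloaded binary visible, prepend its directory
# to PATH (the directory below is a placeholder, not a real location):
#   export PATH="/path/to/plink2-directory:$PATH"
```

Running `check_tool plink2` after updating PATH should print the location of the binary; if it reports that plink2 is not found, the directory added to PATH is probably not the one that contains the executable.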

TinaYibingChen commented 5 months ago

Hi @KangchengHou

Thank you so much for your instructions! I successfully downloaded plink2 (PLINK v2.00a6 M1 (9 Jun 2024)) and added the binary file to the $PATH. But I got the following when I used the same input as before and ran admix get-1kg-ref --dir ${REF_DATA_DIR} --build ${BUILD}

image

TinaYibingChen commented 5 months ago

Hi, I have a follow-up question about how to use HAPGEN2 on a MacBook. Thank you for taking the time to look at it.

image

This is what I tried to duplicate. I got the error below.

(base) apple@ChendeMacBook-Air admix-kit % for pop in CEU YRI; do
    admix hapgen2 \
        --pfile ${RAW_DATA_DIR}/${pop} \
        --chrom ${CHROM} \
        --n-indiv ${N_INDIV} \
        --out ${GENO_DATA_DIR}/${pop} \
        --build ${BUILD}
done
2024-06-24 16:51:50 [info ] Received parameters: hapgen2 --pfile=data/raw/CEU --n_indiv=100000 --out=data/geno/CEU --build=hg38 --chrom=22
PLINK v2.00a6 AVX2 (9 Jun 2024) www.cog-genomics.org/plink/2.0/
(C) 2005-2024 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to data/geno/CEU.hapgen2tmpdata/hapgen2_data.log.
Options in effect:
  --chr 22
  --export hapslegend
  --out data/geno/CEU.hapgen2tmpdata/hapgen2_data
  --pfile data/raw/CEU

Start time: Mon Jun 24 16:51:50 2024
16384 MiB RAM detected; reserving 8192 MiB for main workspace.
Using up to 8 compute threads.
179 samples (92 females, 87 males; 122 founders) loaded from data/raw/CEU.psam.
15346 variants loaded from data/raw/CEU.pvar.
2 categorical phenotypes loaded.
Writing data/geno/CEU.hapgen2tmpdata/hapgen2_data.legend ... done.
Writing data/geno/CEU.hapgen2tmpdata/hapgen2_data.haps ... done.
Writing data/geno/CEU.hapgen2tmpdata/hapgen2_data.sample ... done.
End time: Mon Jun 24 16:51:50 2024
2024-06-24 16:51:50 [info ] genetic_map found at /opt/anaconda3/lib/python3.11/site-packages/admix/../.admix_cache/data/genetic_map/genetic_map_hg38_withX.txt.gz.
Traceback (most recent call last):
  File "/opt/anaconda3/bin/admix", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/admix/cli/__init__.py", line 39, in cli
    fire.Fire()
  File "/opt/anaconda3/lib/python3.11/site-packages/fire/core.py", line 143, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/fire/core.py", line 477, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/admix/cli/_ext.py", line 227, in hapgen2
    admix.tools.hapgen2(
  File "/opt/anaconda3/lib/python3.11/site-packages/admix/tools/_standalone.py", line 93, in hapgen2
    hapgen2 = get_dependency("hapgen2")
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/admix/tools/_utils.py", line 137, in get_dependency
    raise ValueError(f"Unsupported platform {platform}")
ValueError: Unsupported platform darwin
2024-06-24 16:51:56 [info ] Received parameters: hapgen2 --pfile=data/raw/YRI --n_indiv=100000 --out=data/geno/YRI --build=hg38 --chrom=22
PLINK v2.00a6 AVX2 (9 Jun 2024) www.cog-genomics.org/plink/2.0/
(C) 2005-2024 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to data/geno/YRI.hapgen2tmpdata/hapgen2_data.log.
Options in effect:
  --chr 22
  --export hapslegend
  --out data/geno/YRI.hapgen2tmpdata/hapgen2_data
  --pfile data/raw/YRI

Start time: Mon Jun 24 16:51:56 2024
16384 MiB RAM detected; reserving 8192 MiB for main workspace.
Using up to 8 compute threads.
178 samples (81 females, 97 males; 120 founders) loaded from data/raw/YRI.psam.
15346 variants loaded from data/raw/YRI.pvar.
2 categorical phenotypes loaded.
Writing data/geno/YRI.hapgen2tmpdata/hapgen2_data.legend ... done.
Writing data/geno/YRI.hapgen2tmpdata/hapgen2_data.haps ... done.
Writing data/geno/YRI.hapgen2tmpdata/hapgen2_data.sample ... done.
End time: Mon Jun 24 16:51:56 2024
2024-06-24 16:51:56 [info ] genetic_map found at /opt/anaconda3/lib/python3.11/site-packages/admix/../.admix_cache/data/genetic_map/genetic_map_hg38_withX.txt.gz.
Traceback (most recent call last):
  File "/opt/anaconda3/bin/admix", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/admix/cli/__init__.py", line 39, in cli
    fire.Fire()
  File "/opt/anaconda3/lib/python3.11/site-packages/fire/core.py", line 143, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/fire/core.py", line 477, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/admix/cli/_ext.py", line 227, in hapgen2
    admix.tools.hapgen2(
  File "/opt/anaconda3/lib/python3.11/site-packages/admix/tools/_standalone.py", line 93, in hapgen2
    hapgen2 = get_dependency("hapgen2")
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/admix/tools/_utils.py", line 137, in get_dependency
    raise ValueError(f"Unsupported platform {platform}")
ValueError: Unsupported platform darwin
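Both tracebacks end in the same place: get_dependency("hapgen2") raises ValueError for the platform darwin, which is the platform name Python reports on macOS. The PLINK export step succeeds; only the HAPGEN2 dependency lookup fails. A minimal sketch of checking the platform before starting the loop is shown below. It assumes that lowercasing the output of `uname -s` gives the same name admix sees, and the helper name `normalize_platform` is hypothetical. Which platforms admix does accept for hapgen2 is not shown in this thread, so the sketch only flags the one case the traceback confirms.

```shell
# Hypothetical helper (not part of admix-kit): lowercase a kernel name
# such as the output of `uname -s` (for example "Darwin" -> "darwin").
normalize_platform() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

platform="$(normalize_platform "$(uname -s)")"
if [ "$platform" = "darwin" ]; then
    # Based only on the traceback above: the installed admix version
    # raised "Unsupported platform darwin" when looking up hapgen2.
    echo "warning: platform is darwin; admix hapgen2 failed on this platform in the run above" >&2
fi
```

Placing a check like this before the `for pop in CEU YRI` loop avoids repeating the PLINK export for each population only to hit the same error afterwards.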