BenjaminPeter / admixfrog


KeyError: "['NEA'] not in index" #1

Closed ma-diroma closed 3 years ago

ma-diroma commented 3 years ago

Hi,

I would like to use admixfrog on my samples. I am testing it with known samples from the 1240K SNPs datasets (https://reich.hms.harvard.edu/allen-ancient-dna-resource-aadr-downloadable-genotypes-present-day-and-ancient-dna-data) so my input is in eigenstrat format.

It is not clear to me whether I can use the preliminary analysis steps to generate target and reference files when my data is in this format.

However, I tried to run a command similar to the one you suggest in the quickstart:

admixfrog --gfile v44.3_1240K_public --target UstIshim_snpAD.DG --states NEA=Vindija.DG+Altai.DG YRI=Yoruba.DG Denisova.DG --cont YRI --out quickstart

I got this error:

Traceback (most recent call last):
  File "/home/student6/anaconda3/envs/python38/bin/admixfrog", line 8, in <module>
    sys.exit(run())
  File "/home/student6/anaconda3/envs/python38/lib/python3.8/site-packages/admixfrog/interface.py", line 454, in run
    run_admixfrog(**V, **algo_pars, **geno_pars,
  File "/home/student6/anaconda3/envs/python38/lib/python3.8/site-packages/admixfrog/admixfrog.py", line 402, in run_admixfrog
    df, df, sex, tot_n_snps = load_admixfrog_data(target_file = target_file,
  File "/home/student6/anaconda3/envs/python38/lib/python3.8/site-packages/admixfrog/admixfrog.py", line 239, in load_admixfrog_data
    df = read_geno_ref(fname=geno_File, pops=state_dict,
  File "/home/student6/anaconda3/envs/python38/lib/python3.8/site-packages/admixfrog/geno_io.py", line 173, in read_geno_ref
    Y = read_geno(*args, **kwargs)
  File "/home/student6/anaconda3/envs/python38/lib/python3.8/site-packages/admixfrog/geno_io.py", line 107, in read_geno
    Y = Y[set(pops.values())]
  File "/home/student6/anaconda3/envs/python38/lib/python3.8/site-packages/pandas/core/frame.py", line 3030, in __getitem__
    indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
  File "/home/student6/anaconda3/envs/python38/lib/python3.8/site-packages/pandas/core/indexing.py", line 1254, in _get_listlike_indexer
    indexer, keyarr = ax._convert_listlike_indexer(key)
  File "/home/student6/anaconda3/envs/python38/lib/python3.8/site-packages/pandas/core/indexes/multi.py", line 2568, in _convert_listlike_indexer
    raise KeyError(f"{keyarr[mask]} not in index")
KeyError: "['NEA'] not in index"

Could you please help me?

Thanks.

BW, Maria Angela

BenjaminPeter commented 3 years ago

Hi Maria, could you double-check that the samples you are using are indeed present in your input file? Also note that support for reading geno files is quite experimental (i.e. expect bugs), and admixfrog reads the entire file into memory, so either run this on a machine with a lot of RAM or subset the input file first.
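One way to do the check suggested above is to compare the names from the command line against the `.ind` file that accompanies the EIGENSTRAT data. The following is only a rough sketch and not part of admixfrog: the helper names (`parse_states`, `read_ind_labels`, `missing_labels`) are hypothetical, the reading of `--states` entries (`STATE=label+label`, with a bare entry naming both state and label) is inferred from the command in this thread, and because the thread does not say whether admixfrog matches on sample IDs or population labels, the sketch accepts either.

```python
from pathlib import Path


def parse_states(specs):
    """Split entries like 'NEA=Vindija.DG+Altai.DG' into {state: [labels]}.

    Assumption: a bare entry such as 'Denisova.DG' is both the state
    name and the single label it refers to.
    """
    states = {}
    for spec in specs:
        name, _, members = spec.partition("=")
        states[name] = members.split("+") if members else [name]
    return states


def read_ind_labels(ind_path):
    """Collect sample IDs and population labels from an EIGENSTRAT .ind file.

    Each row of an .ind file has three whitespace-separated columns:
    sample ID, sex, population label.
    """
    sample_ids, populations = set(), set()
    for line in Path(ind_path).read_text().splitlines():
        fields = line.split()
        if len(fields) >= 3:
            sample_ids.add(fields[0])
            populations.add(fields[2])
    return sample_ids, populations


def missing_labels(requested, ind_path):
    """Return the requested names found neither as sample ID nor as population."""
    sample_ids, populations = read_ind_labels(ind_path)
    known = sample_ids | populations
    return sorted(label for label in requested if label not in known)


# Names taken from the command in this thread.
states = parse_states(["NEA=Vindija.DG+Altai.DG", "YRI=Yoruba.DG", "Denisova.DG"])
requested = ["UstIshim_snpAD.DG"] + [
    label for labels in states.values() for label in labels
]
```

Calling `missing_labels(requested, "v44.3_1240K_public.ind")` would then list any name from the command that does not appear in the file. The same list of matching rows could serve as the starting point for subsetting the dataset before loading it.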

ma-diroma commented 3 years ago

Hi Ben,

Thanks for your prompt reply and suggestions. I have solved it!

Best wishes, Maria Angela