AlexTISYoung / snipar

Imputation of parental genotypes, inference of sibling IBD segments, family based GWAS, and polygenic score analyses.
MIT License

KeyError: 'InfType' in impute.py #28

Closed AnnabelPerry closed 1 year ago

AnnabelPerry commented 1 year ago

Hello, I am attempting to run impute.py in a conda environment with Python version 3.9.16, pandas version 1.1.4. I am encountering the following error:

2023-06-27 14:22:47,023 INFO impute - main: creating pedigree ...
2023-06-27 14:22:47,106 INFO preprocess_data - create_pedigree: loaded kinship file
Traceback (most recent call last):
  File "/home/anp9168/anaconda3/envs/sniparEnv/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 2895, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'InfType'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/anp9168/anaconda3/envs/sniparEnv/bin/impute.py", line 432, in <module>
    main(args)
  File "/home/anp9168/anaconda3/envs/sniparEnv/bin/impute.py", line 239, in main
    pedigree = create_pedigree(args.king, args.agesex)
  File "/home/anp9168/anaconda3/envs/sniparEnv/lib/python3.9/site-packages/snipar/imputation/preprocess_data.py", line 93, in create_pedigree
    mz_kin = kinship.loc[kinship['InfType']=='Dup/MZ']
  File "/home/anp9168/anaconda3/envs/sniparEnv/lib/python3.9/site-packages/pandas/core/frame.py", line 2906, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/home/anp9168/anaconda3/envs/sniparEnv/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 2897, in get_loc
    raise KeyError(key) from err
KeyError: 'InfType'

Here is the code I ran:

source activate sniparEnv
unset PYTHONPATH

impute.py -c --ibd IBD_Chr@ --bgen chr@ --out Imputed_Chr@ --king FirstDegreeKING_forImputation.kin0 --agesex FirstDegreeAgeSex_forImputation.txt

Here is a description of my inputs:

IBD_Chr@: IBD information generated using the ibd.py command-line script.

chr@: phased chromosomal information.

FirstDegreeKING_forImputation.kin0: This file is derived from a KING output file. The original was generated by a collaborator and only had values for the "ID1", "ID2", "HetHet", "IBS0", "Kinship", and "InfType" columns, so I restricted the file to just the columns requested in the snipar documentation (https://snipar.readthedocs.io/en/latest/scripts.html#id2) for the impute.py --king flag: FID1 ID1 FID2 ID2 InfType. Both the FID1 and FID2 columns are filled with NAs. I also restricted the file to only first-degree relatives (Kinship >= 0.177) to save space. When I got the error the first time, I double-checked that the file was separated by single spaces. The error persisted. Since the original InfType column contained only 'PO' and 'FS', I replaced the InfType values of all pairs with Kinship > 0.354 with 'Dup/MZ' and re-ran the code. The error still persists.

FirstDegreeAgeSex_forImputation.txt: Describes the age and sex of all individuals in FirstDegreeKING_forImputation.kin0. The columns are "FID", "IID", "FATHER_ID", "MOTHER_ID", "sex", "age". The "FID" column contains all NAs, while the FATHER_ID column is NA unless the individual in the IID column has a PO relationship with a male who is at least 12 years older. Likewise, the MOTHER_ID column is NA unless the individual in the IID column has a PO relationship with a female who is at least 12 years older.

AlexTISYoung commented 1 year ago

I think this may be because your file is formatted with a whitespace delimiter. The KING output is tab-separated.
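
For readers hitting the same error, here is a minimal sketch of why a space-delimited file can produce KeyError: 'InfType'. It assumes the kinship file is parsed with a tab delimiter, which this thread suggests but does not show; the two-row sample data is hypothetical and the snipar code itself is not reproduced here.

```python
import io

import pandas as pd

# Hypothetical stand-in for a space-delimited .kin0 file.
space_delimited = (
    "FID1 ID1 FID2 ID2 InfType\n"
    "NA a1 NA a2 FS\n"
)

# Assumption: the reader expects tabs. With no tabs in the file,
# the whole header line becomes a single column name.
as_tab = pd.read_csv(io.StringIO(space_delimited), sep="\t")
print(list(as_tab.columns))          # ['FID1 ID1 FID2 ID2 InfType']
print("InfType" in as_tab.columns)   # False, so as_tab['InfType'] raises KeyError

# Parsing the same text on runs of whitespace recovers the five columns.
as_ws = pd.read_csv(io.StringIO(space_delimited), sep=r"\s+")
print(list(as_ws.columns))           # ['FID1', 'ID1', 'FID2', 'ID2', 'InfType']
```

Printing the column list of the loaded file is a quick way to check whether the delimiter is the problem before changing anything else.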

AlexTISYoung commented 1 year ago

I realize the documentation says otherwise - that's an error.

AnnabelPerry commented 1 year ago

Thanks! Switching the files to tab separation worked.
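
One way to do that conversion is sketched below, using only the Python standard library. The helper name to_tab_separated and the output file name are hypothetical, and the sketch assumes no field contains embedded spaces or is left empty, since splitting on whitespace would otherwise shift columns.

```python
def to_tab_separated(src, dst):
    """Rewrite a whitespace-delimited text file with single tabs between fields."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fields = line.split()  # split on any run of spaces or tabs
            if fields:             # drop blank lines
                fout.write("\t".join(fields) + "\n")


# Example call (the output file name is illustrative only):
# to_tab_separated("FirstDegreeKING_forImputation.kin0",
#                  "FirstDegreeKING_forImputation.tab.kin0")
```

The converted file can then be passed to the --king flag in place of the space-delimited one.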