Closed AnnabelPerry closed 1 year ago
I think this may be because your file is whitespace-delimited. The KING output is tab-separated.
On Tue, Jun 27, 2023, 12:02 PM Annabel Perry wrote:
Hello, I am attempting to run impute.py in a conda environment with Python version 3.9.16, pandas version 1.1.4. I am encountering the following error:
2023-06-27 14:22:47,023 INFO impute - main: creating pedigree ...
2023-06-27 14:22:47,106 INFO preprocess_data - create_pedigree: loaded kinship file
Traceback (most recent call last):
  File "/home/anp9168/anaconda3/envs/sniparEnv/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 2895, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'InfType'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/anp9168/anaconda3/envs/sniparEnv/bin/impute.py", line 432, in <module>
    main(args)
  File "/home/anp9168/anaconda3/envs/sniparEnv/bin/impute.py", line 239, in main
    pedigree = create_pedigree(args.king, args.agesex)
  File "/home/anp9168/anaconda3/envs/sniparEnv/lib/python3.9/site-packages/snipar/imputation/preprocess_data.py", line 93, in create_pedigree
    mz_kin = kinship.loc[kinship['InfType']=='Dup/MZ']
  File "/home/anp9168/anaconda3/envs/sniparEnv/lib/python3.9/site-packages/pandas/core/frame.py", line 2906, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/home/anp9168/anaconda3/envs/sniparEnv/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 2897, in get_loc
    raise KeyError(key) from err
KeyError: 'InfType'
Here is the code I ran:
source activate sniparEnv
unset PYTHONPATH
impute.py -c --ibd IBD_Chr@ --bgen chr@ --out Imputed_Chr@ --king FirstDegreeKING_forImputation.kin0 --agesex FirstDegreeAgeSex_forImputation.txt
Here is a description of my inputs:
IBD_Chr@: IBD information generated using the ibd.py command-line script.
chr@: phased chromosomal information.
FirstDegreeKING_forImputation.kin0: This file is derived from a KING output file. It was generated by a collaborator and only had values for the "ID1", "ID2", "HetHet", "IBS0", "Kinship", and "InfType" columns, so I restricted the file to just the columns requested in the snipar documentation https://snipar.readthedocs.io/en/latest/scripts.html#id2 for the impute.py --king flag: FID1 ID1 FID2 ID2 InfType. Both the FID1 and FID2 columns are filled with NAs. I also restricted the file to first-degree relatives only (Kinship >= 0.177) to save space. When I got the error the first time, I double-checked that the file was separated by single spaces. The error persisted. Since the original InfType column contained only 'PO' and 'FS', I replaced the InfType values of all pairs with Kinship > 0.354 with 'Dup/MZ' and re-ran the code. The error still persists.
FirstDegreeAgeSex_forImputation.txt: Describes the age and sex of all individuals in FirstDegreeKING_forImputation.kin0. The columns are “FID”, “IID”, “FATHER_ID”, “MOTHER_ID”, “sex”, “age”. The "FID" column contains all NAs, while the FATHER_ID column is NA unless the individual in the IID column has a PO relationship with a male who is at least 12 years older. Likewise, the MOTHER_ID column is NA unless the individual in the IID column has a PO relationship with a female who is at least 12 years older.
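For readers following along, the column restriction and first-degree filter described above could be sketched roughly as follows. This is only an illustration of the steps as described in the post, not the script that was actually used; the function name `first_degree_pairs` and the default threshold argument are assumptions introduced here.

```python
import pandas as pd


def first_degree_pairs(kin, min_kinship=0.177):
    """Keep first-degree pairs and the five columns listed for --king.

    `kin` is assumed to be a DataFrame with at least the columns
    ID1, ID2, Kinship and InfType (as in the collaborator's file).
    """
    # Keep only pairs at or above the first-degree kinship threshold.
    keep = kin.loc[kin["Kinship"] >= min_kinship].copy()
    # The source file had no family IDs, so fill them with the string "NA".
    keep["FID1"] = "NA"
    keep["FID2"] = "NA"
    # Reorder to the column layout named in the post.
    return keep[["FID1", "ID1", "FID2", "ID2", "InfType"]]
```

The sketch returns a DataFrame and does not write anything to disk, so the choice of delimiter is left to whatever step saves the file.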
I realize the documentation says otherwise - that's an error.
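A small sketch of why a space-delimited file would produce this KeyError, assuming (as the reply above suggests, but not confirmed from the snipar source here) that the kinship file is parsed with a tab separator. The toy IDs are made up for illustration.

```python
import io

import pandas as pd

# Toy kinship table with single-space delimiters (hypothetical IDs).
space_delimited = "FID1 ID1 FID2 ID2 InfType\nNA a NA b FS\nNA a NA c PO\n"

# Assumption: the reader splits on tabs. With no tabs in the file,
# the whole header line becomes one column name.
as_tab = pd.read_csv(io.StringIO(space_delimited), sep="\t")
print(list(as_tab.columns))

try:
    as_tab["InfType"]
except KeyError as err:
    # Same failure as in the traceback: there is no 'InfType' column.
    print("KeyError:", err)

# The same content with tabs parses into the five expected columns.
tab_delimited = space_delimited.replace(" ", "\t")
ok = pd.read_csv(io.StringIO(tab_delimited), sep="\t")
print(list(ok.columns))
```

Printing the column list right after loading is a quick way to tell a delimiter problem apart from a genuinely missing column.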
Thanks! Switching the files to tab separation worked.
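For anyone hitting the same error, one possible way to do the conversion with pandas is sketched below. The helper name `to_tab_separated` is hypothetical, and this is not necessarily how the files in this thread were converted.

```python
import pandas as pd


def to_tab_separated(src, dst):
    """Rewrite a whitespace-delimited table as a tab-separated one.

    `src` and `dst` may be file paths or file-like objects.
    """
    # sep=r"\s+" splits on any run of spaces or tabs.
    # dtype=str and keep_default_na=False leave IDs and literal "NA"
    # strings exactly as written instead of converting them.
    table = pd.read_csv(src, sep=r"\s+", dtype=str, keep_default_na=False)
    table.to_csv(dst, sep="\t", index=False)
    return table


# Example call (the output file name is made up for illustration):
# to_tab_separated("FirstDegreeKING_forImputation.kin0",
#                  "FirstDegreeKING_forImputation.tab.kin0")
```

Reading everything as strings is a deliberate choice in this sketch, so that numeric-looking IDs are not reformatted on the way through.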