dvitale199 / GenoTools

GenoTools: Advanced Genotype Data Analysis A robust suite for processing genotype data, offering genotype calling (.idat to PLINK), comprehensive sample/variant QC, and ancestry estimation. Ideal for computational biology and genetics research.
Apache License 2.0

.linear not captured in run_gwas() for linear phenotypes #157

Closed dvitale199 closed 4 months ago

dvitale199 commented 8 months ago

Describe the bug

run_gwas() fails to run the lambda calculation if the phenotype is not case/control.

Expected behavior

  1. Automatically detect the type of phenotype, OR add args for "linear"/"logistic".
  2. The .linear file should be captured and lambda calculated from it. lambda1000 should not be calculated because there are no cases/controls.
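For the auto-detection option in item 1, a minimal sketch might look like the following. It assumes PLINK-style phenotype coding (1 = control, 2 = case, with -9 or 0 marking a missing value); `infer_model` is a hypothetical helper and not part of GenoTools.

```python
import math

# Assumed PLINK-style coding for case/control phenotypes (an assumption,
# not taken from the GenoTools source).
MISSING_CODES = {-9.0, 0.0}
CASE_CONTROL_CODES = {1.0, 2.0}


def infer_model(phenotypes):
    """Return 'logistic' for a case/control phenotype, 'linear' otherwise.

    Hypothetical helper: a phenotype is treated as case/control when every
    non-missing value is 1 or 2.
    """
    observed = set()
    for value in phenotypes:
        value = float(value)
        if math.isnan(value):
            continue  # NaN is treated as missing
        observed.add(value)

    non_missing = observed - MISSING_CODES
    if not non_missing:
        raise ValueError("no non-missing phenotype values found")

    if non_missing <= CASE_CONTROL_CODES:
        return "logistic"
    return "linear"


if __name__ == "__main__":
    print(infer_model([1, 2, 2, 1, -9]))      # case/control coding
    print(infer_model([0.31, 1.72, 2.05]))    # continuous trait
```

An explicit "linear"/"logistic" argument could still override the inferred value, which would cover continuous traits that happen to take only the values 1 and 2.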

Something like this:

if os.path.isfile(f'{self.out_path}.PHENO1.glm.linear'):

    # load the linear-regression results
    gwas_df = pd.read_csv(f'{self.out_path}.PHENO1.glm.linear', sep=r'\s+', dtype={'#CHROM': str})

    # add pruning step here (pre lambdas)
    gwas_df_add = gwas_df.loc[gwas_df.TEST == 'ADD']

    # calculate inflation
    lambda_dict = self.calculate_inflation(gwas_df_add.P, normalize=False)

    # lambda1000 and the case/control counts are undefined for a linear phenotype
    metrics_dict = {
        'lambda': lambda_dict['metrics']['inflation'],
        'lambda1000': np.nan,
        'cases': np.nan,
        'controls': np.nan
    }
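To illustrate why lambda1000 is left as NaN for a linear phenotype, here is a standalone sketch of the usual genomic inflation calculation. It is not the body of `calculate_inflation`, which may differ; `genomic_inflation` and `lambda_1000` are hypothetical names, and the rescaling shown is the commonly used formula that normalizes lambda to 1000 cases and 1000 controls.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a 1-df chi-square distribution (about 0.455), obtained as the
# square of the 75th percentile of the standard normal.
EXPECTED_MEDIAN_CHI2 = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Lambda = median observed 1-df chi-square / expected median."""
    # A two-sided p-value maps to a 1-df chi-square via z = inv_cdf(p / 2).
    # Values outside (0, 1], including NaN, are skipped.
    chi2 = [
        _STD_NORMAL.inv_cdf(p / 2.0) ** 2
        for p in p_values
        if 0.0 < p <= 1.0
    ]
    if not chi2:
        raise ValueError("no valid p-values")
    return median(chi2) / EXPECTED_MEDIAN_CHI2


def lambda_1000(lam, n_cases, n_controls):
    """Rescale lambda to a study of 1000 cases and 1000 controls."""
    observed = 1.0 / n_cases + 1.0 / n_controls
    reference = 1.0 / 1000 + 1.0 / 1000
    return 1.0 + (lam - 1.0) * observed / reference


if __name__ == "__main__":
    # Evenly spaced p-values approximate the null distribution.
    null_p = [(i + 0.5) / 1000 for i in range(1000)]
    print(genomic_inflation(null_p))
    print(lambda_1000(1.1, n_cases=1000, n_controls=1000))
```

The rescaling needs both a case count and a control count, so it has no meaning for a continuous trait, while the plain lambda depends only on the p-values.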

Additional context

The pipeline should be updated to account for the new arguments, if any are added.