voichek / kmersGWAS

A library for running k-mers based GWAS
GNU General Public License v3.0
105 stars 24 forks source link

Not producing phenotype_value.assoc.txt #156

Closed owensgl closed 1 month ago

owensgl commented 1 month ago

I'm running kmergwas through kmergwasflow, although this seems like an issue with kmergwas so I'm asking here.

During the "associate_kmers" step it is creating all the permutations, but is not creating the phenotype_value.assoc.txt file (or a log file to suggest it ran). Here's the start of the log file:

Namespace(fn_phenotype='/project/ctb-grego/owens/rockfish/sex_chromosomes/kmer_analysis/copper/config/sex.pheno', gemma_path='/mnt/owens/rockfish/sex_chromosomes/copper/scripts/external/kmers_gwas/external_programs/gemma_0_96', kmers_len=31, kmers_pattern_counter=False, kmers_table='results/kmers_table/kmers_table', mac=4.0, maf=0.05, min_data_points=20.0, n_extra_phenotype_kmers=None, n_kmers=10001, n_permutations=100, n_snps=10001, outdir='results/kmers_gwas/sex', parallel=8, remove_intermediate=True, run_kmers=True, run_one_step_snps=False, run_two_steps_snps=False, snps_matrix=None, use_kinship_from_kmers=True)Unique phenotype per accession, copying phenotype data to directory Using kinship calculated on k-mersRUN: python2.7 /mnt/owens/rockfish/sex_chromosomes/copper/scripts/external/kmers_gwas/src/py/align_kinship_phenotype.py --pheno results/kmers_gwas/sex/pheno.original_phenotypes --fam_file results/kmers_gwas/sex/pheno_for_accessions_order.fam --kinship_file results/kmers_table/kmers_table.kinship --output_pheno results/kmers_gwas/sex/pheno.phenotypes --output_kinship results/kmers_gwas/sex/pheno.kinship --DBs_list results/kmers_table/kmers_table.names RUN: Rscript /mnt/owens/rockfish/sex_chromosomes/copper/scripts/external/kmers_gwas/src/R/transform_and_permute_phenotypes.R results/kmers_gwas/sex/pheno.phenotypes results/kmers_gwas/sex/pheno.kinship 100 results/kmers_gwas/sex/pheno.phenotypes_and_permutations results/kmers_gwas/sex/pheno.phenotypes_permuted_transformed results/kmers_gwas/sex/EMMA_perm.log > results/kmers_gwas/sex/phenotypes_transformation_permutation.log RUN: mkdir results/kmers_gwas/sex/kmers RUN: /mnt/owens/rockfish/sex_chromosomes/copper/scripts/external/kmers_gwas/bin/associate_kmers -p results/kmers_gwas/sex/pheno.phenotypes_permuted_transformed -b pheno -o results/kmers_gwas/sex/kmers -n 10001 --parallel 8 --kmers_table results/kmers_table/kmers_table --kmer_len 31 --maf 0.050000 --mac 4 2> results/kmers_gwas/sex/associate_kmers.log We have 101 phenotypes RUN: mv results/kmers_gwas/sex/kmers/pheno.0.phenotype_value.fam results/kmers_gwas/sex/kmers/pheno.0.phenotype_value.fam.orig RUN: cat results/kmers_gwas/sex/pheno.phenotypes_and_permutations | tail -n +2 | awk '{print $1 " " $1 " 0 0 0 " $2}' > results/kmers_gwas/sex/kmers/pheno.0.phenotype_value.fam RUNG (1/1): /mnt/owens/rockfish/sex_chromosomes/copper/scripts/external/kmers_gwas/external_programs/gemma_0_96 -bfile results/kmers_gwas/sex/kmers/pheno.0.phenotype_value -lmm 2 -k results/kmers_gwas/sex/pheno.kinship -outdir results/kmers_gwas/sex/kmers/output -o phenotype_value -maf 0.173913 -miss 0.5 RUN: mv results/kmers_gwas/sex/kmers/pheno.1.P1.fam results/kmers_gwas/sex/kmers/pheno.1.P1.fam.orig RUN: cat results/kmers_gwas/sex/pheno.phenotypes_and_permutations | tail -n +2 | awk '{print $1 " " $1 " 0 0 0 " $3}' > results/kmers_gwas/sex/kmers/pheno.1.P1.fam RUNG (2/2): /mnt/owens/rockfish/sex_chromosomes/copper/scripts/external/kmers_gwas/external_programs/gemma_0_96 -bfile results/kmers_gwas/sex/kmers/pheno.1.P1 -lmm 2 -k results/kmers_gwas/sex/pheno.kinship -outdir results/kmers_gwas/sex/kmers/output -o P1 -maf 0.173913 -miss 0.5 RUN: mv results/kmers_gwas/sex/kmers/pheno.2.P2.fam results/kmers_gwas/sex/kmers/pheno.2.P2.fam.orig

I can run the command independently, so when I run "gemma -bfile results/kmers_gwas/sex/kmers/pheno.0.phenotype_value -lmm 2 -k results/kmers_gwas/sex/pheno.kinship -outdir results/kmers_gwas/sex/kmers/output -o phenotype_value -maf 0.173913 -miss 0.5" it works and produces what looks like the correct output. The problem is that the pipeline looks for significant kmers from 'phenotype_value.assoc.txt' and when it doesn't exist, it just thinks there are no significant kmers. My temporary solution is to manually copy in the phenotype_value.assoc.txt into the output directory while it's running so that future steps work, but I'd love to know why it's going wrong.

voichek commented 1 month ago

Dear Gregory,

I actually have no experience working with kmergwasflow, so it is hard for me to interpret what is going on from the output. Can you try to run it without kmergwasflow or start by asking on the their github?

Best, Yoav

voichek commented 1 month ago

Dear Gregory,

Can we close this issue?

Best, Yoav