frankvogt / vcf2gwas

Python API for comprehensive GWAS analysis using GEMMA
GNU General Public License v3.0

vcf2gwas failing again and again #2

Closed vinod1981 closed 3 years ago

vinod1981 commented 3 years ago

Hi, I recently started exploring vcf2gwas, but it gives an error for which I couldn't find a solution.

Command:

vcf2gwas -v js.vcf.gz -pf Pheno_GWAS_Cd_test1.csv -ap -cf pca_gwas1.csv -ac -lmm -M 8000 -T 6

Error log, just from the end:

Filtering and converting files

Converting to PLINK BED..
Error: File read failure.
Successfully converted to PLINK BED (Duration: 2 minutes, 16.8 seconds)

Adding phenotypes/covariates to .fam file

Editing .fam file..
All phenotypes chosen
Phenotype(s) added to .fam file
Traceback (most recent call last):
  File "/vol/cluster-data/vkumar/miniconda3/envs/myenv/lib/python3.9/site-packages/vcf2gwas/analysis.py", line 238, in <module>
    covar_file_name = Processing.make_covarfile(fam, pheno_subset2, subset2, Y)
  File "/vol/cluster-data/vkumar/miniconda3/envs/myenv/lib/python3.9/site-packages/vcf2gwas/utils.py", line 599, in make_covarfile
    for i in Y:
TypeError: 'NoneType' object is not iterable
Successfully converted to PLINK BED (Duration: 2 minutes, 23.5 seconds)

Adding phenotypes/covariates to .fam file

Editing .fam file..
All phenotypes chosen
Phenotype(s) added to .fam file
Traceback (most recent call last):
  File "/vol/cluster-data/vkumar/miniconda3/envs/myenv/lib/python3.9/site-packages/vcf2gwas/analysis.py", line 238, in <module>
    covar_file_name = Processing.make_covarfile(fam, pheno_subset2, subset2, Y)
  File "/vol/cluster-data/vkumar/miniconda3/envs/myenv/lib/python3.9/site-packages/vcf2gwas/utils.py", line 599, in make_covarfile
    for i in Y:
TypeError: 'NoneType' object is not iterable
Analysis successfully completed

Summarizing top SNPs..
Traceback (most recent call last):
  File "/vol/cluster-data/vkumar/miniconda3/envs/myenv/lib/python3.9/site-packages/vcf2gwas/starter.py", line 428, in <module>
    filenames = Post_analysis.summarizer(path3, path2, pc_prefix3, snp_prefix, n_top, Log, prefix_list)
  File "/vol/cluster-data/vkumar/miniconda3/envs/myenv/lib/python3.9/site-packages/vcf2gwas/utils.py", line 1002, in summarizer
    for file in os.listdir(path):
FileNotFoundError: [Errno 2] No such file or directory: '/vol/cluster-data/vkumar/miniconda3/bin/output/lmm/summary/top_SNPs'

The problem is that a folder named "lm" was generated, although I used "-lmm" in the command, and vcf2gwas is now searching for an "lmm" folder in the path.
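The `FileNotFoundError` above comes from `os.listdir` being called on a directory that does not exist. When a tool looks for a path like this, a quick way to see what was actually written is to walk up to the nearest existing parent directory and list its contents. The following is a minimal sketch of that kind of diagnosis; the helper names are hypothetical and are not part of vcf2gwas.

```python
from pathlib import Path


def nearest_existing_ancestor(path):
    """Walk up from `path` until a directory that exists is found."""
    current = Path(path)
    while not current.is_dir():
        if current.parent == current:  # reached the filesystem root
            break
        current = current.parent
    return current


def diagnose_missing_dir(path):
    """Return the nearest existing ancestor and the sorted names inside it."""
    ancestor = nearest_existing_ancestor(path)
    children = sorted(child.name for child in ancestor.iterdir())
    return ancestor, children


if __name__ == "__main__":
    # Path copied from the traceback in this thread.
    missing = "/vol/cluster-data/vkumar/miniconda3/bin/output/lmm/summary/top_SNPs"
    ancestor, children = diagnose_missing_dir(missing)
    print(f"Nearest existing directory: {ancestor}")
    print(f"It contains: {children}")
```

Run against the path from the traceback, this would show which sub-directories exist under the nearest existing parent, for instance whether `lm` is present where `lmm` was expected.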

I then manually renamed that sub-directory from "lm" to "lmm", and the error output shortened to this:

Preparing files

Checking and adjusting files..
Checking individuals in VCF file..
Checking individuals in phenotype file..
Not all individuals in phenotpye and genotype file match
Removed 0 out of 810 individuals, 810 remaining
Checking individuals in covariate file..
All covariate and genotype individuals match
Removed 0 out of 822 individuals, 822 remaining
In total, removed 12 out of 822 individuals, 810 remaining
Files successfully adjusted

Filtering and converting files

Converting to PLINK BED..
Successfully converted to PLINK BED (Duration: 35.6 seconds)

Adding phenotypes/covariates to .fam file

Editing .fam file..
Phenotype(s) added to .fam file
Traceback (most recent call last):
  File "/vol/cluster-data/vkumar/miniconda3/envs/myenv/lib/python3.9/site-packages/vcf2gwas/analysis.py", line 238, in <module>
    covar_file_name = Processing.make_covarfile(fam, pheno_subset2, subset2, Y)
  File "/vol/cluster-data/vkumar/miniconda3/envs/myenv/lib/python3.9/site-packages/vcf2gwas/utils.py", line 599, in make_covarfile
    for i in Y:
TypeError: 'NoneType' object is not iterable
Analysis successfully completed

Summarizing top SNPs..
Couldn't find files to summarize!
Clean up successful

vcf2gwas has been successfully completed!
Runtime: 7 minutes, 43.9 seconds

This error is still there. What should I do?

Thanks,

Vinod
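The log above reports 810 individuals in the phenotype file against 822 in the genotype and covariate files, so 12 individuals are dropped before the analysis. A comparison like this can also be done by hand before running the pipeline. The sketch below works on plain lists of sample IDs; how the IDs are read from each file is left out, and the function name and dictionary keys are hypothetical, not part of vcf2gwas.

```python
def compare_sample_ids(genotype_ids, phenotype_ids, covariate_ids):
    """Report which sample IDs are shared and which are missing where."""
    geno = set(genotype_ids)
    pheno = set(phenotype_ids)
    covar = set(covariate_ids)
    return {
        # IDs present in all three inputs
        "shared": sorted(geno & pheno & covar),
        # genotyped samples without a phenotype row
        "missing_from_phenotype": sorted(geno - pheno),
        # genotyped samples without a covariate row
        "missing_from_covariate": sorted(geno - covar),
        # phenotype or covariate rows with no matching genotype sample
        "not_in_genotype": sorted((pheno | covar) - geno),
    }


if __name__ == "__main__":
    report = compare_sample_ids(
        genotype_ids=["s1", "s2", "s3", "s4"],
        phenotype_ids=["s1", "s2", "s3"],
        covariate_ids=["s1", "s2", "s3", "s4"],
    )
    for key, ids in report.items():
        print(f"{key}: {len(ids)} {ids}")
```

Returning the IDs rather than only the counts makes it easier to spot naming differences between files, such as a suffix added to sample names in one of them.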

frankvogt commented 3 years ago

Hi Vinod,

Thank you for your email! Would you mind sending me your input files so that I can replicate the error?

Thanks,

Frank


vinod1981 commented 3 years ago

Hi Frank,

Thanks for reaching out. The VCF file is very big, though, around 6 GB compressed. How should I send it?

Thanks,

Vinod

frankvogt commented 3 years ago

Hi Vinod,

Maybe you can share it via Google Drive or something similar? Alternatively, you can send me just the phenotype, covariate, and log files.

Best,

Frank


vinod1981 commented 3 years ago

Hi Frank, should I share them here or send them to a specific email address? Thanks, Vinod

vinod1981 commented 3 years ago

Find the attached files.

Vinod,


vcf2gwas v0.5

Initialising..

Start time: Fri, 25 Jun 2021 16:52:49

Parsing arguments..
Genotype file: output_all_imputed_samples_deleted_js_GT1.vcf.gz
Phenotype file(s): pca_gwas_indPheno.csv
Covariate file: pca_gwas1.csv
Arguments parsed successfully

Preparing files

Checking pca_gwas_indPheno.csv..

Filtering SNPs..
Indexing VCF file..
VCF file successfully indexed (Duration: 24.3 seconds)
SNPs successfully filtered (Duration: 3 minutes, 1.2 seconds)

File preparations completed

Starting analysis..
Analysis successfully completed

Summarizing top SNPs..
Couldn't find files to summarize!
Clean up successful

vcf2gwas has been successfully completed! Runtime: 7 minutes, 44.9 seconds

Output directory: /vol/cluster-data/vkumar/miniconda3/bin/output

Phenotypes analyzed in total: 1

Input:

Files:

GEMMA parameters:

Options: --memory 8000 --threads 6 --allcovariates

Beginning with analysis of pca_gwas_indPheno.csv

Preparing files

Checking and adjusting files..
Checking individuals in VCF file..
Checking individuals in phenotype file..
Not all individuals in phenotpye and genotype file match
Removed 0 out of 810 individuals, 810 remaining
Checking individuals in covariate file..
All covariate and genotype individuals match
Removed 0 out of 822 individuals, 822 remaining
In total, removed 12 out of 822 individuals, 810 remaining
Files successfully adjusted

Filtering and converting files

Converting to PLINK BED..
Successfully converted to PLINK BED (Duration: 36.2 seconds)

Adding phenotypes/covariates to .fam file

Editing .fam file..
Phenotype(s) added to .fam file

frankvogt commented 3 years ago

You can share the files with my email: frvogt@gmail.com

frankvogt commented 3 years ago

Fixed a bug where not all covariates were passed on internally when a certain combination of options was selected.
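A bug of this kind typically surfaces as the `TypeError: 'NoneType' object is not iterable` seen in the traceback earlier in this thread: one combination of options leaves the covariate selection unset, and a later loop iterates over it. The sketch below illustrates the pattern and a defensive alternative. It is not the actual vcf2gwas code or the actual fix; `buggy_select`, `select_covariates`, and their parameters are hypothetical names chosen for illustration.

```python
def buggy_select(columns, chosen=None, use_all=False):
    """Illustrative bug: the use_all branch is missing, so None is returned."""
    if chosen is not None:
        return [c for c in chosen if c in columns]
    # use_all=True without `chosen` falls through and implicitly returns None


def select_covariates(columns, chosen=None, use_all=False):
    """Always return a list, whichever combination of options is given."""
    if use_all:
        return list(columns)
    if chosen is None:
        return []
    unknown = [c for c in chosen if c not in columns]
    if unknown:
        raise ValueError(f"unknown covariate column(s): {unknown}")
    return list(chosen)


if __name__ == "__main__":
    columns = ["PC1", "PC2", "PC3"]
    try:
        for name in buggy_select(columns, use_all=True):
            print(name)
    except TypeError as err:
        print(f"buggy version: {err}")
    for name in select_covariates(columns, use_all=True):
        print(f"defensive version uses {name}")
```

Returning an empty list instead of `None` keeps downstream loops safe, and raising on unknown column names reports a bad selection at the point where it is made rather than later in the pipeline.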

vinod1981 commented 3 years ago

Hi Frank, is this a reply to my other request, or are you adding something to this issue? In the other issue, I asked whether it is possible to show only the significance threshold line in the Manhattan plot, without labeling the SNPs. Thanks, Vinod

frankvogt commented 3 years ago

No, this was just to clarify what the issue was and that it has been fixed.