genenetwork / genenetwork2

GeneNetwork (2nd generation)
http://gn2.genenetwork.org/
GNU Affero General Public License v3.0

GTEXv5 Human Brain Frontal Cortex genotypes versus COMT expression #269

Closed dannemil closed 5 years ago

dannemil commented 6 years ago

I am running into this error when I run PLINK. The minor allele threshold is 0.05. I am trying to regress COMT expression in frontal cortex against the genotypes (allelic dose) in frontal cortex from the database: GTEXv5 Human Brain Frontal Cortex BA9 RefSeq (Sep15) RPKM log2. In the dialog box that appears below the PLINK computation I have hidden no-values and blocked outliers. Here is the trace:

GeneNetwork penguin:gene:2.10rc3-production-d692e4f79 http://gn2.genenetwork.org/marker_regression (2:01PM UTC Dec 15, 2017)
Traceback (most recent call last):
  File "/usr/local/guix-profiles/gn2-2.10rc4/lib/python2.7/site-packages/flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/guix-profiles/gn2-2.10rc4/lib/python2.7/site-packages/flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/production/gene/wqflask/wqflask/views.py", line 621, in marker_regression_page
    template_vars = marker_regression.MarkerRegression(start_vars, temp_uuid)
  File "/home/production/gene/wqflask/wqflask/marker_regression/marker_regression.py", line 231, in __init__
    results = plink_mapping.run_plink(self.this_trait, self.dataset, self.species, self.vals, self.maf)
  File "/home/production/gene/wqflask/wqflask/marker_regression/plink_mapping.py", line 23, in run_plink
    count, p_values = parse_plink_output(plink_output_filename, species)
  File "/home/production/gene/wqflask/wqflask/marker_regression/plink_mapping.py", line 109, in parse_plink_output
    result_fp = open("%s/%s.qassoc"% (TMPDIR, output_filename), "rb")
IOError: [Errno 2] No such file or directory: u'/home/production/tmp/gn2//GTEx_v5_ENSG00000093010.7_5pM91ZGB.qassoc'
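The trace ends in a bare `open()` on a `.qassoc` file that was never written, so the real failure (PLINK not producing output) is hidden behind an `IOError`. A minimal sketch of a guard follows, assuming PLINK writes a `.log` file next to its output under the same prefix. The function name `open_plink_results` is hypothetical and not part of the GN2 code base.

```python
import os


def open_plink_results(tmpdir, output_filename):
    """Open PLINK's .qassoc output, or fail with a message that names the cause."""
    # os.path.join avoids the doubled slash seen in the reported path.
    path = os.path.join(tmpdir, "%s.qassoc" % output_filename)
    if not os.path.isfile(path):
        # Assumption: a missing results file means PLINK exited early and
        # left its explanation in the .log file with the same prefix.
        log_path = os.path.join(tmpdir, "%s.log" % output_filename)
        raise RuntimeError(
            "PLINK produced no results file at %s; check %s for the cause"
            % (path, log_path)
        )
    return open(path, "rb")
```

With a check like this, the error page would point at the PLINK run itself rather than at the parsing step.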

robwwilliams commented 6 years ago

Dear Dannemil, Zach, Lei, Pjotr

I first tested PLINK in GeneNetwork 1, as a baseline comparison for what GN2 should be able to do, only much better.

[image: Inline image 1]

N = 115 cases using GTEx v5

The distribution of values highlights four high outliers and three low outliers. All seven were winsorized prior to the attempted PLINK analysis.

[image: Inline image 2]
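For readers unfamiliar with winsorizing, here is a minimal sketch of one common variant: the extreme values are clamped to the nearest non-outlying value instead of being dropped. The thread does not say which rule GeneNetwork applies, so the function `winsorize_counts` and its count-based cutoffs are assumptions for illustration, and the sample numbers are toy data, not GTEx values.

```python
def winsorize_counts(values, n_low, n_high):
    """Clamp the n_low smallest and n_high largest values to their nearest inlier."""
    if n_low + n_high >= len(values):
        raise ValueError("need at least one value left unclamped")
    ordered = sorted(values)
    low_bound = ordered[n_low]                       # smallest value kept as-is
    high_bound = ordered[len(ordered) - n_high - 1]  # largest value kept as-is
    # Values outside [low_bound, high_bound] are pulled in to the bound.
    return [min(max(v, low_bound), high_bound) for v in values]


# Toy data: two low extremes and one high extreme.
sample = [0.1, 0.2, 5.0, 5.1, 5.2, 5.3, 9.9]
print(winsorize_counts(sample, n_low=2, n_high=1))
```

For the case described above, the call would use `n_low=3` and `n_high=4` on the 115 expression values.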

Started PLINK compute at 08:50 AM

[image: Inline image 3]

The run took 2584 seconds (43 minutes! ouch) on an EC2 instance (ip-10-183-80-244.ec2.internal).

[image: Inline image 1]

The returned table underneath the Manhattan plot includes LOD scores for 8000+ SNPs.

Nothing significant on Chr 22 (cis). A hit on proximal Chr 12 is nominally significant with a -logP of 8.16, but the PLINK run did not correct for population structure, etc.

-- Rob

Robert W. Williams, Ph.D. Chair: Department of Genetics, Genomics and Informatics 71 S Manassas St, Memphis TN 38163 University of Tennessee Health Science Center Office 901 448-7050 CELL 901 604 4752 Office: Translational Science Research Building, Room 407 EMAIL: rwilliams@uthsc.edu Alternative email: labwilliams@gmail.com SKYPE: robwwilliams

zsloan commented 6 years ago

Ah, I'll fix this. If the problem is what I think it is, it should be fairly simple.

pjotrp commented 6 years ago

@dannemil can you confirm this works again?

pjotrp commented 6 years ago

Can we confirm this works now?

zsloan commented 6 years ago

This does not work; I'm taking a look at it now. The issue seems to be related to the format of the .fam file.
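The thread does not say what was wrong with the .fam file. As a rough sketch of how one might check it: PLINK's .fam format has six whitespace-separated fields per sample (family ID, individual ID, father ID, mother ID, sex code, phenotype). The helper `find_bad_fam_lines` is hypothetical, and its requirement that the phenotype be numeric is an assumption of this sketch for a quantitative-trait run, stricter than PLINK itself.

```python
FAM_COLUMNS = 6  # FID, IID, father ID, mother ID, sex, phenotype


def find_bad_fam_lines(lines):
    """Return (line_number, reason) pairs for rows that do not look like .fam rows."""
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) != FAM_COLUMNS:
            problems.append((number, "expected 6 fields, found %d" % len(fields)))
            continue
        try:
            # Assumption of this sketch: a quantitative trait needs a number here.
            float(fields[5])
        except ValueError:
            problems.append((number, "non-numeric phenotype %r" % fields[5]))
    return problems
```

Running it over the lines of the generated file (for example `find_bad_fam_lines(open(path))`) returns an empty list when every row has the expected shape.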

zsloan commented 5 years ago

Going to close this issue since GN2 no longer uses PLINK and this issue is no longer relevant.

pjotrp commented 5 years ago

Thanks @zsloan for going through these issues!