Very minor issue, but it looks like you're plotting ALL of your associations in your Manhattan plots, rather than just the genotype associations. To clarify: when you run your GWAS, you include the top PCs as covariates in the regression (this is correct). But this means that you also get a regression result for each covariate at every variant, not just for the variants you're testing. Take a look at the TEST column in the .assoc.linear output file(s) of the plink --linear command to figure out which results you want to keep/plot.
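Here is a minimal sketch of that filtering step, using only the Python standard library. It assumes the usual whitespace-delimited .assoc.linear layout, in which the additive genotype test is labelled ADD in the TEST column and covariate rows carry the covariate's name (COV1 here). The function name keep_genotype_tests and the example rows are made up for illustration, so check the labels in your own output before relying on this.

```python
import io


def keep_genotype_tests(lines, test_label="ADD"):
    """Yield one dict per data row whose TEST column equals test_label.

    `lines` is any iterable of text lines (an open file works), with the
    header on the first line and whitespace-delimited columns.
    """
    rows = iter(lines)
    header = next(rows).split()
    for line in rows:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        record = dict(zip(header, fields))
        if record["TEST"] == test_label:
            yield record


# Made-up rows in the .assoc.linear layout, for illustration only.
example = """\
 CHR  SNP   BP  A1  TEST  NMISS   BETA   STAT        P
   1  rs1  100   A   ADD    200   0.12   1.50     0.13
   1  rs1  100   A  COV1    200   0.80   4.10  0.00006
   1  rs2  250   G   ADD    200  -0.05  -0.60     0.55
   1  rs2  250   G  COV1    200   0.79   4.05  0.00007
"""

kept = list(keep_genotype_tests(io.StringIO(example)))
for row in kept:
    print(row["SNP"], row["TEST"], row["P"])
```

For a real file you would pass an open file handle instead of the io.StringIO example, then build the Manhattan plot from the kept rows only.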
Also (I didn't take off points for this): in your boxplot, we typically order the x-axis so that the heterozygous samples sit between the two homozygous groups, for two reasons. 1) We often expect the phenotype of heterozygotes to be intermediate between those of the two homozygous genotypes. Obviously this isn't always the case, such as for Mendelian dominant/recessive traits, but it's often a good rule of thumb. 2) When we run our linear regression in the GWAS, what we're actually doing is fitting a line to the points formed when you organize the x-axis this way. The slope of the line then answers "how much does the trait change when you add one additional alternative allele?"
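To make point 2 concrete, here is a small standard-library sketch. It codes each genotype as the number of alternative alleles (0, 1, 2), which puts heterozygotes in the middle, and fits an ordinary least-squares line through the points. The genotypes, phenotype values, and helper names are all made up for illustration. The real GWAS regression also includes the PC covariates, and PLINK reports the effect per copy of the allele in the A1 column, so the slope from your own data won't exactly match the BETA in the PLINK output.

```python
def alt_allele_count(genotype, alt):
    """Number of copies of the alternative allele in a two-letter genotype."""
    return sum(1 for allele in genotype if allele == alt)


def ols_slope(xs, ys):
    """Least-squares slope of y on x (simple regression, no covariates)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Made-up (genotype, phenotype) pairs; "G" is treated as the alternative allele.
samples = [("AA", 1.0), ("AA", 1.2), ("AG", 1.9),
           ("GA", 2.1), ("GG", 3.0), ("GG", 3.2)]

dosages = [alt_allele_count(gt, "G") for gt, _ in samples]
phenotypes = [pheno for _, pheno in samples]

# Group phenotypes by dosage; sorting the keys gives the boxplot order
# hom-ref (0), het (1), hom-alt (2).
groups = {}
for dose, pheno in zip(dosages, phenotypes):
    groups.setdefault(dose, []).append(pheno)
box_order = sorted(groups)

slope = ols_slope(dosages, phenotypes)
print("x-axis order:", box_order)
print("trait change per additional alternative allele:", round(slope, 3))
```

Passing the groups to your plotting call in box_order is what puts the heterozygotes in the middle box.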
Pretty plots
4/4
Exercise                         Points Possible   Grade
Step 1.2 PC plot                 1                 1
Step 2.2 AFS plot                1                 1
Step 3.2 Manhattan plots         1                 1
Step 3.3 effect size boxplot     1                 1
Grade
Total: 9.75/10
Great work! Feel free to address the minor issue(s) and resubmit for full credit!
Hi Logan! Thanks for uploading! I've updated the rubric above. Please feel free to resubmit if you want to fix that one small error. It should hopefully be very quick/easy.
README.md with commands and analyses: 0/2
plotting.py script to produce plots: 3.75/4