Open jonathanfischer97 opened 1 month ago
Better answers in your README, but you forgot to actually plot the log2 enrichment after you made the transformation in your plot! You were still plotting "Enrichment" on your y-axis. Stupid mistake, I assume, so I will split the difference and give you an extra 0.25 on the plot.
Total score: 9.75/10
Good job overall!
Grading Assessment for QB Week 2
Part 1: Bash Script for Bedtools Commands (2.5 pts)
Your script correctly implements the necessary `bedtools` commands for sorting, merging, and subtracting the various feature files. You used `sortBed` and `mergeBed` to process the gene, exon, and cCRE files, and correctly created the intron and "other" BED files using `subtractBed`. One small improvement would be to clean up temporary files (e.g., deleting intermediate files once the final outputs exist), but this was not necessary for full credit.
Score: 2.5/2.5
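A minimal sketch of the temporary-file cleanup mentioned above, assuming intermediates are written to a scratch directory. `with_scratch` and `sort_bed_step` are hypothetical helper names, and plain `sort`/`uniq` stand in for the bedtools calls so the example runs without bedtools installed.

```shell
# Hypothetical helper: run a command with a scratch directory appended as its
# last argument, then delete that directory whether or not the command succeeded.
with_scratch() {
    local scratch status
    scratch=$(mktemp -d) || return 1
    "$@" "$scratch"
    status=$?
    rm -rf "$scratch"
    return "$status"
}

# Example step (stand-in for a sortBed/mergeBed chain): sort a BED file by
# chromosome and start position, then drop exact duplicate lines.
# The intermediate file lives only in the scratch directory.
sort_bed_step() {
    local in_bed=$1 out_bed=$2 scratch=$3
    sort -k1,1 -k2,2n "$in_bed" > "$scratch/sorted.bed"
    uniq "$scratch/sorted.bed" > "$out_bed"
}

# Usage (file names are placeholders):
#   with_scratch sort_bed_step genes.bed genes_sorted.bed
```

Only the final output file survives the call; everything written under the scratch directory is removed.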
Part 2: SNP Enrichment Calculation and Analysis (7.5 pts)
2.1 Shell Script for Calculating Enrichments (4.5 pts)
Your script is well-written and correctly loops through the MAF and feature files, using `bedtools coverage` to calculate the enrichment values. You successfully used `awk` and `bc` to compute SNP density and enrichment values. The results were appropriately written to the output file `snp_counts.txt`.
Score: 4.5/4.5
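For reference, the density and enrichment arithmetic can be sketched as follows. This is an illustration under assumptions, not the submitted script: it uses `awk` alone rather than `awk` plus `bc`, it takes the overlap-count and feature-length column numbers as arguments instead of assuming a layout, and it assumes enrichment is defined as feature SNP density divided by a background SNP density. The function names are hypothetical.

```shell
# feature_density COVERAGE_FILE COUNT_COL LEN_COL
# Sums the SNP overlap counts and the feature lengths over all rows of a
# tab-separated coverage table and prints SNPs per base.
feature_density() {
    awk -F'\t' -v c="$2" -v l="$3" '
        { snps += $c; bases += $l }
        END {
            if (bases > 0) printf "%.10f\n", snps / bases
            else print "NA"
        }
    ' "$1"
}

# enrichment DENSITY BACKGROUND_DENSITY
# Assumed definition: feature density divided by background density.
enrichment() {
    awk -v d="$1" -v b="$2" 'BEGIN {
        if (b > 0) printf "%.6f\n", d / b
        else print "NA"
    }'
}

# Usage (file name, column numbers, and background value are placeholders):
#   d=$(feature_density exons_maf0.1_coverage.txt 4 6)
#   enrichment "$d" 0.01
```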
2.2 Text File with SNP Enrichments (0.5 pts)
The file `snp_counts.txt` contains all the necessary combinations of MAF and feature values, with calculated enrichments in a well-organized, tab-separated format. Everything was done correctly.
Score: 0.5/0.5
2.3 Plot from Step 2.4 (1.5 pts)
Your plot is clear, with appropriate axis labels, legends, and distinct lines for each genomic feature. However, your log2 transformation of the y-axis was slightly incorrect. It looks like you attempted to apply the log2 transformation in your plot by using `scale_y_continuous(trans = "log2")`. However, this only transforms the y-axis, not the data itself, which can cause issues when plotting if there are zero or near-zero values in the data (since log2 of zero is undefined).
To correctly transform and display the log2 enrichment values, you should apply the log2 transformation directly to the data before plotting.
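One way to transform the data before plotting is to add a log2 column to the results file itself. The sketch below is an illustration under assumptions, not a reference solution: it assumes `snp_counts.txt` has a header row, the enrichment column number is passed in rather than known, and `add_log2` and the `log2_enrichment` column name are hypothetical. Non-positive enrichment values are written as `NA` because their log2 is undefined.

```shell
# add_log2 INPUT_FILE ENRICHMENT_COL
# Appends a log2(enrichment) column to a tab-separated table with a header row.
add_log2() {
    awk -F'\t' -v col="$2" '
        # First line is assumed to be the header: name the new column.
        NR == 1  { printf "%s\tlog2_enrichment\n", $0; next }
        # awk log() is the natural log, so divide by log(2) to get log2.
        $col > 0 { printf "%s\t%.4f\n", $0, log($col) / log(2); next }
        # Zero or negative values have no log2; mark them as missing.
                 { printf "%s\tNA\n", $0 }
    ' "$1"
}

# Usage (column number and output name are placeholders):
#   add_log2 snp_counts.txt 3 > snp_counts_log2.txt
```

The new column can then be mapped directly to the y-axis and labelled as log2 enrichment, with no scale transformation needed in the plot.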
Score: 1.0/1.5
2.4 Answers to Questions in README.md (1.0 pts)
You correctly identify key points in your answers, such as exons being under purifying selection. However, your explanations lack detail. For instance, in Question 1, mentioning the log2 enrichment values and their relationship to purifying selection would have strengthened your response. In Question 2, the idea of natural selection affecting allele frequencies was touched upon but not fully explained. More detail overall would enhance the clarity of your understanding.
Score: 0.5/1.0
Total Score: 9.0/10