Question 1.1 +0.5 pt Albeit an unusual way to do it: assuming all reads are the same length, you can obtain just the second line using head -2 | tail -1 and then pipe that into wc -m
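For reference, the head/tail approach mentioned for Question 1.1 could look roughly like the sketch below. The two-read FASTQ file is made up purely for illustration, and the sketch assumes every read has the same length. Note that wc -m also counts the trailing newline, so the newline is stripped with tr before counting.

```shell
# Hypothetical two-read FASTQ file, created only for this illustration.
tmp_fastq=$(mktemp)
printf '@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nTTGGCCAA\n+\nIIIIIIII\n' > "$tmp_fastq"

# Line 2 of a FASTQ file is the sequence of the first read.
# tr -d '\n' removes the newline so wc -m counts bases only;
# the final tr -d ' ' strips padding that some wc versions add.
read_length=$(head -n 2 "$tmp_fastq" | tail -n 1 | tr -d '\n' | wc -m | tr -d ' ')
echo "$read_length"

rm -f "$tmp_fastq"
```

Without the tr -d '\n' step the reported length would be one higher than the true read length.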
Question 1.2 +0.5 pt Use grep -v "^>" to make sure you exclude the header lines, which start with ">"
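A minimal sketch of the grep -v "^>" filter is shown below. It assumes the goal is to count sequence characters in a FASTA file; the file contents are made up for illustration.

```shell
# Hypothetical FASTA file with two records, created only for this illustration.
tmp_fasta=$(mktemp)
printf '>chrI\nACGT\nAC\n>chrII\nGGG\n' > "$tmp_fasta"

# Drop header lines (those starting with ">"), remove newlines,
# then count the remaining characters, which are all sequence.
base_count=$(grep -v "^>" "$tmp_fasta" | tr -d '\n' | wc -c | tr -d ' ')
echo "$base_count"

rm -f "$tmp_fasta"
```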
Question 1.3 +0.5 pt
Question 1.4 +0.5 pt
Question 1.5 +0.5 pt Format the bash command so that it runs fastqc on ALL fastq files, not just one
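One way to cover every file is a wildcard. The sketch below is a dry run: it only prints the fastqc command for each match rather than running the tool, and the directory and file names are hypothetical.

```shell
# Hypothetical working directory with three empty FASTQ files, for illustration only.
work_dir=$(mktemp -d)
touch "$work_dir/sample1.fastq" "$work_dir/sample2.fastq" "$work_dir/sample3.fastq"

# Dry run: the glob expands to every .fastq file in the directory.
# Replace the echo with the real command once the list looks right.
matched=0
for fq in "$work_dir"/*.fastq; do
    echo "fastqc $fq"
    matched=$((matched + 1))
done
echo "$matched files matched"

rm -rf "$work_dir"
```

Since fastqc accepts several input files at once, the same glob can also be passed directly, as in fastqc *.fastq.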
Score: 2.5/2.5 pt
Part 2: Bash script for Exercise 2 (3 pt)
Evaluation:
Question 2.1 +0.5 pt One of the chromosomes is mitochondrial, hence the odd number
Question 2.2 +0.5 pt
Question 2.3 +0.25 pt You should run grep -v "^@" first to remove the header lines, which also contain "chrIII" and therefore inflate the count
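The effect of the header lines on the count can be seen in the small sketch below. The SAM content is made up for illustration (three header lines and three alignments, two of them on chrIII); the sequence length in the @SQ line is a placeholder.

```shell
# Hypothetical SAM file, created only for this illustration.
tmp_sam=$(mktemp)
printf '@HD\tVN:1.6\n@SQ\tSN:chrIII\tLN:1000\n@SQ\tSN:chrI\tLN:1000\n' > "$tmp_sam"
printf 'read1\t0\tchrIII\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\n' >> "$tmp_sam"
printf 'read2\t0\tchrI\t200\t60\t8M\t*\t0\t0\tTTGGCCAA\tIIIIIIII\n' >> "$tmp_sam"
printf 'read3\t0\tchrIII\t300\t60\t8M\t*\t0\t0\tGGGGCCCC\tIIIIIIII\n' >> "$tmp_sam"

# Counting directly also matches the @SQ header line for chrIII.
naive_count=$(grep -c "chrIII" "$tmp_sam")

# Removing header lines first leaves only alignment records.
filtered_count=$(grep -v "^@" "$tmp_sam" | grep -c "chrIII")

echo "naive: $naive_count, filtered: $filtered_count"
rm -f "$tmp_sam"
```

Here the naive count is one higher than the number of alignments because of the @SQ line.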
Question 2.4 +0.5 pt
Question 2.5 +0.5 pt
Question 2.6 +0.5 pt
Score: 2.75/3 pt
Part 3: Python script for Exercise 3 (2.5 pt)
Evaluation:
-Overall, the Python script and the strategies used for parsing the file look good!
-0.25 pt: read depth is not in the same column as allele frequency (fields[7]), so your DP file may contain wrong numbers or be empty
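To check which column holds the read depth, it can help to pull DP out on the command line before fixing the Python script. The sketch below rests on an assumption that is not confirmed by the assignment file: that DP is listed in the FORMAT column (the 9th column, fields[8] in Python) with its value in the sample column (the 10th column, fields[9]). The VCF content is made up for illustration.

```shell
# Hypothetical single-sample VCF, created only for this illustration.
tmp_vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$tmp_vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA\n' >> "$tmp_vcf"
printf 'chrI\t100\t.\tA\tG\t50\t.\tAF=0.5\tGT:DP\t0/1:35\n' >> "$tmp_vcf"
printf 'chrI\t200\t.\tC\tT\t60\t.\tAF=1\tGT:DP\t1/1:12\n' >> "$tmp_vcf"

# Skip header lines, find the position of "DP" among the FORMAT keys,
# and print the value at the same position in the sample column.
depths=$(awk -F'\t' '!/^#/ {
    n = split($9, keys, ":")
    split($10, vals, ":")
    for (i = 1; i <= n; i++) if (keys[i] == "DP") print vals[i]
}' "$tmp_vcf" | tr '\n' ' ')
echo "$depths"

rm -f "$tmp_vcf"
```

If this prints nothing for your file, DP is stored somewhere other than the column assumed here, which would be worth confirming by viewing a few data lines directly.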
Score: 2.25/2.5 pt
Part 4: R script and figures for Exercise 3 (2 pt)
Evaluation:
AF histogram looks good
-0.25 pt: unable to plot or visualize the DP histogram, possibly for the reasons stated in Part 3. What did you see when you viewed DP? What errors did the ggplot function return? Answering these questions may also help you debug plotting problems in the future.
Part 1: Bash script for Exercise 1 (2.5 pt)
Evaluation:
Score: 2.5/2.5 pt
Part 2: Bash script for Exercise 2 (3 pt)
Evaluation:
Score: 2.75/3 pt
Part 3: Python script for Exercise 3 (2.5 pt)
Evaluation:
Score: 2.25/2.5 pt
Part 4: R script and figures for Exercise 3 (2 pt)
Evaluation:
Score: 1.75/2 pt
Total score: 9.25/10 pt