It looks like the way you're calculating the Jaccard index isn't quite right. You're calculating sig_in_homemade_only / sig_in_both. What we want to calculate is sig_in_both / sig_in_either. You've used .intersection() correctly to get the number of genes significantly DE in both analyses; I would recommend .union() to find the number of genes significantly DE in either analysis.
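As a minimal sketch of the sig_in_both / sig_in_either calculation, assuming the significant genes from each analysis are available as collections of gene IDs: the function name `jaccard_index` and the gene IDs below are made up for illustration and are not taken from your script.

```python
def jaccard_index(genes_a, genes_b):
    """Return |A intersect B| / |A union B| for two collections of gene IDs."""
    set_a, set_b = set(genes_a), set(genes_b)
    sig_in_either = set_a.union(set_b)
    if not sig_in_either:
        # Convention chosen for this sketch: two empty lists give 0.0
        return 0.0
    sig_in_both = set_a.intersection(set_b)
    return len(sig_in_both) / len(sig_in_either)


# Made-up gene IDs, for illustration only
homemade_sig = {"geneA", "geneB", "geneC", "geneD"}
deseq2_sig = {"geneC", "geneD", "geneE"}

# 2 genes shared, 5 genes in total
print(jaccard_index(homemade_sig, deseq2_sig))  # 0.4
```

Because the denominator is the union, the result always falls between 0 (no shared genes) and 1 (identical gene lists).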
We were also expecting you to define DESeq2 significant genes based only on padj and leave the log2FC cutoff for the volcano plot, but I'm not going to take off points for that.
Also, for your volcano plot (and your Jaccard index as well, given how you're computing it), it looks like your log2FC cutoff keeps genes whose abs(log2FC) is less than 1. We want genes whose abs(log2FC) is greater than 1, because these are the genes with the biggest differences between sexes.
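The two cutoffs can be kept separate along these lines. This is only a sketch: the padj threshold of 0.05, the helper names, and the example values are assumptions for illustration, so substitute the threshold the assignment specifies.

```python
PADJ_CUTOFF = 0.05    # assumed threshold; use the one from the assignment
LOG2FC_CUTOFF = 1.0

# Hypothetical results as (gene, log2FC, padj) tuples
results = [
    ("geneA", 2.5, 0.001),
    ("geneB", -1.8, 0.020),
    ("geneC", 0.3, 0.004),
    ("geneD", 3.1, 0.400),
]


def is_significant(padj):
    """Significance for the gene lists and Jaccard index: padj only."""
    return padj < PADJ_CUTOFF


def is_highlighted(log2fc, padj):
    """Volcano plot highlight: significant AND a large fold change."""
    return is_significant(padj) and abs(log2fc) > LOG2FC_CUTOFF


significant = [gene for gene, log2fc, padj in results if is_significant(padj)]
highlighted = [gene for gene, log2fc, padj in results if is_highlighted(log2fc, padj)]

print(significant)  # ['geneA', 'geneB', 'geneC']
print(highlighted)  # ['geneA', 'geneB']
```

Using abs() with a greater-than comparison keeps both strongly up-regulated and strongly down-regulated genes, which is why geneB (log2FC of -1.8) is highlighted while geneC (log2FC of 0.3) is not.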
README.md with answers to questions
1/1
| Exercise | Points Possible | Grade |
| --- | --- | --- |
| Jaccard index overlap between methods | 1 | 1 |
Output text files
2/2
| Exercise | Points Possible | Grade |
| --- | --- | --- |
| List of DE genes in manual test | 1 | 1 |
| List of DE genes in PyDESeq2 test | 1 | 1 |
Pretty plots
1/1
| Exercise | Points Possible | Grade |
| --- | --- | --- |
| Exercise 2 Volcano plot | 1 | 1 |
Grade
Total: 9.5/10
Great work! Feel free to address the two minor issues and resubmit!
Python script to run DE analysis
5.5/6