hbctraining / DGE_workshop_salmon_online

https://hbctraining.github.io/DGE_workshop_salmon_online/

cnet plot size warning/legend, lesson text does not match plot output #48

Closed hwick closed 2 months ago

hwick commented 4 months ago

https://hbctraining.github.io/DGE_workshop_salmon_online/lessons/10_FA_over-representation_analysis.html

When I run the cnetplot code on my own data set, I get this warning:

Scale for size is already present.
Adding another scale for size, which will replace the existing scale.

The size legend clearly does not show p-values (the values are integers greater than 1), even though the lesson says it does. The example plot in the lesson also shows integers greater than 1, so those clearly aren't p-values either. The help page for cnetplot does not list categorySize as an argument. If I plot without this argument, the plot looks exactly the same (and, oddly, I still get the warning). The "size" appears to represent the number of significant genes in the GO term.

Additionally there is this warning:

Warning message:
In cnetplot.enrichResult(x, ...) :
  Use 'color.params = list(foldChange = your_value)' instead of 'foldChange'.
 The foldChange parameter will be removed in the next version.

It sounds like aspects of this function have changed since these lessons were written, so the lesson needs updating. Regardless, the lesson text is inaccurate even for the plot image it currently shows:

"Finally, the category netplot shows the relationships between the genes associated with the top five most significant GO terms and the fold changes of the significant genes associated with these terms (color). The size of the GO terms reflects the pvalues of the terms, with the more significant terms being larger. This plot is particularly useful for hypothesis generation in identifying genes that may be important to several of the most affected processes."

This should be changed to reflect that node size actually represents the number of significant genes in the GO terms.

hwick commented 2 months ago

Update: Sadly, the categorySize option appears to be deprecated, and to be honest I'm not sure it ever worked. I have commented on an existing, open issue for this function here on the author's GitHub. However, that issue has been open since 2022. The author did respond when it was first opened but did not address the actual problem. Based on their own learning material, the argument appears never to have worked as intended: in their own examples there is no discernible difference between specifying categorySize="pvalue" and omitting it, and the legend does not change to reflect it either.

I have updated the course material to remove this argument, to fix the foldChange argument that triggers the warning, and to change the text to say that node size appears to reflect the number of genes and not the p-value.
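
For anyone hitting the same warnings, here is a minimal sketch of what a call without categorySize could look like. It is not the exact lesson code: the object names (sig_genes, ego) and the use of the geneList example data from the DOSE package are assumptions made so the snippet stands on its own. The color.params form is taken from the deprecation warning quoted above, and the cnetplot arguments have changed between enrichplot releases, so check ?cnetplot for the version you have installed.

```r
# Sketch only: object names and example data are illustrative assumptions,
# not the lesson's own objects.
library(clusterProfiler)
library(enrichplot)
library(org.Hs.eg.db)

# Example data shipped with DOSE: a named numeric vector of fold changes,
# with Entrez gene IDs as the names.
data(geneList, package = "DOSE")

# Treat genes with a large absolute fold change as the "significant" set.
sig_genes <- names(geneList)[abs(geneList) > 2]

# Over-representation analysis on GO biological process terms.
# readable = TRUE converts Entrez IDs to gene symbols in the result.
ego <- enrichGO(gene     = sig_genes,
                OrgDb    = org.Hs.eg.db,
                keyType  = "ENTREZID",
                ont      = "BP",
                readable = TRUE)

# No categorySize argument. Fold changes are passed through color.params,
# as the deprecation warning suggests (this form is version dependent).
cnetplot(ego,
         showCategory = 5,
         color.params = list(foldChange = geneList))
```

In the resulting plot, the size legend should be read as the number of genes in each term, not as a p-value.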