mhammell-laboratory / TEtranscripts

A package for including transposable elements in differential enrichment analysis of sequencing datasets.
http://hammelllab.labsites.cshl.edu/software/#TEtranscripts
GNU General Public License v3.0

Question about log2FoldChange and lfcSE #130

Closed jordana-olive closed 1 year ago

jordana-olive commented 1 year ago

Hi guys!

I noticed that the log2FoldChange values in my DE results were all positive. That would mean every gene in my treatment was upregulated, which is almost impossible. So I re-ran TEtranscripts with my treatment and control samples swapped. The log2FoldChange values stayed the same, but the lfcSE values flipped sign, as I would expect when the controls are treated as the treatment group. So, in this table, is lfcSE the "log2FoldChange" to use in heatmaps and other visualizations? Other programs usually report log2FoldChange with negative and positive values (down and up). I just want to make sure. Thanks, all.

my treatment x control

baseMean log2FoldChange lfcSE
rnd-6_family-5526:RTE-RTE:LINE 141.616299308822 -1.91199541340198
rnd-6_family-6311:hAT-hATm:DNA 534.516934172654 -0.969475940495322
rnd-6_family-6613:CMC-EnSpm:DNA 152.682955273522 1.60269313891576
rnd-6_family-6766:hAT-Ac:DNA 50.2314332984791 -1.57787065728243
rnd-6_family-695:Helitron:RC 962.435704164596 5.24995285157214
rnd-6_family-7165:MULE-MuDR:DNA 287.580193978893 -1.34956766678082
rnd-6_family-827:PiggyBac:DNA 655.134969973491 5.22110352909539
rnd-6_family-8515:MULE-MuDR:DNA 29.9696577013249 -2.59730485104513
rnd-6_family-925:TcMar-Tc1:DNA 178.460663358603 -1.24871732262877

control as treatment (as expected, the lfcSE is inverted)

baseMean log2FoldChange lfcSE
rnd-6_family-5526:RTE-RTE:LINE 141.616299308822 1.9119969205912
rnd-6_family-6311:hAT-hATm:DNA 534.516934172654 0.969476688719033
rnd-6_family-6613:CMC-EnSpm:DNA 152.682955273522 -1.60269191417032
rnd-6_family-6766:hAT-Ac:DNA 50.2314332984791 1.57787217150081
rnd-6_family-695:Helitron:RC 962.435704164596 -5.24995096346642
rnd-6_family-7165:MULE-MuDR:DNA 287.580193978893 1.34956856139809
rnd-6_family-827:PiggyBac:DNA 655.134969973491 -5.22110304763185
rnd-6_family-8515:MULE-MuDR:DNA 29.9696577013249 2.59730686899325
rnd-6_family-925:TcMar-Tc1:DNA 178.460663358603 1.24871797598988

olivertam commented 1 year ago

Hi,

Thank you for your interest in the software. If you look at the full table, you will note that there are 6 column names in the first row, but 7 columns in each subsequent row. This is because R treats the row names of a table object as their own column, without a column header. Thus, the column that you perceive as log2FoldChange is actually baseMean, which should always be positive, and the column where the numbers flip sign depending on your run is the real log2FoldChange column. In fact, if you load this table in R using the following code:

data <- read.table("cntTable", sep = "\t", row.names = 1, header = TRUE)

You will find that the log2FoldChange column will be correctly assigned and thus usable for plotting.
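As a minimal sketch of the header offset described above, the snippet below writes a toy tab-separated file whose header has one fewer field than its data rows, then loads it with the same `read.table` arguments. The file is a temporary one created only for illustration, it keeps just two of the real columns, and the values are the first three rows quoted in the question; the actual output file has more columns.

```r
# Toy file (hypothetical, for illustration only): the header row has no
# entry for the row names, so it is one field shorter than each data row.
toy_lines <- c(
  "baseMean\tlog2FoldChange",
  "rnd-6_family-5526:RTE-RTE:LINE\t141.616299308822\t-1.91199541340198",
  "rnd-6_family-6311:hAT-hATm:DNA\t534.516934172654\t-0.969475940495322",
  "rnd-6_family-6613:CMC-EnSpm:DNA\t152.682955273522\t1.60269313891576"
)
toy_file <- tempfile(fileext = ".txt")
writeLines(toy_lines, toy_file)

# Count tab-separated fields on each line to expose the mismatch
# between the header row and the data rows.
n_fields <- count.fields(toy_file, sep = "\t")
print(n_fields)

# Load with the first column as row names, so the header names line up
# with the numeric columns they actually describe.
de <- read.table(toy_file, sep = "\t", row.names = 1, header = TRUE)
print(colnames(de))
print(de)
```

After loading this way, `de$baseMean` holds the positive mean counts and `de$log2FoldChange` holds the signed values, which are the ones to use for heatmaps and other plots.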

Please let me know if that doesn't address your question. Thanks.

jordana-olive commented 1 year ago

Oh my gosh! Brilliant! Now that makes sense. I should have paid attention to that. Yes, DE tables usually start with baseMean. Thank you very much for clarifying that. I hope to cite your paper soon :D
