Russel88 / DAtest

Compare different differential abundance and expression methods
GNU General Public License v3.0

Clarification on terminology of "raw counts" vs. "absolute abundance" #14

Closed teyden closed 2 years ago

teyden commented 4 years ago

Hi,

I saw that it is recommended we provide either (1) raw counts & compositional data, or (2) externally normalized or absolute abundance data.

Perhaps I have been using the terminology differently this entire time, but my understanding is that raw counts are equivalent to absolute abundances. For microbiome data in particular, where each sample represents a snapshot of the entire microbiome, a true "absolute abundance" is technically impossible to obtain. I had come to understand that the raw counts are treated as the absolute abundances of the sample, and that relative abundance data are proportion-normalized abundances.

Could I please get clarification on this? I'd like to make sure I am using the package correctly. Thanks a bunch for all the effort you have put into making this package so convenient!

Russel88 commented 4 years ago

Hi teyden

I can understand why this is confusing. By absolute abundances I mean actual absolute abundances. They are not technically impossible to obtain if you have external data, such as qPCR or flow cytometry counts. To me it is confusing to call raw counts absolute abundances, because they are still relative, even though they have not been normalised or scaled.

So when relative=TRUE, DAtest normalizes/scales the data internally, so raw counts are expected. When relative=FALSE, there is no internal normalization/scaling, so absolute abundances (in the true sense) or pre-normalized data are expected.
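To make the distinction concrete, here is a minimal sketch in plain Python (not DAtest code, and not the package's internal normalization). All numbers, variable names, and the two helper functions are hypothetical and chosen only for illustration. It shows two samples with different raw counts but the same proportions, and how an assumed external total-load measurement (standing in for something like qPCR or flow cytometry) turns those proportions into absolute abundances.

```python
# Hypothetical numbers for illustration only; not from DAtest or any real dataset.

def to_proportions(counts):
    """Total-sum scaling: divide each count by the sample's total read count."""
    total = sum(counts)
    return [c / total for c in counts]

def to_absolute(counts, total_load):
    """Scale proportions by an externally measured total load for the sample."""
    return [p * total_load for p in to_proportions(counts)]

# Two samples with the same community composition, sequenced to different depths.
raw_a = [100, 300, 600]       # 1,000 reads in total
raw_b = [1000, 3000, 6000]    # 10,000 reads in total

# Assumed external measurements: sample A actually contains far more cells.
load_a = 2_000_000
load_b = 500_000

props_a = to_proportions(raw_a)
props_b = to_proportions(raw_b)
abs_a = to_absolute(raw_a, load_a)
abs_b = to_absolute(raw_b, load_b)

# Raw counts differ only because of sequencing depth; the proportions agree.
print("proportions A:", props_a)
print("proportions B:", props_b)

# With the external loads, taxon 1 is more abundant in A, although its raw
# count is ten times lower there.
print("absolute A:", abs_a)
print("absolute B:", abs_b)
```

In terms of the description above, tables like `raw_a` and `raw_b` are what relative=TRUE expects, while tables like `abs_a` and `abs_b` (or otherwise pre-normalized data) are what relative=FALSE expects.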

I hope it makes sense.

Cheers, Jakob

teyden commented 4 years ago

Hi Jakob,

Ah, that makes sense, then. Thanks for clarifying!
