ggloor / ALDEx2_dev

ALDEx tool to examine compositional high-throughput sequence data with Welch's t-test
GNU Affero General Public License v3.0

ALDEx2 clr transformation #23

Closed miguensblanco closed 2 years ago

miguensblanco commented 3 years ago

Hi,

Is it possible to take the data frame produced by the aldex.clr function and use it in different downstream analyses? I ask because, to do so, I removed the conds argument, since I do not want to introduce any metadata, and that raises two warnings: `no conditions provided: forcing denom = 'all'` and `no conditions provided: forcing conds = 'NA'`.

I think this is fine, as long as I accept that it will simply use all as the denom.

Thanks and great tool!!

JMB

ggloor commented 3 years ago

Hi, it is certainly possible, and easy, to do what you want. The conds argument with denom=all will give you the data that you want. There are a number of getters for this, and you might find the following helpful (I think you want slot 8).

The aldex.clr function outputs a data structure that summarizes the Monte-Carlo replicates in 8 slots.

Slot 1 reads contains a data frame of the initial read data, with features that have 0 counts across all samples removed. A prior of 0.5 has also been added to every value in this data frame.

Slot 2 conds contains the conditions of the experiment. This slot is a vector when a simple pairwise comparison is conducted, and a model matrix when a glm is being conducted.

Slot 3 mc.samples contains an integer with the number of Monte-Carlo replicates.

Slot 4 denom contains a vector with the offsets of the features used for the denominator of the log-ratio. This is all the features when a centred log-ratio is used, and a subset otherwise. The user can supply their own offsets if desired when aldex.clr is run.

Slot 5 verbose is a logical indicating whether the function was run in verbose mode.

Slot 6 useMC is a logical indicating whether the function was run in multi-core mode.

Slot 7 dirichletData is a list, where each element of the list contains a matrix of the Dirichlet replicates for each sample. Features are by row, and Dirichlet Monte-Carlo replicates are by column.

- get the entirety of slot 7 with `getDirichletInstances`
- get the MC instances for each sample of slot 7 with `getDirichletReplicate`
- get an individual MC instance of slot 7 across all samples with `getDirichletSample`
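
As a rough sketch of how the three slot 7 getters differ, the following uses the `selex` example dataset that ships with ALDEx2. The `conds` labels, the variable names and the choice of index `1` are only illustrative, and the shapes noted in the comments follow the slot description above rather than a guarantee about every ALDEx2 version.

```r
library(ALDEx2)

# Example data bundled with ALDEx2: features in rows, 14 samples in columns
data(selex)
conds <- c(rep("NS", 7), rep("S", 7))  # illustrative condition labels

x <- aldex.clr(selex, conds, mc.samples = 128, denom = "all")

# Whole of slot 7: a list with one matrix per sample
dir_all <- getDirichletInstances(x)

# All Dirichlet MC instances for the first sample (features x MC instances)
dir_sample1 <- getDirichletReplicate(x, 1)

# The first MC instance across all samples (features x samples)
dir_instance1 <- getDirichletSample(x, 1)

dim(dir_sample1)
dim(dir_instance1)
```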

Slot 8 analysisData is a list, where each element of the list contains a matrix of the log-ratio transformed Dirichlet replicates for each sample. Features are by row, and Dirichlet Monte-Carlo replicates are by column.

- get the entirety of slot 8 with `getMonteCarloInstances(x)`
- get the MC instances for each sample of slot 8 with `getMonteCarloReplicate`
- get an individual MC instance of slot 8 across all samples with `getMonteCarloSample`
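
For downstream analyses that expect a single features-by-samples table, one option is to average the log-ratio values over the Monte-Carlo instances. This is a minimal sketch and not something prescribed in this thread: the per-feature mean is an assumed summary, the `conds` labels are illustrative, and `clr_mean` and `clr_df` are hypothetical names.

```r
library(ALDEx2)

# Example data bundled with ALDEx2: features in rows, 14 samples in columns
data(selex)
conds <- c(rep("NS", 7), rep("S", 7))  # illustrative condition labels

x <- aldex.clr(selex, conds, mc.samples = 128, denom = "all")

# Whole of slot 8: a list with one log-ratio matrix per sample
mc_all <- getMonteCarloInstances(x)

# All log-ratio MC instances for the first sample (features x MC instances)
mc_sample1 <- getMonteCarloReplicate(x, 1)

# The first MC instance across all samples (features x samples)
mc_instance1 <- getMonteCarloSample(x, 1)

# Assumed summary: mean log-ratio value per feature within each sample,
# giving one features x samples matrix
clr_mean <- sapply(mc_all, rowMeans)
clr_df <- as.data.frame(clr_mean)

dim(clr_df)
```

Averaging discards the spread across instances. If that uncertainty matters downstream, an alternative is to repeat the analysis on several `getMonteCarloSample(x, i)` matrices and compare the results.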