ggloor / ALDEx_bioc

ALDEx_bioc is the working directory for updating bioconductor

Asymmetric datasets #69

Open vyom84 opened 9 months ago

vyom84 commented 9 months ago

Hi, thanks for developing ALDEx2, which I now use for all my analyses in place of the older relative-abundance-based approach. I have a question about asymmetry in my data. I have two datasets (three groups) of deep metatranscriptomics data, generated in two separate sequencing runs. The first run went well, and most samples were sequenced to roughly the same depth (N=77). The second run was uneven: some samples were sequenced at very high depth and others at low depth (N=60, N=90).

So, I kept only the genes that are present in 70% of both datasets and removed everything else, to avoid further issues from the many 0 values in the removed genes. I ran ALDEx2 on the filtered data, and the effect sizes range from -1.7 to +2, with many values at 0.
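A minimal base R sketch of the kind of prevalence filter described above. It assumes "present in 70% of both datasets" means a non-zero count in at least 70% of the samples of each run; the `counts` matrix, the `dataset` vector, and the toy values are hypothetical and are not the poster's data or code.

```r
# Hypothetical toy data: genes in rows, samples in columns.
set.seed(1)
counts <- matrix(rpois(6 * 10, lambda = 2), nrow = 6,
                 dimnames = list(paste0("gene", 1:6), paste0("s", 1:10)))
counts["gene1", ] <- 0L       # never detected
counts["gene2", 1:5] <- 0L    # absent from the first run only
counts["gene3", ] <- 5L       # detected in every sample

# Assumed bookkeeping vector: which run each sample (column) came from.
dataset <- rep(c("run1", "run2"), each = 5)

# Fraction of samples with a non-zero count, computed separately per run.
# The result is a genes x runs matrix.
prevalence <- sapply(split(seq_len(ncol(counts)), dataset), function(idx) {
  rowMeans(counts[, idx, drop = FALSE] > 0)
})

# Keep genes detected in at least 70% of the samples of every run.
keep <- apply(prevalence >= 0.7, 1, all)
filtered <- counts[keep, , drop = FALSE]
```

Filtering per run rather than on the pooled samples is a design choice here: a gene that is common in one run and missing in the other is dropped, which matches the stated goal of avoiding genes with many zeros in either dataset.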

I read the Troubleshooting Datasets section and tried several scale settings using different log (1.01) values, but the median effect got even worse (it went from 0.3 to 2.2). Any suggestions?

mu.vec = c(log2(rep(1,10)), log2(rep(1.02,10)))
scale_samples <- aldex.makeScaleMatrix(gamma=0.25, mu=mu.vec, conditions=blocks)
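One thing worth checking in the snippet above is that `mu.vec` lines up with `blocks`. The base R sketch below derives the vector from the condition labels instead of hard-coding `rep(..., 10)`. It assumes one `mu` entry per sample is intended, which should be confirmed against the ALDEx2 documentation; the labels in `blocks` and the `group_scale` lookup are hypothetical.

```r
# Hypothetical condition labels, one per sample, in column order.
blocks <- c(rep("A", 4), rep("B", 6))

# Assumed per-group scale on the natural scale (1.02 = a 2% difference).
group_scale <- c(A = 1, B = 1.02)

# Look up each sample's group scale and move to the log2 scale, so the
# vector always has the same length and order as `blocks`.
mu.vec <- log2(unname(group_scale[blocks]))

# Guard against a length mismatch before building the scale matrix.
stopifnot(length(mu.vec) == length(blocks))

# The call from the post would then be made with this vector (not run here):
# scale_samples <- aldex.makeScaleMatrix(gamma=0.25, mu=mu.vec, conditions=blocks)
```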

Thank You