Closed shmohammadi86 closed 5 years ago
Hi shmohammadi86, I am curious about what your research so far shows, as I am just starting to use MAGIC, but have not seen any papers comparing methods, and am also using 10x data. If you could recommend a method for imputation, I would be extremely grateful. Cheers
btw, it says they didn't transform the data, so maybe that's part of the cause. If I were to recommend a sanity check, I would try running it both before and after transforming the data and compare the results.
We L1-normalize the data (library size normalization) and then usually apply a log (or sqrt) transform.
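For anyone who wants to try this, here is a minimal NumPy sketch of that preprocessing. It is not MAGIC's own code. It assumes cells are rows and genes are columns, and the function name `library_size_normalize` and its `target` parameter are made up for illustration.

```python
import numpy as np

def library_size_normalize(counts, target=None):
    """L1-normalize each cell (row) so its counts sum to `target`.

    Assumption: rows are cells, columns are genes. If `target` is None,
    the median library size across cells is used.
    """
    counts = np.asarray(counts, dtype=float)
    libsize = counts.sum(axis=1, keepdims=True)
    if target is None:
        target = np.median(libsize)
    # Avoid dividing by zero for cells with no counts at all.
    safe = np.where(libsize == 0, 1.0, libsize)
    return counts / safe * target

# Toy count matrix: 3 cells x 3 genes.
counts = np.array([[1, 3, 0],
                   [10, 30, 0],
                   [0, 2, 2]])

norm = library_size_normalize(counts)

# Either transform can follow; sqrt needs no pseudocount.
sqrt_data = np.sqrt(norm)
log_data = np.log1p(norm)  # log(1 + x), one common choice of pseudocount
```

After normalization, the first two cells have identical profiles even though their raw depths differ tenfold, which is the point of dividing by the sum.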
Hi, great work! Nice, precise, detailed and clean paper.
Can I ask what criteria you use to choose between a log, sqrt, or no transform of the data? When would you consider the distribution of gene expression to be extreme?
Thanks a lot in advance
@shmohammadi86 I've never used norm-infinity normalization. We (and the field) generally normalize by the sum and not max.
Sqrt is convenient because it doesn't require a pseudocount (e.g. log(x + 0.1)), as sqrt(0) = 0.
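A quick way to see the pseudocount point is to apply both transforms to a few values including zero. This is only an illustration with made-up numbers; the 0.1 and 1.0 pseudocounts are examples, not recommendations.

```python
import numpy as np

x = np.array([0.0, 1.0, 10.0, 100.0])

sqrt_x = np.sqrt(x)           # zero stays exactly zero
log_small = np.log(x + 0.1)   # zero maps to log(0.1), about -2.30
log_one = np.log(x + 1.0)     # zero maps to log(1) = 0

# The spread between zero and nonzero values depends on the pseudocount.
gap_small = log_small[1] - log_small[0]
gap_one = log_one[1] - log_one[0]
```

With the log, the distance between a zero and a count of one changes with the pseudocount you pick, whereas sqrt has no such free parameter.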
I would always use a log-type transform (including sqrt).
Hi,
I am comparing MAGIC with a few different methods and I get unsatisfactory results in all benchmarks, which made me wonder if I am doing something wrong. I mask parts of the expression matrix, predict them using MAGIC, and then compare the predictions with the known values that were masked. I log-transformed the expression data (log2(1+x), so all values are non-negative), then norm-infinity normalized the results (so the max value in each column is one). I used the default parameters (k = 30; ka = 10; npca = 20), and I increased t from 6 up to 100 (6 is absolutely terrible; by 100 it gets more reasonable). Still, I get bad correlation/relative error for MAGIC's predictions compared to the true values that were masked out in cross-validation. One of the datasets I tried this on is https://support.10xgenomics.com/single-cell-gene-expression/datasets/pbmc4k. Oh, and since my normalization preserves positivity, I rescaled to 99% (which improved the results partially, but not enough).
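For reference, the masking benchmark described above could be sketched roughly as follows. This is my reading of the setup, not the exact code that was run. All function names are hypothetical, cells are assumed to be rows, and `column_mean_impute` is a trivial stand-in so the sketch runs without MAGIC installed; the imputation method under test would be swapped in at that step.

```python
import numpy as np

def mask_entries(X, frac=0.1, seed=0):
    """Set a random fraction of the nonzero entries to zero.

    Returns the masked copy and a boolean mask of the held-out entries.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    rows, cols = np.nonzero(X)
    n_mask = max(1, int(frac * rows.size))
    pick = rng.choice(rows.size, size=n_mask, replace=False)
    mask = np.zeros(X.shape, dtype=bool)
    mask[rows[pick], cols[pick]] = True
    X_masked = X.copy()
    X_masked[mask] = 0.0
    return X_masked, mask

def evaluate(X_true, X_imputed, mask):
    """Pearson correlation and relative L2 error on held-out entries only."""
    truth = np.asarray(X_true, dtype=float)[mask]
    pred = np.asarray(X_imputed, dtype=float)[mask]
    corr = np.corrcoef(truth, pred)[0, 1]
    rel_err = np.linalg.norm(pred - truth) / np.linalg.norm(truth)
    return corr, rel_err

def column_mean_impute(X_masked):
    """Placeholder imputer (NOT MAGIC): predict each gene's mean."""
    means = X_masked.mean(axis=0)
    return np.tile(means, (X_masked.shape[0], 1))

# Synthetic counts stand in for a real dataset such as pbmc4k.
rng = np.random.default_rng(1)
X = rng.poisson(2.0, size=(50, 20)).astype(float)

X_masked, mask = mask_entries(X, frac=0.1)
X_imputed = column_mean_impute(X_masked)
corr, rel_err = evaluate(X, X_imputed, mask)
```

Whatever preprocessing is applied to the masked matrix before imputation should also be applied to the held-out truth, otherwise the two are compared on different scales.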