grst opened this issue 5 years ago (status: Open)
Thanks @grst for the great problem description, I am looking into this. This is probably linked to us changing the default backend to numpy-based optimizers.
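Not a fix yet, but for narrowing it down it's worth recording the exact diffxpy/batchglm versions in each env, since the optimizer backends live in batchglm. A minimal sketch, assuming both packages expose `__version__` (recent releases do):

```python
# Quick sanity check of what each conda env actually runs; assumes
# diffxpy and batchglm expose __version__.
import batchglm
import diffxpy

print("diffxpy :", diffxpy.__version__)
print("batchglm:", batchglm.__version__)
```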
Btw, the example data is also available here as TSV files; that's probably easier for you than running the entire pipeline: https://github.com/grst/benchmark-single-cell-de-analysis/tree/master/diffxpy_test
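Something along these lines should load them with pandas (the file names below are placeholders; check the linked `diffxpy_test` directory for the actual ones):

```python
import pandas as pd

# File names are hypothetical -- see the diffxpy_test directory in the
# repo for the real ones.
counts = pd.read_csv("counts.tsv", sep="\t", index_col=0)    # gene x cell count matrix
coldata = pd.read_csv("coldata.tsv", sep="\t", index_col=0)  # per-cell covariates
print(counts.shape, coldata.shape)
```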
Also, I have the impression that 0.7.1 runs significantly slower than 0.6.13. Is that something you can confirm?
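My impression comes from naive wall-clock timing of the same call run inside each conda env; a sketch (the input file and the `condition` column are placeholders from my setup):

```python
import time

import anndata
import diffxpy.api as de

# Placeholder input -- the benchmark's simulated data in my pipeline.
adata = anndata.read_h5ad("simulated.h5ad")

t0 = time.perf_counter()
test = de.test.wald(
    data=adata,
    formula_loc="~ 1 + condition",
    factor_loc_totest="condition",
    noise_model="nb",
)
print(f"wald test took {time.perf_counter() - t0:.1f} s")
```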
Hi,
while experimenting with diffxpy, I noticed that the results changed since I upgraded from v0.6.13 to v0.7.1. Is that intentional?
The setup:
I added diffxpy to the DE benchmark by Van den Berge et al. (2019).
Under v0.6.13, diffxpy `wald_test` with the `nb` noise model produces results highly comparable to `edgeR` or an NB model from Python statsmodels:

True positive and false positive rate on simulated data at an FDR cutoff of 0.05:
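For reference, the diffxpy side of the benchmark boils down to a call like this (a sketch; it assumes an AnnData object with a `condition` column in `.obs`, which is from my pipeline, not a diffxpy default):

```python
import anndata
import diffxpy.api as de

# Simulated two-group data; the file name is a placeholder.
adata = anndata.read_h5ad("simulated.h5ad")

test = de.test.wald(
    data=adata,
    formula_loc="~ 1 + condition",  # intercept + group effect
    factor_loc_totest="condition",
    noise_model="nb",               # negative binomial, as in edgeR
)

res = test.summary()                # per-gene results incl. pval and qval
called = res[res["qval"] < 0.05]    # genes called DE at FDR 0.05
```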
However, under v0.7.1, the FDR is significantly inflated for `diffxpy`:
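The inflation shows up directly when scoring calls against the simulation's ground truth; the observed FDR and TPR at the 0.05 cutoff are computed in the usual way, e.g.:

```python
import numpy as np

def fdr_tpr(qval, is_de, cutoff=0.05):
    """Observed FDR and TPR at a given adjusted-p-value cutoff.

    qval  -- adjusted p-values per gene
    is_de -- boolean ground-truth DE indicator from the simulation
    """
    called = np.asarray(qval) < cutoff
    is_de = np.asarray(is_de, dtype=bool)
    n_called = called.sum()
    fdr = (called & ~is_de).sum() / n_called if n_called else 0.0
    tpr = (called & is_de).sum() / is_de.sum()
    return fdr, tpr
```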
Availability
The full analysis reports are available here:
The analysis is available at https://github.com/grst/benchmark-single-cell-de-analysis/. Everything is wrapped in a Nextflow pipeline that uses conda envs. Simply running `nextflow run ./benchmark.nf` should reproduce the above reports.