owkin / PyDESeq2

A Python implementation of the DESeq2 pipeline for bulk RNA-seq DEA.
https://pydeseq2.readthedocs.io/en/latest/
MIT License

Poscount implementation & diag(XXT) optimization #284

Closed asistradition closed 3 months ago

asistradition commented 3 months ago

Thank you for implementing DESeq2 in Python. I have been using it for single-cell data.

Included is an implementation of the poscounts size_factor method from DESeq2 (R).
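
For readers unfamiliar with the method, here is a minimal NumPy sketch of how a poscounts-style size factor can be computed. It is not the code from this PR. It assumes a samples x genes count matrix and follows my understanding of the R logic: a per-gene geometric mean that sums the logs of the positive counts but divides by the total number of samples, a per-sample median of log ratios over that sample's positive counts, and a final rescaling so the size factors have geometric mean 1. The function name `poscounts_size_factors` is hypothetical.

```python
import numpy as np


def poscounts_size_factors(counts):
    """Sketch of poscounts-style size factors (hypothetical helper).

    counts: array-like of shape (n_samples, n_genes), non-negative.
    Assumes every sample has at least one positive count.
    """
    counts = np.asarray(counts, dtype=float)
    n_samples = counts.shape[0]
    positive = counts > 0

    # log(counts) where counts > 0; entries for zero counts stay at 0.
    log_counts = np.zeros_like(counts)
    np.log(counts, out=log_counts, where=positive)

    # Per-gene log geometric mean: sum of logs of the positive counts,
    # divided by the total number of samples (zeros still count in the
    # denominator).
    log_geo_means = log_counts.sum(axis=0) / n_samples

    size_factors = np.empty(n_samples)
    for i in range(n_samples):
        # Use only the genes that are positive in this sample.
        mask = positive[i]
        log_ratios = log_counts[i, mask] - log_geo_means[mask]
        size_factors[i] = np.exp(np.median(log_ratios))

    # Rescale so that the geometric mean of the size factors is 1.
    return size_factors / np.exp(np.mean(np.log(size_factors)))


# Toy example: the second sample is sequenced twice as deeply.
example = poscounts_size_factors([[1, 2, 3], [2, 4, 6]])
```

Unlike the default median-of-ratios method, this does not require any gene to be nonzero in every sample, which is why it is attractive for sparse single-cell counts.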

Also included is an enhancement to irls_solver that replaces the full matrix multiplication used to obtain diag(X (X^T X)^-1 X^T) with a more memory- and runtime-efficient calculation of only the diagonal values. The result is identical.
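
To illustrate the idea (a sketch only, not the exact code in irls_solver, which may also involve weights or a ridge term): the i-th diagonal entry of X (X^T X)^-1 X^T is the quadratic form x_i^T (X^T X)^-1 x_i, so it can be computed row by row without ever forming the n x n matrix. The helper name `hat_diagonal` is hypothetical.

```python
import numpy as np


def hat_diagonal(X):
    """Diagonal of X (X^T X)^-1 X^T without building the n x n matrix."""
    XtX_inv = np.linalg.inv(X.T @ X)  # small (p x p) matrix
    # Row-wise quadratic form: h_i = x_i^T (X^T X)^-1 x_i.
    return np.einsum("ij,jk,ik->i", X, XtX_inv, X)


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))

fast = hat_diagonal(X)
# Naive version for comparison: O(n^2) memory for the full matrix.
slow = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)
```

For n samples and p coefficients, the naive version allocates an n x n array, while the row-wise version needs only O(n p) memory.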

asistradition commented 3 months ago

I am also adding an optimization for wald_test that replaces the diagonal weight matrix W with a broadcast vector multiplication.
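
A small sketch of the equivalence, assuming the product in question has the form X^T W X with W = diag(w). The exact expression inside wald_test is not shown in this thread, and `weighted_gram` is a hypothetical name.

```python
import numpy as np


def weighted_gram(X, w):
    """Compute X^T diag(w) X using broadcasting instead of an n x n W."""
    # X.T has shape (p, n); multiplying by w (shape (n,)) scales column j
    # of X.T, i.e. row j of X, by w[j].
    return (X.T * w) @ X


rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
w = rng.uniform(0.1, 2.0, size=100)

broadcast = weighted_gram(X, w)
# Reference: explicit diagonal matrix, which costs O(n^2) memory.
explicit = X.T @ np.diag(w) @ X
```

Both give the same p x p matrix, but the broadcast form never materializes the n x n diagonal matrix.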

asistradition commented 3 months ago

Done. Changed the poscounts method here to match DESeq2 1.34 to float precision and added unit tests for fit_size_factors.