const-ae / glmGamPoi

Fit Gamma-Poisson Generalized Linear Models Reliably

Research paper #2

Closed WisamSaleem closed 4 years ago

WisamSaleem commented 4 years ago

Hi. Could you help me with some references from books or research papers that glmGamPoi is built upon? Thanks

const-ae commented 4 years ago

Hi Wisam,

sure :) glmGamPoi is built on the experience of the last ten years with tools mainly designed for bulk RNA-seq, namely edgeR (https://bioconductor.org/packages/edgeR/) and DESeq/DESeq2 (https://bioconductor.org/packages/DESeq2/). The most relevant papers are:

In addition, the implementation leverages on-disk data with the DelayedArray (https://bioconductor.org/packages/DelayedArray/) and beachmat (https://bioconductor.org/packages/beachmat/) packages:

Furthermore, I recently started to work on differential testing using the quasi-likelihood framework. The most important paper here is:

In the beginning, I also used

to speed up the inference of the overdispersion parameters. However, I recently refactored the code to make it easier to maintain. The new version uses something similar to a run-length encoding for the counts, which brings the same performance boost as Bandara et al.'s formulation.
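To illustrate why an encoding of repeated counts can help, here is a minimal Python sketch. It is not glmGamPoi's actual code: the function names are hypothetical, and it assumes that "something similar to a run-length encoding" means tabulating each distinct count value together with how often it occurs. The expensive part of the Gamma-Poisson log-likelihood with respect to the overdispersion `theta` is a sum of `lgamma(y + 1/theta)` terms, and sparse count data contain many repeated values (mostly zeros), so each distinct value only needs to be evaluated once.

```python
import math
from collections import Counter


def lgamma_sum_naive(counts, theta):
    """Sum lgamma(y + 1/theta) with one lgamma call per observation."""
    r = 1.0 / theta
    return sum(math.lgamma(y + r) for y in counts)


def lgamma_sum_tabulated(counts, theta):
    """Same sum, but with one lgamma call per distinct count value.

    Assumption: the counts are first compressed into a
    (value -> frequency) table, which is cheap and can be reused
    for every candidate theta during optimisation.
    """
    r = 1.0 / theta
    table = Counter(counts)
    return sum(n * math.lgamma(value + r) for value, n in table.items())


# Hypothetical sparse counts for a single gene
example_counts = [0, 0, 0, 1, 0, 2, 1, 0, 0, 5, 0, 0]
naive = lgamma_sum_naive(example_counts, theta=0.5)
tabulated = lgamma_sum_tabulated(example_counts, theta=0.5)
```

Both functions return the same value up to floating-point rounding. The tabulated version needs four `lgamma` evaluations here (for the values 0, 1, 2 and 5) instead of twelve, and the saving grows with the number of repeated counts.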

If you are also interested in the statistical background, I learned a lot about generalized linear models from

For a more high-level overview and introduction to the topic, I would recommend taking a look at Modern Statistics for Modern Biology (https://www.huber.embl.de/msmb/) by Wolfgang Huber (my boss :D) and Susan Holmes. Chapter 2 (https://www.huber.embl.de/msmb/Chap-CountData.html) specifically talks about handling high-throughput count data.


I hope the above list is helpful. If you have any more questions or are curious about a specific topic, just let me know :)

WisamSaleem commented 4 years ago

Great!

Thanks a lot, Constantin.

I am working on a comparison of several count data models, i.e. Poisson, quasi-likelihood, and NB with its variants such as ZINB. I am trying to investigate how well these models fit microbial data. I am familiar with DESeq2 and use it quite often.
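As a rough illustration of such a comparison, the following Python sketch fits a Poisson and an NB (Gamma-Poisson) model to a single vector of counts and compares their log-likelihoods. Everything here is an assumption made for the example: the counts are invented, the function names are hypothetical, and the overdispersion is estimated with a simple method-of-moments formula rather than the estimators used by DESeq2 or glmGamPoi. The NB is parameterised by mean `mu` and overdispersion `theta`, so that the variance is `mu + theta * mu**2`.

```python
import math
import statistics


def poisson_loglik(counts, mu):
    """Log-likelihood of the counts under Poisson(mu)."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y in counts)


def nb_loglik(counts, mu, theta):
    """Log-likelihood under a Gamma-Poisson (NB) model with
    mean mu and variance mu + theta * mu**2."""
    r = 1.0 / theta
    return sum(
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu))
        for y in counts
    )


def moment_overdispersion(counts, floor=1e-8):
    """Method-of-moments estimate theta = (var - mean) / mean**2,
    floored at a small positive value (an illustrative choice)."""
    mean = statistics.mean(counts)
    var = statistics.variance(counts)
    return max((var - mean) / mean**2, floor)


# Invented, strongly overdispersed counts (many zeros, a few large values)
counts = [0, 0, 0, 1, 0, 12, 0, 3, 0, 25, 0, 0, 2, 0, 7]
mu_hat = statistics.mean(counts)
theta_hat = moment_overdispersion(counts)

ll_poisson = poisson_loglik(counts, mu_hat)
ll_nb = nb_loglik(counts, mu_hat, theta_hat)
```

For counts like these, where the variance is far larger than the mean, `ll_nb` is higher than `ll_poisson`. A real comparison would also penalise the extra parameter (for example with AIC) and fit the models per feature with covariates.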

Wish you a happy weekend

Wisam



const-ae commented 4 years ago

Great, yes, I think good comparisons are of high interest. I am not an expert on microbial count data, but from what I have heard, it has some of the same challenges as single-cell data, so I would be curious to hear how glmGamPoi performs on it.

I will close this issue for now, but feel free to reopen if anything else comes up.

Best, Constantin