Closed WisamSaleem closed 4 years ago
Hi Wisam,
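Sure :) glmGamPoi is built around the experience of the last ten years with tools mainly designed for bulk RNA-seq, namely edgeR (https://bioconductor.org/packages/edgeR/) and DESeq/DESeq2 (https://bioconductor.org/packages/DESeq2/). The most relevant papers are:
In addition, the implementation leverages on-disk data with the DelayedArray (https://bioconductor.org/packages/DelayedArray/) and beachmat (https://bioconductor.org/packages/beachmat/) packages: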
Furthermore, I recently started to work on differential testing using the quasi-likelihood framework. The most important paper here is:
In the beginning, I also used a formulation by Bandara et al. to speed up the inference of the overdispersion parameters. However, I recently refactored the code to make it easier to maintain. The new version uses something similar to a run-length encoding for the counts, which brings the same performance boost as Bandara et al.'s formulation.
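To illustrate the general idea of tabulating repeated counts (this is only a minimal sketch, not glmGamPoi's actual code): the negative binomial log-likelihood contains a term of the form sum of lgamma(y_i + 1/theta) over all observations, where theta is the overdispersion. Because count data usually contain many repeated values (especially zeros), that term can be evaluated once per distinct count and weighted by its frequency. The function names and the example data below are hypothetical, and a plain frequency table is assumed here as a stand-in for the "something similar to a run-length encoding" mentioned above.

```python
import math
from collections import Counter


def lgamma_sum_naive(counts, theta):
    """Sum of lgamma(y + 1/theta), one lgamma call per observation."""
    r = 1.0 / theta
    return sum(math.lgamma(y + r) for y in counts)


def lgamma_sum_tabulated(counts, theta):
    """Same sum, but with one lgamma call per distinct count value."""
    r = 1.0 / theta
    table = Counter(counts)  # maps count value -> number of occurrences
    return sum(n * math.lgamma(y + r) for y, n in table.items())


# Hypothetical sparse count vector, as is typical for single-cell data.
counts = [0, 0, 0, 0, 1, 0, 2, 0, 0, 1, 0, 0, 5, 0, 1]
theta = 0.3  # assumed overdispersion value, for illustration only

naive = lgamma_sum_naive(counts, theta)
tabulated = lgamma_sum_tabulated(counts, theta)
```

Both functions return the same value up to floating-point rounding, but the tabulated version needs only as many lgamma evaluations as there are distinct counts (4 here instead of 15). This matters when the term is re-evaluated many times while optimizing theta.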
If you are also interested in the statistical background, I learned a lot about generalized linear models from
For a more high-level overview and introduction to the topic, I would recommend taking a look at Modern Statistics for Modern Biology (https://www.huber.embl.de/msmb/) by Wolfgang Huber (my boss :D) and Susan Holmes. Chapter 2 (https://www.huber.embl.de/msmb/Chap-CountData.html) specifically talks about handling high-throughput count data.
I hope the above list is helpful. If you have any more questions or are curious about a specific topic, just let me know :)
Great!
Thanks a lot, Constantin.
I am working on a comparison of several count data models, i.e. Poisson, quasi-Poisson, and NB with its different varieties such as ZINB. I am trying to investigate how well these models work for modelling microbial data. I am familiar with DESeq2 and use it quite often.
Wish you a happy weekend
Wisam
From: Constantin notifications@github.com Sent: 29 May 2020 13:03 To: const-ae/glmGamPoi Cc: Wisam Tariq Saleem; Author Subject: Re: [const-ae/glmGamPoi] Research paper (#2)
Great, yes, I think good comparisons are of high interest. I am not an expert in microbial count data, but from what I have heard, it has some of the same challenges as single-cell data, so I would be curious to hear how glmGamPoi is doing.
I will close this issue for now, but feel free to reopen if anything else comes up.
Best, Constantin
Hi. Could you help me with some references from books or research papers that glmGamPoi is built upon? Thanks