LTLA / scRNAseq

Clone of the Bioconductor repository for the scRNAseq package.
http://bioconductor.org/packages/devel/data/experiment/html/scRNAseq.html

Adding data set using data from secondary source #23

Closed: twillis209 closed this issue 3 years ago

twillis209 commented 3 years ago

First: thanks for creating and maintaining this package, it's a great help.

I would like to add a data set from Patel et al. 2014 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4123637/). This is a rather interesting one, comprising:

So far so good, but the problem is that the authors of the original publication made only log2(TPM+1) values available on GEO, not raw counts. For my own purposes, I have been using counts generated by Risso et al. for their publication on ZINB-WaVE (https://www.nature.com/articles/s41467-017-02554-5). These data are currently hosted in a GitHub repo that Risso published so that others can reproduce the ZINB-WaVE work. I call them 'secondary' in the sense that they do not originate from the original publication by Patel et al.

Would you accept a pull request adding these count data to the package?

LTLA commented 3 years ago

Funny you say that, because the original purpose of this package was... to serve up count matrices generated by @drisso! Note the ReprocessedAllenData() and friends - I'm sure one could fit in another one of the same nature.

twillis209 commented 3 years ago

Excellent. Davide was kind enough to provide these counts on request in the summer, so I will endeavour to pay that forward.