GreenleafLab / chromVAR

chromatin Variability Across Regions (of the genome!)
https://greenleaflab.github.io/chromVAR/

GSE99172 data cell-by-peaks and labels #55

Open znavidi opened 5 years ago

znavidi commented 5 years ago

Hi,

Thanks for your practical tool! I am working on single-cell ATAC-seq data and would like to classify cells by cell type. The only single-cell ATAC-seq dataset with ground-truth labels that I have found is GSE99172 (I would be glad to hear of any others), and I need its labels to evaluate my classification. Could you provide the cell-by-peak file for this dataset along with its labels?

Best

znavidi commented 5 years ago

I would really appreciate it if someone could help me with this.

Best

AliciaSchep commented 5 years ago

Hi @znavidi, it looks like the dataset you can download from GSE99172 has column names that are either the GEO accession of the previously published data or "NEW" for K562 cells from that particular Series. To get the cell types of the other cells, you can look up each GEO accession, for example using GEOquery:

library(GEOquery)
g <- getGEO("GSM1596955")
Meta(g) #Cell type included in "title" and also "characteristics_ch1"
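
To apply the same lookup to every column at once, something like the sketch below might work. It is only an illustration under several assumptions: `label_columns` and `fetch_title_from_geo` are hypothetical helper names, the column names are assumed to be exactly a GSM accession or the string "NEW" (the real file may carry suffixes that need stripping first), and the returned "title" field will likely still need parsing to extract a clean cell-type label.

```r
# Hypothetical helper: look up the "title" metadata field for one GSM accession.
# Requires the Bioconductor package GEOquery and network access.
fetch_title_from_geo <- function(accession) {
  gsm <- GEOquery::getGEO(accession)
  GEOquery::Meta(gsm)$title
}

# Hypothetical helper: map matrix column names to labels.
# Assumption: each column name is either a GSM accession or "NEW".
# `fetch_title` is a parameter so the lookup can be cached or swapped out.
label_columns <- function(col_names,
                          fetch_title = fetch_title_from_geo,
                          new_label = "K562") {
  labels <- rep(NA_character_, length(col_names))

  is_gsm <- grepl("^GSM[0-9]+$", col_names)
  is_new <- col_names == "NEW"

  # Columns marked "NEW" are the K562 cells from this Series.
  labels[is_new] <- new_label

  # Query each distinct accession only once, then spread results to columns.
  accessions <- unique(col_names[is_gsm])
  titles <- vapply(accessions, fetch_title, character(1))
  labels[is_gsm] <- titles[col_names[is_gsm]]

  # Columns matching neither pattern stay NA.
  unname(labels)
}
```

With the downloaded matrix loaded as `counts` (a hypothetical variable name), `label_columns(colnames(counts))` would return one label per cell. Querying each distinct accession once keeps the number of GEO requests small.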

AliciaSchep commented 5 years ago

Also, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE74310 provides a data set where the cell type is included in the column name, if that is helpful.

AliciaSchep commented 5 years ago

(And to download the files mentioned, scroll down the GEO page to "Supplementary file" at the bottom.)

znavidi commented 5 years ago

Hi again,

Thanks, Alicia. I will check them.

Best