10XGenomics / single-cell-3prime-paper


Cluster labels missing, and where to find annotations for 68K PBMC? #3

Open lynnyi opened 6 years ago

lynnyi commented 6 years ago

Hi,

I'm looking for annotations for the 68K PBMC dataset that corresponds to Fig 3 in Zheng et al.

I downloaded the k-means clustering labels for the 68K PBMCs from this site (https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.1.0/fresh_68k_pbmc_donor_a), but the 10-cluster .csv file only had cluster labels for 40,000 cells, not the full dataset.

Furthermore, I notice that the numbering scheme for these clusters does not match the numbering scheme for Figure 3 in Zheng et al., making it difficult to assign cell types to the cluster numbers. Could you also provide the annotations for all 68K cells?

Thank you, Lynn

gokceneraslan commented 6 years ago

It'd be really nice to save labels as tsv in this repo, I fully agree.

But for now, one option is to use the file in the scanpy tutorial here. The other option is to rerun this script in this repo, which generates cluster labels by correlating single-cell expression with purified samples.

lynnyi commented 6 years ago

Thanks! I'm guessing the scanpy labels are the result of a de novo clustering analysis different from 10x's, though, since the scanpy labels don't seem to match the 10x labels:

i.e. first 5 10x labels:
Barcode,Cluster
AAACATACACCCAA-1,2
AAACATACCCCTCA-1,3
AAACATACTAACCG-1,6
AAACATACTCTTCA-1,3
AAACATACTGTCTT-1,2

The 1st and 5th cells should be in the same cluster, but the first 5 scanpy labels are:
CD8+ Cytotoxic T
CD8+/CD45RA+ Naive Cytotoxic
CD4+/CD25 T Reg
CD19+ B
CD4+/CD25 T Reg

I'll take a look at the script and the solution that Magnus mentioned on twitter.
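Since the cluster numbers and the cell-type names use different schemes, one way to relate them is to cross-tabulate the two labelings per cell and look at the majority label in each cluster. The following is only a minimal sketch: it assumes each cell has already been paired with both a cluster number and a label, the function name `majority_label_per_cluster` is hypothetical, and the toy pairs and placeholder labels (`label_x` etc.) are invented for illustration, not taken from the real files.

```python
from collections import Counter, defaultdict


def majority_label_per_cluster(pairs):
    """Return {cluster_id: (most common label, its count)}.

    `pairs` is an iterable of (cluster_id, label) tuples, one per cell.
    """
    counts = defaultdict(Counter)
    for cluster_id, label in pairs:
        counts[cluster_id][label] += 1
    return {cid: ctr.most_common(1)[0] for cid, ctr in counts.items()}


# Toy input, invented for illustration only.
pairs = [
    ("2", "label_x"), ("2", "label_x"),
    ("3", "label_y"), ("3", "label_x"), ("3", "label_y"),
    ("6", "label_z"),
]
summary = majority_label_per_cluster(pairs)
for cluster_id in sorted(summary):
    label, n_cells = summary[cluster_id]
    print(cluster_id, label, n_cells)
```

A cluster whose majority label covers only a small share of its cells would suggest the two labelings do not correspond one-to-one.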

gokceneraslan commented 6 years ago

That's because the cell order is different:

image

Here is the full file with barcodes and labels as tsv: zheng17-cell-labels.txt
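Because the two files list cells in different orders, the labels have to be matched by barcode rather than by row position. Below is a minimal sketch of such a join using only the Python standard library. The layout assumed for the label file (two tab-separated columns, barcode then label, no header) is an assumption and should be checked against the actual file. The helper names are hypothetical, and the label rows in the toy data use placeholder labels that are not real annotations.

```python
import csv
import io


def read_pairs(handle, delimiter, has_header):
    """Read a two-column delimited file into a {barcode: value} dict."""
    reader = csv.reader(handle, delimiter=delimiter)
    if has_header:
        next(reader)  # skip the header row
    return {row[0]: row[1] for row in reader if row}


def join_by_barcode(clusters, labels):
    """Pair cluster and label for every barcode present in both inputs."""
    shared = clusters.keys() & labels.keys()
    return {bc: (clusters[bc], labels[bc]) for bc in sorted(shared)}


# First five rows of the 10x k-means file, as quoted earlier in this thread.
tenx_csv = (
    "Barcode,Cluster\n"
    "AAACATACACCCAA-1,2\n"
    "AAACATACCCCTCA-1,3\n"
    "AAACATACTAACCG-1,6\n"
    "AAACATACTCTTCA-1,3\n"
    "AAACATACTGTCTT-1,2\n"
)
# Toy label rows in a different order; the labels are placeholders.
labels_tsv = (
    "AAACATACTGTCTT-1\tlabel_x\n"
    "AAACATACACCCAA-1\tlabel_x\n"
    "AAACATACCCCTCA-1\tlabel_y\n"
)

clusters = read_pairs(io.StringIO(tenx_csv), ",", has_header=True)
labels = read_pairs(io.StringIO(labels_tsv), "\t", has_header=False)
joined = join_by_barcode(clusters, labels)
# Barcodes that have a cluster number but no label in the second file.
missing = sorted(clusters.keys() - labels.keys())
```

With real files, `io.StringIO(...)` would be replaced by `open(path, newline="")`. The `missing` list also gives a quick check on how many barcodes one file covers that the other does not.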

gokceneraslan commented 6 years ago

FYI, the barcode order in the scanpy file follows the barcode order in the http://cf.10xgenomics.com/samples/cell-exp/1.1.0/fresh_68k_pbmc_donor_a/fresh_68k_pbmc_donor_a_filtered_gene_bc_matrices.tar.gz file.

The labels are not from de novo clustering; they are based on correlation with 11 purified bulk samples, same as the R script.
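To illustrate the general idea of correlation-based labeling (this is not the R script itself), here is a simplified sketch: each cell receives the name of the reference profile it correlates with most strongly. Pearson correlation is used here as an assumption for brevity, and the script in the repo may use a different measure and preprocessing. The function names, the 4-gene profiles, and the cell and type names are all invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_x * var_y)


def assign_label(cell, references):
    """Return the reference name with the highest correlation to `cell`."""
    return max(references, key=lambda name: pearson(cell, references[name]))


# Toy average expression profiles over 4 genes (invented values).
references = {
    "type_A": [10.0, 0.0, 1.0, 0.0],
    "type_B": [0.0, 8.0, 0.0, 2.0],
}
# Toy single-cell expression vectors over the same 4 genes.
cells = {
    "cell_1": [5.0, 0.0, 1.0, 0.0],
    "cell_2": [0.0, 3.0, 0.0, 1.0],
}
assigned = {bc: assign_label(vec, references) for bc, vec in cells.items()}
print(assigned)
```

Labels produced this way depend on which reference populations are available, so a cell from a type without a purified reference would still be given the nearest available label.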

Khalid-Usman commented 5 years ago

@gokceneraslan I have different barcodes for PBMC 2700. Can you please share the file for it? Thanks

namratabhattacharya commented 3 years ago

@gokceneraslan Can you please help with cell type annotations of 3K PBMC? Kindly share the file for it.

zhiiiyang commented 3 years ago

@gokceneraslan, thank you for sharing the annotation for 68k PBMC. Is that ground truth, or manual annotation from unsupervised clustering?