bm2-lab / scMVP

MIT License

cell ground-truth labels #4

Open yueyueliu opened 2 years ago

yueyueliu commented 2 years ago

Almost none of the datasets include cell label files. Could you upload the ground-truth cell labels for the datasets? Thank you very much.

adamtongji commented 2 years ago

Cell labels for most datasets are included in files named "xxx_wnn_output.txt".

Cell labels for the 10x pbmc and share-seq skin datasets were missing from the original files and have now been uploaded to the folders as 10x_pbmc_annotation.txt and GSM4156597_skin_celltype.txt.

Ground-truth labels for the paired-seq cellline dataset are not provided in the NSMB paper. The four cell types in this dataset are easy to distinguish and annotate using the scRNA profile alone.

yueyueliu commented 2 years ago

Thank you very much for your answer. Your paper is worth studying.

poseidonchan commented 2 years ago

Hello,

I would like to know why the number of cells in 10x_pbmc_annotation.txt does not match the normalized mtx matrix. Can you re-upload the dataset?

adamtongji commented 2 years ago

@poseidonchan Hi, 10x_pbmc_annotation.txt is the annotation file from the 10x website, and some of its cells do not pass QC. You can use "10x_pbmc_wnn_output.txt" as the cell annotation input, or filter the annotation down to the cells listed in the "10x_pbmc_cell_barcode.txt" file.
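
The filtering option could be sketched roughly as follows. This is only an illustration, not code from the scMVP repository: it assumes the annotation file is tab-separated with the barcode in the first column and the label in the second, that the barcode file has one barcode per line, and that neither has a header. The helper names `read_two_column_tsv` and `filter_annotation` are made up for this example, and toy in-memory text stands in for the real files.

```python
import csv
import io


def read_two_column_tsv(handle):
    """Read (barcode, label) pairs from a headerless tab-separated file (assumed layout)."""
    return [(row[0], row[1]) for row in csv.reader(handle, delimiter="\t") if len(row) >= 2]


def filter_annotation(annotation_rows, qc_barcodes):
    """Keep only annotated cells whose barcode passed QC, in barcode-file order."""
    labels = dict(annotation_rows)
    return [(barcode, labels[barcode]) for barcode in qc_barcodes if barcode in labels]


# Toy stand-ins for 10x_pbmc_annotation.txt and 10x_pbmc_cell_barcode.txt.
# The barcodes, labels, and file layouts here are assumptions.
annotation_file = io.StringIO("AAAC-1\tT cell\nAAAG-1\tB cell\nAAAT-1\tMonocyte\n")
barcode_file = io.StringIO("AAAT-1\nAAAC-1\nCCCC-1\n")

annotation = read_two_column_tsv(annotation_file)
qc_barcodes = [line.strip() for line in barcode_file if line.strip()]
filtered = filter_annotation(annotation, qc_barcodes)

for barcode, label in filtered:
    print(barcode, label)
```

Keeping the barcode-file order means the filtered labels line up with the matrix columns, provided the matrix follows the same barcode order. With the real files, replace the `io.StringIO` objects with `open(...)` calls and adjust the column handling if the layout differs.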

poseidonchan commented 2 years ago

Hi,

Thanks a lot for your quick reply! Yes, after intersecting the cells from the two files I can use the annotation. It is still a bit odd, though: the intersection contains 9543 cells, while the barcode file has 10412 cells and the annotation file has 10032. Anyway, thanks again!

Have a good day!
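
For anyone checking the same mismatch, a small overlap report makes the counts easier to read. The sketch below uses synthetic barcodes built only to reproduce the sizes quoted above (10412, 10032 and 9543). It assumes barcodes are unique within each file, and the function name `overlap_report` is made up for this example.

```python
def overlap_report(barcodes, annotated):
    """Count cells shared by both lists and cells found in only one of them."""
    barcodes, annotated = set(barcodes), set(annotated)
    return {
        "shared": len(barcodes & annotated),
        "barcodes_only": len(barcodes - annotated),
        "annotation_only": len(annotated - barcodes),
    }


# Synthetic barcodes sized to match the counts in this thread (not real data).
shared = [f"shared{i}" for i in range(9543)]
barcode_list = shared + [f"qc_only{i}" for i in range(10412 - 9543)]
annotation_list = shared + [f"anno_only{i}" for i in range(10032 - 9543)]

report = overlap_report(barcode_list, annotation_list)
print(report)
```

Under that uniqueness assumption, the quoted counts mean 869 barcodes have no annotation and 489 annotated cells are missing from the barcode file.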