stemangiola / cellsig

GNU General Public License v3.0

Harmonise the datasets #64

Open stemangiola opened 2 years ago

stemangiola commented 2 years ago

We need to:

1) integrate the raw counts, 2) cluster the data, and 3) use the existing annotation to label each cluster.

Cell annotation

scSHC

XpelC commented 2 years ago
  1. integrate the raw counts to

The integrated data, grouped by dataset:

[image: Screen Shot 2022-08-24 at 3 12 19 PM]

The integrated data, grouped by sample:

[image]

The integrated data, split by dataset:

[image]

The integrated data, grouped by cell type (before relabelling):

[image]

stemangiola commented 2 years ago

https://github.com/igrabski/sc-SHC

HERE WE MEET AGAIN WITH A REPORT OF ALL CLUSTERING TO CHOOSE THE BEST


cluster | dataset_1 | dataset_2 | ...
1       | 200       | 500       | ...
12      | 0         | 1200      | ...

cluster | cell_type
1       | t_cell
1       | t_CD8_memory
1       | t_CD4_exhausted
2       | monocytes

For example:

Cluster 1 spans 5 datasets and contains the labels cd8_memory_t_cell, cd8_memory_T_cell, t_exhausted and cd8_resident. So you could label this cluster cd8_memory_t_cell_exhausted.

If a cluster contains both x_y_z and x_y, choose x_y_z: x_y is also present in another cluster, so the z flavour is what distinguishes this particular cluster.
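The two report tables and the naming rule above could be sketched in R roughly as follows. This is a minimal sketch, not code from this repository: the `cell_metadata` tibble and its values are made up for illustration, and it assumes one row per cell with `cluster`, `dataset` and `cell_type` columns.

```r
library(dplyr)
library(tidyr)

# Hypothetical per-cell metadata (toy values, not real results)
cell_metadata <- tibble(
  cluster   = c(1, 1, 1, 2, 2, 12),
  dataset   = c("dataset_1", "dataset_1", "dataset_2",
                "dataset_2", "dataset_2", "dataset_2"),
  cell_type = c("t_cell", "t_CD8_memory", "t_CD4_exhausted",
                "monocytes", "monocytes", "t_cell")
)

# Table 1: number of cells per cluster (rows) and dataset (columns)
cells_per_dataset <- cell_metadata %>%
  count(cluster, dataset) %>%
  pivot_wider(names_from = dataset, values_from = n, values_fill = 0L)

# Table 2: the distinct original labels found in each cluster
labels_per_cluster <- cell_metadata %>%
  distinct(cluster, cell_type) %>%
  arrange(cluster, cell_type)

# Labels that occur in exactly one cluster: candidates for the
# "flavour" that distinguishes that cluster from the others
distinguishing_labels <- labels_per_cluster %>%
  add_count(cell_type, name = "n_clusters") %>%
  filter(n_clusters == 1) %>%
  select(cluster, cell_type)
```

In this toy input `t_cell` appears in clusters 1 and 12, so it is not listed as distinguishing, whereas `t_CD8_memory` and `t_CD4_exhausted` are.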

XpelC commented 2 years ago
  • For each cluster, how many cell types
  1. UMAP result (cluster resolution 2, PCA 50 dimensions)

[image: Screen Shot 2022-09-11 at 5 14 04 PM]

   seurat_clusters cell_types
 1 0               CAFs,Cancer,Cancer/Epithelial,Endothelial,Epithelial cell,Epitheli…
 2 1               CD4 naive-cm,CD4 Trm,CD8 cytotoxic,CD8 Trm,CD8+ Tem,Doublets,Mast,…
 3 2               Basal/intermediate,Cancer,CE,Endothelial,Endothelial 1,Endothelial…
 4 3               Cancer,Cancer/Epithelial,CE,Doublets,Endothelial,Epithelial cell,E…
 5 4               Cancer,Cancer/Epithelial,CE,Endothelial,Epithelial cell,Epithelial…
 6 5               CD4 naive-cm,CD4 Trm,CD8 cytotoxic,CD8 Trm,CD8+ Tem,Mast,Monocytic…
 7 6               Basal/intermediate,BE,Cancer,Cancer/Epithelial,CE,Epithelial cell,…
 8 7               CAFs,Cancer,Cancer/Epithelial,CE,Doublets,Endothelial,Epithelial c…
 9 8               Basal/intermediate,Cancer,Cancer/Epithelial,CE,Doublets,Endothelia…
10 9               CD4 naive-cm,CD4 Trm,CD8 Trm,CD8+ Tem,Doublets,Epithelial cell,Epi…
11 10              Basal/intermediate,Cancer,Cancer/Epithelial,CE,Epithelial cell,Epi…
12 11              CAFs,Cancer,Cancer/Epithelial,Epithelial cell,Epithelial_cells,Fib…
13 12              BE,Cancer,Cancer/Epithelial,CE,Epithelial cell,Epithelial_cells,LE…
14 13              Apidocytes,CAFs,Cancer,Cancer/Epithelial,Endothelial,Epithelial ce…
15 14              CAFs,Cancer,Cancer/Epithelial,CE,Doublets,Endothelial,Epithelial c…
16 15              Basal/intermediate,Cancer,Cancer/Epithelial,CE,Epithelial cell,Epi…
17 16              DC,Endothelial,Mac1,Macrophage,Mast,Mono,Monocyte,Monocytes,Monocy…
18 17              Basal/intermediate,Cancer,Cancer/Epithelial,Epithelial cell,Epithe…
19 18              Basal/intermediate,Cancer,Cancer/Epithelial,CE,Epithelial cell,Epi…
20 19              Basal/intermediate,Cancer,Cancer/Epithelial,CE,Epithelial cell,Epi…
21 20              Cancer,Cancer/Epithelial,Epithelial cell,Epithelial_cells,LE-KLK3,…
22 21              Endothelial,Endothelial cell,Endothelial_cells,Luminal,Unassigned
23 22              CD8 Trm,CD8+ Tem,Epithelial cell,Epithelial_cells,LE-KLK4,Luminal,…
24 23              DC,Endothelial,Luminal,Macrophage,Mast,Monocyte,Monocytes/Macropha…
25 24              Cancer,Cancer/Epithelial,CE,Epithelial cell,Epithelial_cells,LE-KL…
26 25              Cancer,Cancer/Epithelial,Cancer/Epithelial Cycling,Epithelial cell…
27 26              Apidocytes,CAFs,Cancer,Endothelial,Fibroblast,Myofibroblast,PVL ce…
28 27              B cell,Cancer,Cancer/Epithelial,Epithelial cell,Epithelial_cells,L…
29 28              Cancer,Cancer/Epithelial,Epithelial cell,Epithelial_cells,LE-KLK3,…
30 29              Endothelial,Endothelial cell,Endothelial_cells,Epithelial_cells,Fi…
31 30              B cell,B-cells,Luminal,Mast,Monocyte,Monocytes/Macrophages,T
32 31              CD8+ Tem,Luminal,NK_cell,T,T cell,T-cells
33 32              Cancer,Cancer/Epithelial,Epithelial cell,Epithelial_cells,LE-KLK3,…
34 33              Endothelial,Macrophage,Monocyte,Monocytes/Macrophages,Monocytic,My…
35 34              T cell,T-cells
36 35              Cancer,Cancer/Epithelial,Epithelial cell,Epithelial_cells,Fibrobla…
37 36              Monocytes,Monocytes/Macrophages,Myeloid,Pre-B_cell_CD34-
38 37              Epithelial cell,Luminal,Monocytes/Macrophages
39 38              Cancer/Epithelial,Luminal
40 39              pDCs

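A table of this shape (one row per cluster, with the original labels collapsed into a comma-separated string) can be produced with a grouped summary. The sketch below is illustrative only: `cell_metadata` is a hypothetical per-cell tibble with toy values, standing in for the metadata of the integrated object.

```r
library(dplyr)

# Hypothetical per-cell metadata (toy values)
cell_metadata <- tibble(
  seurat_clusters = factor(c(0, 0, 0, 1, 1, 34, 34)),
  cell_type = c("Cancer", "CAFs", "Cancer",
                "CD8 Trm", "CD4 Trm", "T-cells", "T cell")
)

# One row per cluster; distinct labels sorted and pasted together.
# method = "radix" makes the sort order independent of the locale.
cell_types_per_cluster <- cell_metadata %>%
  group_by(seurat_clusters) %>%
  summarise(
    cell_types = paste(sort(unique(cell_type), method = "radix"), collapse = ","),
    .groups = "drop"
  )
```

Near-duplicate labels such as `T cell` and `T-cells` stay separate in this summary, which is what makes the need for label harmonisation visible.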
stemangiola commented 2 years ago

Much better!

would you be able to annotate this atlas? One cell type per cluster?
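One simple starting point for "one cell type per cluster" is a majority vote over the existing labels, keeping the fraction of cells that agree so that mixed clusters can be flagged for manual review. This is only a sketch under assumptions: the `cell_metadata` tibble is hypothetical toy data, and a majority label would still need checking against marker genes.

```r
library(dplyr)

# Hypothetical per-cell metadata with already harmonised labels (toy values)
cell_metadata <- tibble(
  seurat_clusters = c(0, 0, 0, 0, 1, 1, 1),
  cell_type = c("epithelial", "epithelial", "cancer", "epithelial",
                "T", "T", "NK")
)

# Most frequent label per cluster, with the fraction of cells carrying it
cluster_labels <- cell_metadata %>%
  count(seurat_clusters, cell_type, name = "n_cells") %>%
  group_by(seurat_clusters) %>%
  mutate(fraction = n_cells / sum(n_cells)) %>%
  slice_max(n_cells, n = 1, with_ties = FALSE) %>%
  ungroup()
```

A low `fraction` suggests the cluster mixes several cell types and may need a higher clustering resolution or manual inspection.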

XpelC commented 2 years ago

Hello Stefano,

I’ll try it tomorrow, and test if the samples within the dataset need integration by plotting umap.

Best wishes, Xinpu

stemangiola commented 2 years ago

I’ll try it tomorrow, and test if the samples within the dataset need integration by plotting umap.

Good. Remember to do things in parallel. One of these days you can present your work at the consortium meeting, if you wish.

XpelC commented 2 years ago
  1. UMAP result (cluster resolution 2, PCA 50 dimensions)

Try 100 PCs and plot the table.

stemangiola commented 2 years ago

@XpelC ,

This is a function to plot a heatmap of the markers of just one cluster (with balanced sampling of cells, to zoom in on that cluster).

# get_markers_one_vs_all() comes from the attached functions.R
gamma_delta_plot <-
    data_lymphoid %>%
    get_markers_one_vs_all(
        # the cluster of interest, compared against all the others
        curated_cell_type_pretty %in% c("gamma_delta"),
        cluster_col = "curated_cell_type_pretty",
        assay = "SCT",
        disp.min = 0, disp.max = 4, size = 3
    ) +
    viridis::scale_fill_viridis(option = "A")

Source the attached file first:

functions.R.zip

XpelC commented 2 years ago
  • For each cluster, how many cell types

(With breast cancer filtered)

[image]

A tibble: 45 × 2

   seurat_clusters cell_types
 1 0               cancer,epithelial,luminal,T
 2 1               cancer,CD8_Tem,endothelial,epithelial,luminal,NK,perivascular,smoo…
 3 2               B,cancer,epithelial,luminal,mast,T
 4 3               cancer,CD8_Tem,endothelial,epithelial,fibroblast,luminal,mast,T
 5 4               B,basal,basal_intermediate,cancer,CD4_naive,CD8_Tem,club_cell,endo…
 6 5               cancer,CD8_Tem,CD8_Trm,club_cell,epithelial,luminal,mast,NK,periva…
 7 6               basal,basal_intermediate,cancer,club_cell,endothelial,epithelial,f…
 8 7               dendritic,endothelial,luminal,mac,macrophage,mast,monocyte,myeloid…
 9 8               B,CD4_naive,CD8_Tem,CD8_Trm,epithelial,luminal,monocyte,NK,T
10 9               cancer,epithelial,luminal
11 10              cancer,epithelial,luminal
12 11              B,CD4_naive,CD4_Trm,CD8_cytotoxic,CD8_Tem,CD8_Trm,endothelial,mast…
13 12              cancer,CD8_Tem,endothelial,epithelial,luminal,mast,T
14 13              cancer,epithelial,luminal,T
15 14              basal,basal_intermediate,club_cell,endothelial,epithelial,fibrobla…
16 15              epithelial,luminal,perivascular,T
17 16              endothelial,fibroblast,mast
18 17              apidocytes,cancer_associated_fibroblast,endothelial,fibroblast,lum…
19 18              B,cancer,epithelial,luminal,plasma,T
20 19              cancer,epithelial,luminal
21 20              cancer,epithelial,luminal,NK,perivascular,sperm,T
22 21              cancer,epithelial,luminal
23 22              endothelial,mast
24 23              cancer,epithelial,luminal
25 24              epithelial,luminal,monocyte,sperm,T
26 25              apidocytes,cancer_associated_fibroblast,endothelial,fibroblast,myo…
27 26              endothelial,epithelial
28 27              epithelial,luminal,mast,perivascular,T
29 28              basal_intermediate,cancer,club_cell,epithelial,hillock,luminal,mas…
30 29              apidocytes,club_cell,endothelial,epithelial,fibroblast,monocyte,my…
31 30              cancer,epithelial,luminal
32 31              cancer,endothelial,epithelial,luminal,mast
33 32              epithelial,luminal,T
34 33              epithelial,luminal
35 34              luminal,mast
36 35              T
37 36              epithelial,mast
38 37              fibroblast
39 38              CD8_Tem,epithelial,monocyte,NK,T
40 39              epithelial,luminal
41 40              B
42 41              CD8_Trm,monocyte
43 42              mast
44 43              epithelial
45 44              luminal

stemangiola commented 2 years ago

Good, you can now use DoHeatmap from Seurat to annotate and check the clusters with their marker genes.
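A minimal sketch of that step is below. It runs on `pbmc_small`, the small example object that ships with SeuratObject, rather than on the atlas from this issue, and the choice of five markers per cluster ranked by adjusted p-value is an assumption made for illustration.

```r
library(Seurat)
library(dplyr)

# Small example object; stands in for the integrated atlas
data("pbmc_small")

# DoHeatmap() draws from the scaled data, so scale every gene first
pbmc_small <- ScaleData(pbmc_small, features = rownames(pbmc_small), verbose = FALSE)

# Markers of each cluster versus all the other cells
markers <- FindAllMarkers(pbmc_small, only.pos = TRUE, verbose = FALSE)

# Keep a handful of top markers per cluster (5 is an arbitrary choice)
top_markers <- markers %>%
  group_by(cluster) %>%
  slice_min(p_val_adj, n = 5, with_ties = FALSE) %>%
  ungroup()

# Heatmap of the marker genes, with cells grouped by cluster identity
heatmap_plot <- DoHeatmap(pbmc_small, features = unique(top_markers$gene))
```

Known marker genes for the expected cell types can be passed to `features` in the same way to check a proposed cluster label.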

XpelC commented 2 years ago

Do you want me to divide them into two groups first? (Epithelial and immune?)

stemangiola commented 2 years ago

Hello @XpelC, we need to proceed fast.

1) Please address the macro clusters: the epithelial labelling obviously includes the wrong microclusters. Please see and address this issue:

https://github.com/stemangiola/cellsig/issues/69#issuecomment-1274126773

2) we mention for the heatmap:

I expect points 1 and 2 to be done in one day. Please send me the fix for point 1 in the other GitHub issue first and wait for my confirmation; then you can proceed with the heatmap.