stemangiola / cellsig

GNU General Public License v3.0

Defining the clusters #69

Open XpelC opened 1 year ago

XpelC commented 1 year ago

Clean the dataset

cell type name formatting

Check the dataset:

Cluster the cells separately [NOT NEEDED ANYMORE]

cell type decision

https://satijalab.org/seurat/articles/pbmc3k_tutorial.html

Sanity check

double check:

XpelC commented 1 year ago
  • Divide the dataset into epithelial and non-epithelial cells. Cluster the two groups separately and try to define the cell type of each cluster. The cells are over-clustered with pc = 50, resolution = 2.
    1. epithelial, luminal, cancer: image
    2. other cells: Screen Shot 2022-09-13 at 9 11 40 PM
stemangiola commented 1 year ago

Hello Xinpu,

You should divide, and for each of the two subsets: predict variable genes, run PCA, integrate, define clusters.
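
A minimal sketch of the per-subset workflow listed above, assuming a Seurat v4-style SCT integration. The function name `cluster_subset`, the `sample` metadata column and the default parameter values are hypothetical and not taken from this thread. PCA is run on the integrated assay here, following the usual Seurat integration order, which may differ from the order in the message.

```r
# Sketch only: `obj` is assumed to be a Seurat object holding ONE subset
# (e.g. the epithelial cells); `sample_col` is a hypothetical metadata column.
# Seurat is only needed when the function is called, not when it is defined.
cluster_subset <- function(obj, sample_col = "sample",
                           n_pcs = 30, resolution = 0.8) {
  # 1. Split by sample; SCTransform finds variable genes per sample.
  samples <- Seurat::SplitObject(obj, split.by = sample_col)
  samples <- lapply(samples, Seurat::SCTransform, verbose = FALSE)

  # 2. Integrate the samples on shared variable features.
  features <- Seurat::SelectIntegrationFeatures(samples, nfeatures = 3000)
  samples  <- Seurat::PrepSCTIntegration(samples, anchor.features = features)
  anchors  <- Seurat::FindIntegrationAnchors(
    samples, normalization.method = "SCT", anchor.features = features
  )
  combined <- Seurat::IntegrateData(anchors, normalization.method = "SCT")

  # 3. PCA, neighbour graph, clusters and UMAP on the integrated assay.
  combined <- Seurat::RunPCA(combined, npcs = n_pcs, verbose = FALSE)
  combined <- Seurat::FindNeighbors(combined, dims = seq_len(n_pcs))
  combined <- Seurat::FindClusters(combined, resolution = resolution)
  Seurat::RunUMAP(combined, dims = seq_len(n_pcs))
}
```

The same function would be called once on the epithelial subset and once on the non-epithelial subset, so that variable genes are re-estimated within each subset rather than inherited from the full dataset.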

XpelC commented 1 year ago

Hello Stefano,

OK, now I know why the figure looks so similar to the previous one: I only reran PCA and the downstream steps. I'll redo the variable gene prediction step as well.

Best wishes, Xinpu

On Sep 13, 2022, at 9:39 PM, Stefano Mangiola wrote:

Hello Xinpu,

You should divide, and for each of the two subsets: predict variable genes, run PCA, integrate, define clusters.

XpelC commented 1 year ago

Good afternoon Stefano,

Sorry to report this, but I think my integrated data is corrupted and all the variable features were accidentally lost. I'm now reintegrating the samples to recover the data and will catch up as soon as possible.

Best, Xinpu

XpelC commented 1 year ago
  • Divide the dataset into epithelial and non-epithelial cells. Cluster the two groups separately and try to define the cell type of each cluster.

epithelial cell

image

    seurat_clusters cell_type n

  1  0 luminal 1378
  2  0 epithelial 1260
  3  0 cancer 1033
  4  0 myoepithelial 136
  5  0 basal 82
  6  0 club_cell 70
  7  0 fibroblast 59
  8  0 perivascular 56
  9  0 mast 53
 10  0 cancer_associated_fibroblast 43
 11  1 cancer 1149
 12  1 epithelial 1093
 13  1 luminal 726
 14  1 perivascular 269
 15  1 macrophage 240
 16  1 fibroblast 161
 17  1 cancer_associated_fibroblast 147
 18  1 club_cell 108
 19  1 basal_intermediate 35
 20  2 epithelial 1172
 21  2 luminal 1119
 22  2 cancer 598
 23  2 T 192
 24  2 cancer_associated_fibroblast 126
 25  2 perivascular 120
 26  2 myoepithelial 108
 27  2 club_cell 98
 28  2 basal 88
 29  2 smooth_muscle 80
 30  2 mast 59
 31  2 fibroblast 31
 32  3 fibroblast 835
 33  3 epithelial 608
 34  3 cancer 413
 35  3 cancer_associated_fibroblast 397
 36  3 perivascular 301
 37  3 luminal 298
 38  3 basal_intermediate 237
 39  3 mast 214
 40  3 smooth_muscle 103
 41  3 club_cell 91
 42  3 endothelial 67
 43  3 myoepithelial 57
 44  3 T 43
 45  4 luminal 1639
 46  4 epithelial 852
 47  4 cancer 504
 48  4 myoepithelial 332
 49  4 T 72
 50  4 mast 57
 51  4 cancer_associated_fibroblast 53
 52  4 basal_intermediate 40
 53  4 macrophage 37
 54  4 perivascular 36
 55  4 fibroblast 33
 56  5 luminal 1595
 57  5 epithelial 1327
 58  5 cancer 573
 59  5 club_cell 34
 60  5 myoepithelial 32
 61  6 luminal 1491
 62  6 epithelial 1100
 63  6 cancer 379
 64  6 club_cell 54
 65  6 myoepithelial 37
 66  7 luminal 1521
 67  7 epithelial 866
 68  7 cancer 529
 69  8 epithelial 1132
 70  8 luminal 821
 71  8 cancer 189
 72  8 perivascular 175
 73  8 macrophage 137
 74  8 fibroblast 123
 75  8 club_cell 94
 76  8 T 94
 77  8 cancer_associated_fibroblast 65
 78  8 doublets 53
 79  8 basal_intermediate 42
 80  9 epithelial 863
 81  9 luminal 737
 82  9 cancer 511
 83  9 basal_intermediate 222
 84  9 club_cell 92
 85  9 T 76
 86  9 mast 72
 87  9 fibroblast 55
 88  9 hillock 47
 89  9 myoepithelial 41
 90 10 luminal 1203
 91 10 cancer 756
 92 10 epithelial 506
 93 10 endothelial 77
 94 10 smooth_muscle 39
 95 11 epithelial 940
 96 11 luminal 786
 97 11 cancer 664
 98 11 myoepithelial 124
 99 11 basal 39
100 12 luminal 965
101 12 epithelial 830
102 12 cancer 316
103 12 perivascular 73
104 12 T 48
105 12 fibroblast 41
106 12 cancer_associated_fibroblast 39
107 12 macrophage 39
108 12 basal_intermediate 34
109 12 club_cell 33
110 13 luminal 1152
111 13 epithelial 670
112 13 cancer 475
113 14 luminal 1218
114 14 epithelial 443
115 14 cancer 324
116 14 club_cell 47
117 14 macrophage 47
118 14 perivascular 47
119 14 basal_intermediate 39
120 14 T 32
121 14 fibroblast 31
122 15 epithelial 876
123 15 luminal 714
124 15 cancer 583
125 15 myoepithelial 44
126 16 luminal 767
127 16 epithelial 440
128 16 cancer 221
129 16 cancer_associated_fibroblast 158
130 16 fibroblast 145
131 16 perivascular 96
132 16 macrophage 90
133 16 basal_intermediate 51
134 16 club_cell 49
135 16 T 38
136 17 epithelial 790
137 17 cancer 673
138 17 luminal 311
139 17 T_cycling 154
140 17 sperm 58
141 17 macrophage_cycling 42
142 18 luminal 960
143 18 epithelial 704
144 18 cancer 390
145 19 epithelial 774
146 19 cancer 470
147 19 luminal 319
148 19 sperm 176
149 19 club_cell 34
150 20 cancer 416
151 20 epithelial 360
152 20 luminal 331
153 20 smooth_muscle 235
154 20 myoepithelial 194
155 20 doublets 102
156 20 cancer_associated_fibroblast 41
157 21 luminal 954
158 21 epithelial 366
159 21 cancer 311
160 21 fibroblast 47
161 22 epithelial 682
162 22 luminal 495
163 22 cancer 405
164 23 luminal 701
165 23 epithelial 338
166 23 cancer 319
167 23 myoepithelial 47
168 23 fibroblast 45
169 24 luminal 859
170 24 epithelial 314
171 24 cancer 204
172 25 luminal 598
173 25 epithelial 509
174 25 club_cell 70
175 25 cancer 54
176 26 luminal 577
177 26 epithelial 277
178 26 cancer 106
179 27 epithelial 403
180 27 cancer 258
181 27 luminal 143
182 27 club_cell 48
183 27 endothelial 41
184 28 epithelial 307
185 28 luminal 215
186 28 cancer 128
187 28 endothelial 49
188 28 perivascular 48
189 28 fibroblast 34
190 29 luminal 372
191 29 epithelial 298
192 29 cancer 90
193 30 cancer 229
194 30 epithelial 178
195 30 luminal 124
196 30 T_cycling 53
197 31 B 131
198 31 epithelial 68
199 31 plasmablast 65
200 31 luminal 37
201 31 plasma 35
202 31 cancer 32
203 32 cancer 343
204 33 cancer 142
205 33 luminal 114
206 33 epithelial 39
207 34 cancer 120
208 35 luminal 32
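
One way to read a cluster-by-label count table like the one above is to compute, per cluster, the most frequent label and the fraction of cells carrying it. The sketch below does this in base R for clusters 32 to 35 only, with counts copied from the last rows of the table. The helper name `summarise_cluster` and the "purity" measure are illustrative assumptions, not something used in this thread, and the fraction is taken over the labels listed in the table only.

```r
# Counts copied from the last rows of the table (clusters 32 to 35).
counts <- data.frame(
  seurat_clusters = c(32, 33, 33, 33, 34, 35),
  cell_type = c("cancer", "cancer", "luminal", "epithelial", "cancer", "luminal"),
  n = c(343, 142, 114, 39, 120, 32)
)

# For one cluster: most frequent label and the share of cells carrying it.
summarise_cluster <- function(df) {
  top <- df[which.max(df$n), ]
  data.frame(
    seurat_clusters = top$seurat_clusters,
    majority_label  = top$cell_type,
    purity          = top$n / sum(df$n)
  )
}

# Apply the helper to every cluster and stack the results.
purity <- do.call(
  rbind,
  lapply(split(counts, counts$seurat_clusters), summarise_cluster)
)
rownames(purity) <- NULL
print(purity)
```

A cluster whose purity is close to 1 is easy to name; a cluster split between several labels (such as cluster 33 here) may point to over-clustering or to inconsistent labels across source datasets.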
stemangiola commented 1 year ago

use


    RunUMAP( dims = 1:30,  spread    = 0.5, min.dist  = 0.01, n.neighbors = 10)
XpelC commented 1 year ago

OK, I'm currently filtering the immune cell dataset to fix the integration error (two samples contain too few cells).

On Sep 14, 2022, at 11:09 PM, Stefano Mangiola wrote:

use

    RunUMAP( dims = 1:30,  spread    = 0.5, min.dist  = 0.01, n.neighbors = 10)

XpelC commented 1 year ago

use RunUMAP( dims = 1:30, spread = 0.5, min.dist = 0.01, n.neighbors = 10)

It doesn't make much difference.

epithelial

image

    seurat_clusters cell_types

 1  0 basal_intermediate,cancer,cancer_associated_fibroblast,club_cell,d…
 2  1 basal,basal_intermediate,cancer,cancer_associated_fibroblast,club_…
 3  2 basal_intermediate,cancer,cancer_associated_fibroblast,club_cell,e…
 4  3 basal_intermediate,cancer,cancer_associated_fibroblast,club_cell,d…
 5  4 basal,cancer,cancer_associated_fibroblast,club_cell,epithelial,lum…
 6  5 cancer,cancer_associated_fibroblast,epithelial,fibroblast,luminal,…
 7  6 basal_intermediate,cancer,cancer_associated_fibroblast,club_cell,e…
 8  7 basal_intermediate,cancer,cancer_associated_fibroblast,club_cell,e…
 9  8 cancer,epithelial,luminal,myoepithelial
10  9 cancer,endothelial,epithelial,luminal
11 10 basal,basal_intermediate,cancer,club_cell,epithelial,fibroblast,lu…
12 11 cancer,epithelial,luminal,myoepithelial
13 12 basal_intermediate,cancer,club_cell,epithelial,luminal,macrophage,…
14 13 cancer,epithelial,luminal
15 14 cancer,epithelial,luminal
16 15 basal,cancer,club_cell,epithelial,luminal,myoepithelial,perivascul…
17 16 cancer,epithelial,luminal
18 17 cancer,epithelial,luminal,macrophage_cycling,sperm,T_cycling
19 18 cancer,epithelial,luminal,perivascular
20 19 cancer,epithelial,luminal
21 20 cancer,epithelial,luminal,sperm
22 21 cancer,epithelial,fibroblast,luminal,perivascular
23 22 cancer,club_cell,epithelial,luminal,macrophage,perivascular
24 23 cancer,epithelial,luminal,myoepithelial
25 24 cancer,club_cell,endothelial,epithelial,fibroblast,luminal,perivas…
26 25 cancer,club_cell,endothelial,epithelial,luminal
27 26 cancer,epithelial,luminal
28 27 cancer,epithelial,luminal
29 28 cancer,epithelial,luminal
30 29 cancer,club_cell,epithelial,luminal
31 30 B,cancer,epithelial,luminal,plasma,plasmablast
32 31 cancer,epithelial,luminal,T_cycling
33 32 cancer,epithelial,luminal
34 33 cancer
35 34 cancer,epithelial,luminal
36 35 epithelial,luminal
37 36 cancer
38 37 luminal

other cell types

The immune clusters are clearer this time.

image

    seurat_clusters cell_types

 1  0 B,CD4_Trm,CD8_cytotoxic,CD8_Tem,CD8_Trm,doublets,endothelial,epith…
 2  1 B,CD4_naive,CD4_Trm,CD8_cytotoxic,CD8_Tem,CD8_Trm,endothelial,epit…
 3  2 apidocytes,cancer,cancer_associated_fibroblast,endothelial,epithel…
 4  3 B,CD4_naive,CD8_Tem,CD8_Trm,endothelial,epithelial,luminal,mast,mo…
 5  4 cancer,endothelial,epithelial,mast,monocyte
 6  5 cancer,CD8_Tem,endothelial,epithelial,macrophage,mast,monocyte,NK,…
 7  6 B,CD4_naive,CD8_Tem,endothelial,epithelial,luminal,mast,monocyte,T
 8  7 apidocytes,cancer,cancer_associated_fibroblast,endothelial,fibrobl…
 9  8 B,CD8_Tem,endothelial,epithelial,fibroblast,luminal,macrophage,mas…
10  9 basal_intermediate,cancer,club_cell,endothelial,epithelial,fibrobl…
11 10 apidocytes,cancer,cancer_associated_fibroblast,doublets,endothelia…
12 11 endothelial,epithelial,luminal,macrophage,mast,monocyte,sperm,T
13 12 endothelial,mac,macrophage,mast,monocyte,myeloid,T
14 13 endothelial,fibroblast
15 14 B,dendritic,macrophage,mast,monocyte,myeloid,T
16 15 cancer,cancer_associated_fibroblast,endothelial,epithelial,luminal…
17 16 CD8_Tem,NK,NK_CD16_pos,T
18 17 cancer,cancer_associated_fibroblast,endothelial,epithelial,fibrobl…
19 18 cancer,doublets,endothelial,unassigned
20 19 epithelial,mast
21 20 macrophage,monocyte,myeloid
22 21 T
23 22 B
24 23 macrophage,monocyte,myeloid,pre_B_cell
25 24 T
26 25 endothelial
27 26 cancer
28 27 dendritic
stemangiola commented 1 year ago

OK, try to define cluster identity by Friday, so we can meet and discuss. We are probably overclustering the epithelial cells.

stemangiola commented 1 year ago

FeaturePlot: Use the original UMAP (split by cell marker and by sample).

image

Barplot

image
XpelC commented 1 year ago
  • Check the scRNA sequencing method (10X, SMARTseq2)
Screen Shot 2022-09-18 at 3 16 22 PM
XpelC commented 1 year ago
  • Check which samples are from cancer patients and which are not (print a table)

Tumor: sample

 1 SLX-15732SIGAC4HTVNWBBXXs_6
 2 SLX-15736SIGAA9HTHM2BBXXs_4
 3 SLX-15736SIGAA9HTHM2BBXXs_5
 4 SLX-15736SIGAD8HTHM2BBXXs_4
 5 SLX-15736SIGAD8HTHM2BBXXs_5
 6 SLX-15929SIGAC4HVLYKBBXXs_8
 7 SLX-15929SIGAG11HVLYKBBXXs_8
 8 SLX-16140SIGAB1HVMFKBBXXs_3
 9 SLX-16142SIGAG6HVMFKBBXXs_4
10 SLX-16147SIGAE7HVMFKBBXXs_7
11 SLX-16148SIGAG7HVMFKBBXXs_8
12 SLX-16362SIGAA3HWFTVBBXXs_4
13 JD1800159SL
14 JD1800162SL
15 JD1800174SL
16 JD1800172SL
17 JD1800173SL
18 JD1800171SL
19 JD1800175SL
20 JD1800177SL
21 JD1800176SL
22 JD1800154SL
23 JD1800155SL
24 JD1800156SL
25 JD1800153SL
26 PR5249_T
27 PR5251_T
28 PR5254_T
29 PR5261_T
30 PC-P1
31 GSM4089151
32 GSM4089152
33 GSM4089153
34 GSM4089154
35 GSM4711414
36 GSM4711415

Normal: sample

 1 SLX-15732SIGAD4HTVNWBBXXs_6
 2 SLX-15736SIGAB9HTHM2BBXXs_4
 3 SLX-15736SIGAB9HTHM2BBXXs_5
 4 SLX-15736SIGAE8HTHM2BBXXs_4
 5 SLX-15736SIGAE8HTHM2BBXXs_5
 6 SLX-15929SIGAD4HVLYKBBXXs_8
 7 SLX-15929SIGAH11HVLYKBBXXs_8
 8 SLX-16140SIGAC1HVMFKBBXXs_3
 9 SLX-16142SIGAH6HVMFKBBXXs_4
10 SLX-16147SIGAF7HVMFKBBXXs_7
11 SLX-16148SIGAH7HVMFKBBXXs_8
12 SLX-16362SIGAB3HWFTVBBXXs_4
13 PR5249_N
14 PR5251_N
15 PR5254_N
16 PR5261_N
stemangiola commented 1 year ago
  • Check the scRNA sequencing method (10X, SMARTseq2)
Screen Shot 2022-09-18 at 3 16 22 PM

OK, let's start by keeping only the 10x data.

XpelC commented 1 year ago

Actually, do you want to wait to see the integrated result with the breast cancer data filtered out, and then decide whether we should take out the Seq-Well sequencing method? As far as I remember (I also checked with the cell names), the most problematic dataset, the one that gives weird cell positions, is not GSE176031.

On Sep 18, 2022, at 4:38 PM, Stefano Mangiola wrote:

[Screen Shot 2022-09-18 at 3 16 22 PM] https://user-images.githubusercontent.com/46272115/190886892-0eedfb37-06fb-43d0-9c43-e8718b33e6a4.png

OK, let's start by keeping only the 10x data.

stemangiola commented 1 year ago

@XpelC can we double-check that none of the 5 studies sorted certain cell types before sequencing (for example, trying to sort immune cells only)?

XpelC commented 1 year ago

@XpelC can we double-check that none of the 5 studies sorted certain cell types before sequencing (for example, trying to sort immune cells only)?

Please use this new version of the UMAP result (I changed the reference during the integration).

image Screen Shot 2022-09-21 at 10 34 21 AM

cell type for 5 datasets

sample.combined %>% count(dataset, cell_type) %>% print(n = Inf)

tidyseurat says: A data frame is returned for independent data analysis.

A tibble: 58 × 3

   dataset         cell_type                        n
 1 EGAS00001005115 B                                1
 2 EGAS00001005115 cancer                        4002
 3 EGAS00001005115 cancer_associated_fibroblast   422
 4 EGAS00001005115 endothelial                   1196
 5 EGAS00001005115 macrophage                     189
 6 EGAS00001005115 mast                            89
 7 EGAS00001005115 NK                              69
 8 EGAS00001005115 perivascular                   811
 9 EGAS00001005115 T                              676
10 EGAS00001005115 unassigned                     120
11 EGAS00001005787 B                              133
12 EGAS00001005787 basal                          429
13 EGAS00001005787 CD4_naive                      256
14 EGAS00001005787 CD4_Trm                        169
15 EGAS00001005787 CD8_cytotoxic                  154
16 EGAS00001005787 CD8_Trm                        484
17 EGAS00001005787 club_cell                     1259
18 EGAS00001005787 dendritic                       88
19 EGAS00001005787 endothelial                    439
20 EGAS00001005787 fibroblast                      90
21 EGAS00001005787 hillock                        172
22 EGAS00001005787 luminal                       6008
23 EGAS00001005787 mac                            128
24 EGAS00001005787 mac_cycling                     16
25 EGAS00001005787 mac_mt                          21
26 EGAS00001005787 mast                            37
27 EGAS00001005787 monocyte                        61
28 EGAS00001005787 NK                              72
29 EGAS00001005787 NK_CD16_neg                     51
30 EGAS00001005787 NK_CD16_pos                     73
31 EGAS00001005787 sperm                         1002
32 EGAS00001005787 T                             1751
33 EGAS00001005787 Treg                           114
34 GSE137829       B                              539
35 GSE137829       endothelial                    653
36 GSE137829       epithelial                   11732
37 GSE137829       fibroblast                    1565
38 GSE137829       mast                           945
39 GSE137829       myeloid                        873
40 GSE137829       myofibroblast                  450
41 GSE137829       T                             2293
42 GSE141445       basal_intermediate            1015
43 GSE141445       endothelial                   3833
44 GSE141445       fibroblast                    1051
45 GSE141445       luminal                      22139
46 GSE141445       mast                          1840
47 GSE141445       monocyte                      1260
48 GSE141445       T                             3933
49 GSE176031       apidocytes                     526
50 GSE176031       CD8_Tem                       2019
51 GSE176031       endothelial                   1194
52 GSE176031       epithelial                   12023
53 GSE176031       macrophage                     282
54 GSE176031       monocyte                      2101
55 GSE176031       NK                             336
56 GSE176031       plasma                         125
57 GSE176031       pre_B_cell                     113
58 GSE176031       smooth_muscle                  620
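
A rough way to screen for studies that may have sorted cells before sequencing is to compare the share of immune cells per dataset. The sketch below does this in base R for two of the five datasets, with counts copied from the table above. Which labels count as "immune" (`immune_labels`) is an assumption made here for illustration, and a skewed fraction is only a hint, not proof of sorting.

```r
# Counts copied from the table (GSE137829 and GSE141445 only).
counts <- data.frame(
  dataset = c(rep("GSE137829", 8), rep("GSE141445", 7)),
  cell_type = c(
    "B", "endothelial", "epithelial", "fibroblast",
    "mast", "myeloid", "myofibroblast", "T",
    "basal_intermediate", "endothelial", "fibroblast", "luminal",
    "mast", "monocyte", "T"
  ),
  n = c(539, 653, 11732, 1565, 945, 873, 450, 2293,
        1015, 3833, 1051, 22139, 1840, 1260, 3933)
)

# Assumption: these labels are treated as immune cell types.
immune_labels <- c("B", "T", "mast", "myeloid", "monocyte")

# Fraction of immune cells per dataset (named numeric vector).
immune_fraction <- vapply(
  split(counts, counts$dataset),
  function(d) sum(d$n[d$cell_type %in% immune_labels]) / sum(d$n),
  numeric(1)
)
print(round(immune_fraction, 3))
```

With the full table, the same calculation would cover all five datasets; the label list would need extending (for example with the NK, CD4 and CD8 subtypes).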
stemangiola commented 1 year ago

I changed the reference during the integration

Well done. You can even let Seurat choose the reference, and leave it running overnight.

Please use this new version of the UMAP result

Does this UMAP include epithelial + immune cells? Do you think we can annotate decently with this new version, and call it done?

XpelC commented 1 year ago

other_ cell

image

epithelial

image
stemangiola commented 1 year ago

Great

other_ cell

Are you able to annotate Immune clusters?

epithelial

Would you be able to color by

XpelC commented 1 year ago
  • FeaturePlot: Use the original umap (split by cell marker, and sample).
  • Use features and the markers:

The other features were not found in the data slot.

image

Use the code:

    FeaturePlot(sample.combined, features = c("CD14", "FCGR3A", "CD79A", "CD3G", "EPCAM", "VIM", "CD31", "CD68"), min.cutoff = 'q9')

stemangiola commented 1 year ago

The other features were not found in the data slot.

Always use the SCT assay for colouring cells, not the integrated assay.

If a feature is still not found, use the RNA assay.
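
The fallback rule above (SCT first, then RNA, never the integrated assay) can be sketched as a small base-R helper. `pick_assay` and the toy feature lists are hypothetical and for illustration only; with a real Seurat object, the per-assay gene names would have to be read from the object itself.

```r
# Return the first preferred assay that contains `feature`, or NA if none does.
# `assay_features` is a named list: assay name -> character vector of genes.
pick_assay <- function(feature, assay_features,
                       preference = c("SCT", "RNA")) {
  for (assay in preference) {
    if (feature %in% assay_features[[assay]]) {
      return(assay)
    }
  }
  NA_character_
}

# Toy gene lists (made up for this example, not taken from the dataset).
assay_features <- list(
  integrated = c("EPCAM", "VIM"),
  SCT        = c("EPCAM", "VIM", "CD68"),
  RNA        = c("EPCAM", "VIM", "CD68", "PLVAP")
)

pick_assay("CD68", assay_features)   # found in SCT
pick_assay("PLVAP", assay_features)  # only found in RNA
pick_assay("CD31", assay_features)   # not found in either assay
```

The integrated assay usually holds only the integration features, which would explain why many markers are missing from it.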

XpelC commented 1 year ago

Always use the SCT assay for colouring cells, not the integrated assay.

Using the 'SCT' assay, we found all of the features except CD31. Even with the 'RNA' assay, we could not find CD31 for endothelial cells.

image

I checked the positions of these coloured features, and they match our labels.

stemangiola commented 1 year ago

Using the 'SCT' assay, we found all of the features except CD31. Even with the 'RNA' assay, we could not find CD31 for endothelial cells.

CD31 might actually have a different gene name; please double-check on Google.
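
Expression matrices are normally indexed by gene symbol, while markers are often quoted by their CD (protein) name. As a sketch, a small alias table can translate the common cases before plotting: CD31 is encoded by PECAM1, CD16 by FCGR3A, CD45 by PTPRC and CD326 by EPCAM. The helper `to_gene_symbol` is hypothetical, and the table is deliberately tiny.

```r
# CD (protein) names and the gene symbols they are usually stored under.
cd_to_gene <- c(
  CD31  = "PECAM1",
  CD16  = "FCGR3A",
  CD45  = "PTPRC",
  CD326 = "EPCAM"
)

# Replace known aliases by gene symbols; leave everything else untouched.
to_gene_symbol <- function(markers, aliases = cd_to_gene) {
  hit <- markers %in% names(aliases)
  markers[hit] <- unname(aliases[markers[hit]])
  markers
}

to_gene_symbol(c("CD14", "CD31", "CD68"))
```

Names such as CD14 and CD68 are already gene symbols, so they pass through unchanged.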

stemangiola commented 1 year ago

Great, I think it all makes sense.

If you find the endothelial marker, CD31, you are ready to complete the annotation!

image
XpelC commented 1 year ago

If you find the endothelial marker, CD31, you are ready to complete the annotation!

Actually, I used PLVAP as a substitute for CD31 as an endothelial marker. The position is also correct. Do you think that's OK?

image
stemangiola commented 1 year ago

The position is also correct. Do you think that's OK?

We need to distinguish between fibroblasts and endothelial cells.

Get a better marker for fibroblasts.

Maybe

image
XpelC commented 1 year ago

Get a better marker for fibroblasts.

These are all fibroblast markers (VIM is the one we used before):

image

According to our cell type results, clusters 32, 14 and 18 are labeled as fibroblast. So maybe:

image

stemangiola commented 1 year ago

Congrats, I think you got it. Please produce the other images for the to-do list, and let's create a harmonised Seurat file with cluster annotation.

XpelC commented 1 year ago

OK. I'm still on my way back to my apartment, so see you tomorrow.

Best wishes, Xinpu

On Sep 26, 2022, at 6:15 PM, Stefano Mangiola wrote:

Congrats, I think you got it. Please produce the other images for the to-do list, and let's create a harmonised Seurat file with cluster annotation.

stemangiola commented 1 year ago

one_vs_all source @XpelC functions.R.zip

XpelC commented 1 year ago

one_vs_all source @XpelC functions.R.zip

image
stemangiola commented 1 year ago

one_vs_all source @XpelC functions.R.zip

image

Download the file and unzip it; it is an R script.

XpelC commented 1 year ago

function: ComputeMarkers

image
stemangiola commented 1 year ago

Forget about this function; just use the standard function from Seurat.
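
The standard Seurat function meant here is presumably `FindAllMarkers`, whose output has the columns `p_val`, `avg_log2FC`, `pct.1`, `pct.2`, `p_val_adj`, `cluster` and `gene`. As a sketch, the top markers per cluster can then be picked from that data frame in base R. The helper `top_markers` and the ranking by `avg_log2FC` are assumptions made for illustration; the example rows reuse a few values from the marker table in this thread.

```r
# With a real object, `markers` would come from something like
#   markers <- Seurat::FindAllMarkers(obj, only.pos = TRUE)
# Here a few rows stand in for that output.
markers <- data.frame(
  avg_log2FC = c(5.44, 5.36, 5.20, 6.78, 6.48),
  cluster    = factor(c(0, 0, 0, 1, 1)),
  gene       = c("TUBA4A", "FAM177A1", "CNOT6L", "C19orf48", "SMIM4")
)

# Keep the `n` genes with the highest avg_log2FC within each cluster.
top_markers <- function(markers, n = 10) {
  per_cluster <- lapply(split(markers, markers$cluster), function(d) {
    head(d[order(d$avg_log2FC, decreasing = TRUE), ], n)
  })
  out <- do.call(rbind, per_cluster)
  rownames(out) <- NULL
  out
}

top2 <- top_markers(markers, n = 2)
print(top2)
```

Ranking by adjusted p-value, or filtering on `pct.1` and `pct.2` first, would be equally reasonable choices depending on how specific the markers need to be.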

XpelC commented 1 year ago
  • Check the marker genes of each cluster with the following command

top10%>%print(n=Inf)

Please @XpelC use the below formatting for pasting tables and code

A tibble: 400 × 7
Groups:   cluster [40]

        p_val avg_log2FC pct.1 pct.2 p_val_adj cluster gene     
        <dbl>      <dbl> <dbl> <dbl>     <dbl> <fct>   <chr>    
  1 0              5.44  0.925 0.19  0         0       TUBA4A   
  2 0              5.36  0.924 0.293 0         0       FAM177A1 
  3 0              5.20  0.804 0.222 0         0       CNOT6L   
  4 8.30e-264      5.49  0.479 0.229 1.66e-260 0       GCHFR    
  5 1.21e-254      5.92  0.604 0.371 2.41e-251 0       H3F3B    
  6 6.18e-216      6.28  0.466 0.214 1.24e-212 0       EMB      
  7 1.42e-162      4.98  0.424 0.237 2.85e-159 0       CCSER2   
  8 6.70e-143      5.47  0.267 0.224 1.34e-139 0       GLIPR2   
  9 1.93e- 15      7.44  0.122 0.29  3.85e- 12 0       CTDNEP1  
 10 5.54e- 12      5.78  0.116 0.28  1.11e-  8 0       AZI2     
 11 0              6.78  0.875 0.383 0         1       C19orf48 
 12 0              6.48  0.723 0.377 0         1       SMIM4    
 13 0              5.56  0.889 0.326 0         1       EIF4EBP1 
 14 0              5.37  0.82  0.369 0         1       NME1     
 15 0              4.71  0.829 0.386 0         1       BNIP3    
 16 4.10e-251      4.57  0.186 0.294 8.20e-248 1       PLEKHB2  
 17 2.31e-137      5.04  0.483 0.301 4.62e-134 1       SDR39U1  
 18 2.09e-130      4.38  0.483 0.275 4.18e-127 1       NOB1     
 19 1.37e- 75      8.85  0.454 0.329 2.74e- 72 1       BAX      
 20 1.95e-  3      7.22  0.28  0.24  1   e+  0 1       ORC3     
 21 1.07e-266      6.19  0.626 0.337 2.14e-263 2       HOXA9    
 22 2.91e-249      3.59  0.577 0.338 5.82e-246 2       GTF3C5   
 23 2.94e- 65      5.16  0.421 0.295 5.89e- 62 2       ATPAF1   
 24 1.24e- 60      4.10  0.412 0.299 2.49e- 57 2       C1orf56  
 25 1.83e- 42      4.80  0.342 0.377 3.67e- 39 2       ERP29    
 26 3.10e- 31      9.60  0.451 0.405 6.21e- 28 2       HNRNPH1  
 27 3.32e- 17      3.83  0.362 0.287 6.64e- 14 2       IQCK     
 28 6.58e-  9      3.76  0.295 0.291 1.32e-  5 2       MED31    
 29 1.24e-  7      3.68  0.272 0.257 2.47e-  4 2       UBE2F    
 30 4.75e-  7      4.49  0.26  0.237 9.50e-  4 2       NUP54    
 31 1.84e- 49      7.71  0.172 0.313 3.69e- 46 3       SDCBP    
 32 3.11e- 40      5.92  0.366 0.261 6.22e- 37 3       JKAMP    
 33 3.22e- 36      7.22  0.304 0.271 6.44e- 33 3       PLEKHA3  
 34 1.36e- 30      5.87  0.354 0.234 2.72e- 27 3       RNF185   
 35 8.41e- 19      6.12  0.303 0.215 1.68e- 15 3       L3MBTL2  
 36 2.45e- 11      6.99  0.352 0.25  4.90e-  8 3       ARHGAP12 
 37 8.19e- 11      7.09  0.319 0.321 1.64e-  7 3       WDR6     
 38 1.21e- 10      7.54  0.307 0.242 2.42e-  7 3       COQ9     
 39 1.20e-  9      6.20  0.287 0.377 2.39e-  6 3       C8orf33  
 40 3.15e-  6      6.33  0.347 0.263 6.30e-  3 3       SHROOM1  
 41 0             10.7   0.692 0.35  0         4       TMEM123  
 42 0              8.22  0.886 0.209 0         4       AGR2     
 43 0              7.64  0.768 0.287 0         4       AZGP1    
 44 0              7.18  0.784 0.325 0         4       NEDD4L   
 45 0              7.07  0.843 0.291 0         4       LIMCH1   
 46 0              6.58  0.964 0.283 0         4       TSPAN1   
 47 0              6.49  0.977 0.391 0         4       CLDN4    
 48 3.41e-205      7.76  0.576 0.229 6.83e-202 4       RASEF    
 49 3.53e-  4      7.52  0.265 0.257 7.06e-  1 4       HEXB     
 50 2.58e-  3      8.11  0.338 0.271 1   e+  0 4       CDK10    
 51 4.77e- 58      6.37  0.262 0.31  9.55e- 55 5       ATAD3A   
 52 2.30e- 50      4.65  0.199 0.263 4.60e- 47 5       SNX1     
 53 4.06e- 44      3.34  0.45  0.297 8.11e- 41 5       MTFR1    
 54 7.04e- 17      3.25  0.438 0.328 1.41e- 13 5       MRPL42   
 55 7.21e- 10      3.35  0.322 0.288 1.44e-  6 5       C12orf10 
 56 1.02e-  7      3.69  0.404 0.455 2.05e-  4 5       NDUFB1   
 57 6.55e-  7      3.55  0.233 0.258 1.31e-  3 5       RPAP3    
 58 8.23e-  6      6.68  0.265 0.258 1.65e-  2 5       CCDC43   
 59 3.10e-  5      3.37  0.376 0.338 6.20e-  2 5       GEMIN7   
 60 5.48e-  4      3.68  0.275 0.33  1   e+  0 5       RBM42    
 61 1.27e-224      7.86  0.568 0.282 2.55e-221 6       EMC9     
 62 4.48e-122      9.24  0.571 0.374 8.96e-119 6       TRIM28   
 63 5.27e-114      6.21  0.549 0.386 1.05e-110 6       ARPC5L   
 64 1.18e-110      6.28  0.521 0.297 2.37e-107 6       SIX1     
 65 8.77e- 97      6.69  0.674 0.488 1.75e- 93 6       MIF      
 66 1.47e- 89      7.19  0.419 0.27  2.94e- 86 6       GRPEL1   
 67 3.47e- 71      8.16  0.607 0.478 6.95e- 68 6       ZFAS1    
 68 4.70e- 17      5.36  0.419 0.33  9.39e- 14 6       ARL6IP1  
 69 2.23e-  6      5.42  0.289 0.254 4.46e-  3 6       FAM114A1 
 70 7.02e-  3      6.84  0.305 0.273 1   e+  0 6       RCAN3    
 71 1.71e-253      1.80  0.254 0.223 3.42e-250 7       GLIPR1   
 72 2.75e-129      6.11  0.197 0.268 5.49e-126 7       ARMC6    
 73 4.79e- 88      2.90  0.525 0.352 9.58e- 85 7       PPP3CA   
 74 6.31e- 75      7.82  0.33  0.285 1.26e- 71 7       JAK1     
 75 7.72e- 28      1.96  0.425 0.298 1.54e- 24 7       MON1B    
 76 3.62e- 26      5.69  0.449 0.328 7.24e- 23 7       TOB1     
 77 3.55e- 20      2.22  0.476 0.403 7.10e- 17 7       VMP1     
 78 2.46e- 14      2.94  0.413 0.328 4.93e- 11 7       MCCC2    
 79 1.82e-  9      8.17  0.346 0.292 3.63e-  6 7       LETM1    
 80 1.74e-  7      3.21  0.179 0.27  3.48e-  4 7       TXNDC9   
 81 0              8.76  0.818 0.359 0         8       FDPS     
 82 0              8.50  0.957 0.386 0         8       PAFAH1B3 
 83 0              8.18  0.844 0.32  0         8       ACOT13   
 84 0              7.93  0.97  0.46  0         8       PDCD5    
 85 0              7.37  0.935 0.418 0         8       SLC25A4  
 86 4.39e-293      6.97  0.544 0.256 8.79e-290 8       PLRG1    
 87 5.87e-196      9.36  0.468 0.262 1.17e-192 8       INTS10   
 88 5.75e- 96      6.82  0.351 0.239 1.15e- 92 8       ZCCHC10  
 89 5.18e-  9      8.90  0.262 0.248 1.04e-  5 8       PPT1     
 90 5.67e-  3      7.62  0.366 0.274 1   e+  0 8       GSTA4    
 91 0              7.68  0.813 0.342 0         9       SPATS2L  
 92 0              5.67  0.802 0.348 0         9       IFT57    
 93 0              3.18  0.865 0.343 0         9       SNHG10   
 94 2.30e-122      3.38  0.506 0.289 4.61e-119 9       PAFAH1B2 
 95 1.73e- 60      2.90  0.264 0.33  3.47e- 57 9       DDX21    
 96 9.99e- 58      2.87  0.443 0.294 2.00e- 54 9       DHX9     
 97 5.75e- 27      3.17  0.269 0.303 1.15e- 23 9       PRKAR1A  
 98 3.12e- 16      6.04  0.292 0.259 6.23e- 13 9       SF3A3    
 99 3.12e- 13      3.37  0.33  0.288 6.24e- 10 9       PRMT2    
100 1.72e- 10      3.09  0.29  0.283 3.45e-  7 9       SH3GLB1  
101 0              8.02  0.683 0.261 0         10      ASAH1    
102 0              7.46  0.881 0.145 0         10      PYCARD   
103 0              7.41  0.122 0.52  0         10      MARCKSL1 
104 0              7.29  0.951 0.22  0         10      CYBA     
105 8.87e-188      8.32  0.106 0.256 1.77e-184 10      ATF1     
106 8.70e- 75      7.47  0.429 0.246 1.74e- 71 10      C12orf45 
107 4.38e- 67      7.35  0.54  0.438 8.75e- 64 10      TXN      
108 1.42e- 52      9.83  0.453 0.349 2.84e- 49 10      EZR      
109 1.38e-  8     10.0   0.429 0.37  2.76e-  5 10      PPA1     
110 2.71e-  3      9.14  0.303 0.248 1   e+  0 10      ZFP91    
111 0             10.7   0.783 0.303 0         11      HIST1H2BD
112 0              6.95  0.795 0.383 0         11      HIST1H2AC
113 0              5.94  0.88  0.465 0         11      HMGB1    
114 3.59e-157      6.19  0.557 0.3   7.19e-154 11      ITGAE    
115 4.24e- 93      6.05  0.496 0.34  8.48e- 90 11      C1orf122 
116 3.91e- 73      6.02  0.501 0.328 7.82e- 70 11      MAZ      
117 4.64e- 50      5.55  0.476 0.32  9.27e- 47 11      PIN4     
118 3.61e- 33      9.08  0.259 0.28  7.21e- 30 11      ATP6V1D  
119 1.80e- 24      8.74  0.196 0.263 3.59e- 21 11      LLPH     
120 2.01e- 15      8.05  0.331 0.309 4.02e- 12 11      SRPRB    
121 0             13.2   0.825 0.277 0         12      CORO1B   
122 0             11.2   0.741 0.294 0         12      ICA1     
123 0              8.93  0.756 0.207 0         12      TBC1D4   
124 3.09e-128      6.83  0.275 0.249 6.17e-125 12      ZC2HC1A  
125 3.83e-122      8.08  0.484 0.359 7.67e-119 12      SYNGR2   
126 2.04e- 74      7.02  0.282 0.268 4.08e- 71 12      APIP     
127 2.01e- 59      6.93  0.394 0.299 4.02e- 56 12      PMVK     
128 6.24e- 25      6.80  0.308 0.3   1.25e- 21 12      OGT      
129 1.13e- 23      8.16  0.331 0.283 2.26e- 20 12      TLK1     
130 1.51e-  7      7.01  0.218 0.256 3.03e-  4 12      C9orf85  
131 0             15.2   0.942 0.186 0         13      GATA2    
132 0             15.0   0.951 0.244 0         13      NSMCE1   
133 0             11.1   0.688 0.285 0         13      MLPH     
134 0              8.71  0.718 0.24  0         13      FDX1     
135 0              8.55  0.783 0.195 0         13      RAB27B   
136 0              8.39  0.799 0.256 0         13      NCOA4    
137 0              8.04  0.66  0.186 0         13      DTNBP1   
138 0              7.76  0.941 0.223 0         13      ID2      
139 7.28e- 63      7.55  0.307 0.242 1.46e- 59 13      HINT3    
140 3.94e-  3      9.51  0.205 0.254 1   e+  0 13      SDCCAG8  
141 0             10.2   0.964 0.222 0         14      SPON2    
142 0              8.23  0.988 0.218 0         14      TIMP1    
143 0              7.57  0.889 0.243 0         14      RAMP1    
144 0              7.30  0.774 0.26  0         14      ALDH1A3  
145 3.07e-208      9.71  0.648 0.35  6.15e-205 14      TCEAL4   
146 2.41e-127      9.27  0.442 0.233 4.82e-124 14      COQ10B   
147 1.95e- 73      7.75  0.525 0.304 3.89e- 70 14      CTSF     
148 5.55e- 65      7.13  0.393 0.257 1.11e- 61 14      DDAH1    
149 6.25e- 40     10.0   0.479 0.297 1.25e- 36 14      CHD9     
150 3.28e-  5      7.01  0.307 0.349 6.56e-  2 14      PNRC1    
151 0             10.5   0.876 0.328 0         15      HOMER2   
152 0              8.10  0.783 0.331 0         15      BIK      
153 0              8.06  0.991 0.42  0         15      PFN2     
154 0              7.46  0.924 0.296 0         15      FAM3B    
155 0              7.04  0.736 0.371 0         15      CD47     
156 0              6.88  0.798 0.251 0         15      PRAC2    
157 0              6.58  0.935 0.411 0         15      LY6E     
158 0              6.44  0.756 0.389 0         15      ADI1     
159 0              6.01  0.994 0.481 0         15      MDK      
160 2.80e- 98      6.23  0.399 0.226 5.61e- 95 15      RAE1     
161 0             10.6   0.884 0.477 0         16      ACTG1    
162 0              8.29  0.822 0.357 0         16      REXO2    
163 0              6.18  0.852 0.333 0         16      HMGA1    
164 3.28e-243      4.97  0.617 0.245 6.56e-240 16      LPAR6    
165 1.37e-138      4.59  0.548 0.253 2.74e-135 16      PBX1     
166 1.42e- 89      4.37  0.509 0.325 2.83e- 86 16      C1orf21  
167 6.58e- 40      4.50  0.31  0.24  1.32e- 36 16      TRAPPC2  
168 1.87e- 10      4.89  0.326 0.291 3.74e-  7 16      OFD1     
169 2.41e- 10      4.47  0.366 0.463 4.83e-  7 16      RCN2     
170 2.83e-  5      5.99  0.173 0.301 5.66e-  2 16      NAAA     
171 0              6.90  0.835 0.226 0         17      VAT1     
172 0              6.81  0.958 0.282 0         17      DUSP23   
173 0              6.43  0.996 0.34  0         17      NPDC1    
174 0              5.79  0.997 0.316 0         17      FKBP1A   
175 3.68e-212      4.28  0.481 0.251 7.36e-209 17      WIPI1    
176 1.28e- 82      5.21  0.589 0.222 2.56e- 79 17      EHD4     
177 1.39e- 60      6.18  0.53  0.386 2.79e- 57 17      CPE      
178 3.85e- 18      8.15  0.317 0.251 7.69e- 15 17      QRICH1   
179 5.78e- 18      6.78  0.253 0.257 1.16e- 14 17      CAMK1    
180 7.31e-  5      4.78  0.411 0.364 1.46e-  1 17      EIF4G1   
181 0              8.58  0.785 0.232 0         18      FAM13C   
182 0              7.30  0.786 0.347 0         18      ARID5B   
183 9.45e-162      7.83  0.373 0.272 1.89e-158 18      GABARAPL1
184 2.81e- 64      7.28  0.301 0.429 5.62e- 61 18      HNRNPAB  
185 1.39e- 27      7.96  0.235 0.342 2.78e- 24 18      FKBP3    
186 9.67e- 26     10.5   0.368 0.432 1.93e- 22 18      MRPL33   
187 1.57e- 23      8.50  0.166 0.301 3.13e- 20 18      CACUL1   
188 2.82e- 21      6.60  0.315 0.308 5.63e- 18 18      CCNG2    
189 2.72e-  8      6.71  0.445 0.324 5.45e-  5 18      CBX6     
190 2.14e-  3      7.66  0.275 0.256 1   e+  0 18      GOSR2    
191 0              5.49  0.911 0.247 0         19      MT1F     
192 0              4.93  0.908 0.281 0         19      MT1G     
193 0              4.66  0.989 0.326 0         19      MT1X     
194 0              4.01  0.994 0.29  0         19      MT1E     
195 1.93e-111      9.27  0.673 0.471 3.85e-108 19      H2AFY    
196 8.55e- 61      3.85  0.472 0.297 1.71e- 57 19      RHOD     
197 1.36e- 41     10.9   0.357 0.285 2.72e- 38 19      ANKRD10  
198 9.27e- 17      4.50  0.333 0.272 1.85e- 13 19      THYN1    
199 2.16e-  7      4.46  0.337 0.33  4.32e-  4 19      FAM133B  
200 6.37e-  3      9.45  0.309 0.274 1   e+  0 19      PTGR1    
201 0              8.06  0.914 0.322 0         20      DNAJB1   
202 0              7.59  0.898 0.366 0         20      HSPA8    
203 0              7.28  0.966 0.304 0         20      HSP90AA1 
204 0              6.42  0.682 0.157 0         20      APOBEC3G 
205 0              6.31  0.709 0.292 0         20      PPP1R2   
206 2.91e-212      7.37  0.564 0.262 5.83e-209 20      ELF1     
207 1.73e- 96      6.53  0.111 0.426 3.47e- 93 20      PGP      
208 9.55e- 52      7.59  0.449 0.253 1.91e- 48 20      ODF2L    
209 5.88e- 18      9.25  0.354 0.269 1.18e- 14 20      BUB3     
210 4.81e-  6      9.40  0.191 0.33  9.61e-  3 20      GNL3     
211 5.28e-256      4.46  0.645 0.296 1.06e-252 21      ARL2     
212 2.40e-206      5.88  0.52  0.235 4.81e-203 21      TAF13    
213 1.35e-112     10.4   0.486 0.29  2.69e-109 21      MKLN1    
214 2.58e- 92      5.16  0.202 0.342 5.17e- 89 21      CMTM8    
215 3.89e- 91      4.88  0.481 0.277 7.79e- 88 21      TNFRSF1A 
216 3.51e- 56      8.23  0.212 0.306 7.02e- 53 21      ZNF524   
217 2.14e- 50      6.39  0.409 0.239 4.27e- 47 21      ARPC1B   
218 3.79e- 50      7.70  0.357 0.266 7.58e- 47 21      ACTR10   
219 1.91e- 33      8.37  0.407 0.311 3.82e- 30 21      OAZ2     
220 4.08e- 19      4.60  0.342 0.259 8.15e- 16 21      HMGCL    
221 0              8.53  0.945 0.364 0         22      RPS27    
222 0              6.17  0.999 0.497 0         22      RPS19    
223 0              5.40  0.997 0.477 0         22      RPS18    
224 0              5.21  0.999 0.466 0         22      RPL13A   
225 0              4.52  0.996 0.483 0         22      RPSA     
226 2.41e-121      5.58  0.583 0.339 4.81e-118 22      TIMM50   
227 6.85e- 43      9.20  0.268 0.292 1.37e- 39 22      RUSC1    
228 3.20e- 35      6.77  0.212 0.291 6.40e- 32 22      CCDC12   
229 2.31e- 14      5.54  0.233 0.257 4.61e- 11 22      PTPN2    
230 6.89e- 11      4.85  0.265 0.297 1.38e-  7 22      PAFAH1B2 
231 1.24e-219      4.43  0.596 0.26  2.48e-216 23      RAB27A   
232 8.21e-169      6.43  0.341 0.229 1.64e-165 23      OSTF1    
233 1.35e-148      4.30  0.558 0.217 2.71e-145 23      TTC39C   
234 6.86e- 80      5.01  0.436 0.244 1.37e- 76 23      PPP6R2   
235 3.99e- 58      6.16  0.362 0.243 7.97e- 55 23      SF3A1    
236 2.92e- 34      6.75  0.429 0.356 5.84e- 31 23      GOLGB1   
237 8.29e- 26      4.13  0.22  0.302 1.66e- 22 23      MTCH2    
238 3.27e- 23      3.45  0.302 0.268 6.53e- 20 23      MRPS18C  
239 5.97e- 12      3.53  0.222 0.315 1.19e-  8 23      TMEM208  
240 6.40e-  9      5.43  0.338 0.365 1.28e-  5 23      MRPL18   
241 0              6.12  0.875 0.348 0         24      DNAJA1   
242 0              5.87  0.835 0.397 0         24      IER2     
243 0              5.47  0.973 0.331 0         24      FOS      
244 0              5.27  0.97  0.324 0         24      DUSP1    
245 0              5.25  0.978 0.371 0         24      JUN      
246 1.97e-202      6.03  0.532 0.225 3.94e-199 24      NDRG2    
247 2.20e-175      5.76  0.656 0.299 4.39e-172 24      SERTAD3  
248 1.18e- 58      7.20  0.428 0.31  2.36e- 55 24      PRPF38B  
249 2.36e-  7      5.31  0.322 0.289 4.73e-  4 24      IPO7     
250 1.06e-  4      5.26  0.384 0.343 2.13e-  1 24      PGK1     
251 2.00e- 78      8.39  0.442 0.252 4.00e- 75 25      TBC1D20  
252 5.43e- 33      8.24  0.369 0.262 1.09e- 29 25      SMG1     
253 5.41e- 32     11.0   0.463 0.322 1.08e- 28 25      ACADVL   
254 3.86e- 18      9.26  0.331 0.321 7.73e- 15 25      UBE2J1   
255 4.46e- 15      9.98  0.246 0.374 8.93e- 12 25      SLC25A39 
256 1.81e- 14      8.77  0.296 0.269 3.61e- 11 25      HBP1     
257 2.01e- 14     10.1   0.349 0.262 4.03e- 11 25      NDUFS1   
258 4.04e- 11      8.51  0.287 0.282 8.09e-  8 25      RABEP2   
259 5.97e-  9      9.47  0.431 0.285 1.19e-  5 25      ABCC4    
260 4.50e-  5      8.48  0.325 0.25  9.00e-  2 25      BRAT1    
261 0             11.7   0.887 0.358 0         26      CALM1    
262 0             10.9   0.919 0.256 0         26      GNAI2    
263 0             10.9   0.918 0.361 0         26      ARGLU1   
264 0             10.4   0.914 0.404 0         26      APLP2    
265 0              9.27  0.993 0.243 0         26      SLC9A3R2 
266 0              8.47  0.921 0.473 0         26      SRP14    
267 0              7.68  0.798 0.308 0         26      MYH9     
268 0              7.44  0.979 0.317 0         26      CCDC85B  
269 4.25e-187      8.39  0.647 0.254 8.49e-184 26      IFNGR1   
270 2.08e- 76      8.71  0.535 0.326 4.17e- 73 26      TM9SF2   
271 0             12.6   0.88  0.441 0         27      ATP6V0B  
272 0             11.5   0.887 0.267 0         27      HIF1A    
273 0             11.2   0.757 0.214 0         27      DNASE2   
274 0             10.3   0.999 0.208 0         27      CTSD     
275 0             10.1   0.903 0.204 0         27      FUCA1    
276 0              9.62  0.983 0.178 0         27      LGMN     
277 0              9.52  0.999 0.199 0         27      CTSB     
278 0              9.48  0.761 0.156 0         27      CYP27A1  
279 0              9.01  0.986 0.206 0         27      CREG1    
280 1.85e-299      9.21  0.734 0.258 3.70e-296 27      ELL2     
281 0              3.50  0.76  0.267 0         28      WDR74    
282 0              2.87  0.967 0.379 0         28      RPL10    
283 1.01e-292      2.85  0.657 0.277 2.02e-289 28      ZFP36L2  
284 6.00e-223      2.91  0.602 0.255 1.20e-219 28      TC2N     
285 1.87e-127      5.02  0.496 0.299 3.73e-124 28      PDCD4    
286 2.33e-113      2.81  0.396 0.236 4.66e-110 28      DPP4     
287 1.26e- 59      4.57  0.268 0.424 2.52e- 56 28      H2AFZ    
288 2.90e- 11      4.15  0.388 0.321 5.80e-  8 28      CMC1     
289 5.16e- 11      3.48  0.332 0.317 1.03e-  7 28      TSPYL1   
290 2.72e-  5      4.96  0.192 0.292 5.44e-  2 28      IFITM2   
291 0             13.7   0.941 0.447 0         29      GNAS     
292 0             10.0   0.924 0.403 0         29      MYL6B    
293 0              9.95  0.832 0.322 0         29      SERINC2  
294 0              9.92  0.881 0.37  0         29      ACTN4    
295 0              9.43  0.971 0.346 0         29      SPINT1   
296 0              9.18  0.889 0.262 0         29      MIPEP    
297 7.37e-308      9.52  0.837 0.395 1.47e-304 29      PRDX6    
298 2.08e-274     11.0   0.759 0.322 4.16e-271 29      VAMP8    
299 1.60e-273      9.54  0.81  0.342 3.20e-270 29      FLNB     
300 4.71e- 70      9.91  0.618 0.338 9.42e- 67 29      FDFT1    
301 0             14.7   0.975 0.378 0         30      SSR4     
302 0             12.3   0.975 0.282 0         30      FKBP11   
303 0              9.51  0.926 0.352 0         30      HERPUD1  
304 0              9.23  0.917 0.334 0         30      SEC11C   
305 0              8.54  0.86  0.372 0         30      XBP1     
306 6.70e-279      6.14  0.759 0.295 1.34e-275 30      SDF2L1   
307 8.17e-213      9.69  0.737 0.396 1.63e-209 30      HSP90B1  
308 6.62e- 80      6.53  0.618 0.386 1.32e- 76 30      PPIB     
309 3.80e- 20      6.22  0.188 0.254 7.59e- 17 30      ZNF692   
310 3.46e- 16      6.67  0.403 0.291 6.92e- 13 30      TP53INP1 
311 1.93e-285      5.63  0.701 0.236 3.86e-282 31      RHOG     
312 1.77e- 77      4.95  0.074 0.306 3.54e- 74 31      ARMC10   
313 5.26e- 43      5.86  0.41  0.25  1.05e- 39 31      SNX2     
314 5.07e- 24     10.6   0.291 0.422 1.01e- 20 31      DBI      
315 6.66e- 22     10.1   0.361 0.307 1.33e- 18 31      CERS4    
316 4.55e- 21      4.93  0.135 0.292 9.09e- 18 31      RDX      
317 1.11e- 17      5.00  0.259 0.267 2.21e- 14 31      KBTBD3   
318 3.25e-  9      6.96  0.217 0.278 6.49e-  6 31      NIP7     
319 1.86e-  5      4.63  0.283 0.26  3.71e-  2 31      MAGOHB   
320 4.23e-  3     11.1   0.477 0.41  1   e+  0 31      TPD52    
321 0             17.1   1     0.432 0         32      DSTN     
322 0             12.8   0.95  0.348 0         32      MGST3    
323 0             11.6   0.992 0.262 0         32      CSRP1    
324 0             10.9   0.997 0.225 0         32      CRYAB    
325 0             10.5   0.993 0.271 0         32      ADIRF    
326 0              9.11  0.999 0.341 0         32      CD151    
327 0              8.96  0.997 0.202 0         32      SOD3     
328 0              8.32  0.955 0.245 0         32      ILK      
329 0              8.26  0.986 0.319 0         32      LPP      
330 1.87e- 62     10.5   0.438 0.326 3.75e- 59 32      UAP1     
331 0             13.2   0.953 0.48  0         33      NUCKS1   
332 0              9.88  0.993 0.442 0         33      HMGN2    
333 0              9.71  0.942 0.424 0         33      H1FX     
334 0              9.48  0.924 0.402 0         33      UCP2     
335 4.76e-223     10.6   0.828 0.318 9.53e-220 33      IMMP1L   
336 2.38e-194      9.36  0.829 0.377 4.76e-191 33      KPNB1    
337 4.90e-174      9.99  0.745 0.263 9.79e-171 33      SNRNP40  
338 3.89e-142     10.1   0.688 0.242 7.78e-139 33      RPA2     
339 8.99e- 70      9.86  0.67  0.414 1.80e- 66 33      PKM      
340 9.26e- 57     12.2   0.618 0.319 1.85e- 53 33      NASP     
341 0             10.6   0.938 0.226 0         34      STXBP2   
342 0              8.30  0.96  0.276 0         34      LITAF    
343 0              8.08  0.972 0.275 0         34      NAMPT    
344 0              7.49  0.922 0.339 0         34      SLC25A37 
345 0              7.32  0.912 0.216 0         34      NINJ1    
346 3.16e-102      7.82  0.62  0.269 6.32e- 99 34      VPS37C   
347 3.91e- 59      7.71  0.127 0.355 7.83e- 56 34      FAM136A  
348 1.24e- 32      8.27  0.478 0.293 2.48e- 29 34      CAMKK2   
349 7.93e- 11      9.38  0.305 0.247 1.59e-  7 34      MDM2     
350 6.00e-  3     12.2   0.342 0.312 1   e+  0 34      OS9      
351 1.55e- 41      9.42  0.293 0.331 3.10e- 38 35      MVD      
352 1.95e- 19     12.0   0.591 0.392 3.91e- 16 35      DHRS7    
353 2.41e- 17      8.17  0.375 0.351 4.82e- 14 35      KIF22    
354 2.37e- 14     11.2   0.521 0.316 4.74e- 11 35      CHD3     
355 2.40e-  8      9.39  0.433 0.344 4.80e-  5 35      C8orf82  
356 1.20e-  7     11.0   0.539 0.413 2.41e-  4 35      CYB5A    
357 2.29e-  7      8.96  0.529 0.408 4.58e-  4 35      TMEM14C  
358 6.68e-  7      8.40  0.43  0.31  1.34e-  3 35      ERLEC1   
359 2.47e-  5      8.97  0.448 0.331 4.95e-  2 35      ARG2     
360 1.75e-  4      9.05  0.28  0.291 3.50e-  1 35      UROS     
361 5.25e-236      6.68  0.864 0.411 1.05e-232 36      CTNNB1   
362 9.77e-103      5.09  0.668 0.314 1.95e- 99 36      CDC42SE1 
363 6.97e- 83      5.20  0.838 0.513 1.39e- 79 36      UQCRQ    
364 2.39e- 27      5.82  0.535 0.334 4.78e- 24 36      BEX2     
365 4.37e- 17      5.41  0.438 0.298 8.74e- 14 36      STAG2    
366 1.91e- 14      5.54  0.556 0.405 3.81e- 11 36      VMP1     
367 9.27e- 10      6.45  0.402 0.251 1.85e-  6 36      RRAS2    
368 1.10e-  9      7.95  0.441 0.298 2.20e-  6 36      NFIX     
369 1.12e-  4      5.89  0.391 0.25  2.24e-  1 36      ORAI3    
370 4.90e-  3      5.96  0.391 0.339 1   e+  0 36      C9orf16  
371 8.53e- 76      8.00  0.915 0.472 1.71e- 72 37      EPCAM    
372 3.43e- 72      8.18  0.868 0.399 6.86e- 69 37      DHCR24   
373 2.46e- 56     10.3   0.743 0.244 4.92e- 53 37      TRAPPC12 
374 6.50e- 44      5.99  0.824 0.412 1.30e- 40 37      CALR     
375 1.28e- 26     10.4   0.629 0.321 2.56e- 23 37      KDM5B    
376 2.20e- 26      6.23  0.471 0.245 4.41e- 23 37      DYNLT3   
377 2.16e- 19     11.3   0.544 0.239 4.33e- 16 37      CHD1L    
378 3.45e- 17      8.23  0.529 0.329 6.91e- 14 37      CAST     
379 5.76e- 11     11.8   0.548 0.252 1.15e-  7 37      AP2A2    
380 2.70e-  6      6.68  0.397 0.302 5.39e-  3 37      TAPBP    
381 8.71e- 76      6.72  0.977 0.28  1.74e- 72 38      ISG15    
382 5.84e- 30      5.99  0.762 0.25  1.17e- 26 38      IFI35    
383 4.60e- 27      1.43  0.777 0.309 9.21e- 24 38      PLSCR1   
384 6.26e- 15      2.96  0.6   0.311 1.25e- 11 38      PSME2    
385 4.90e- 13      1.03  0.631 0.358 9.81e- 10 38      PSME1    
386 1.53e-  5      1.82  0.438 0.279 3.06e-  2 38      BIRC2    
387 7.12e-  5      0.901 0.392 0.266 1.42e-  1 38      XPNPEP1  
388 1.81e-  4      6.69  0.462 0.283 3.62e-  1 38      PDK2     
389 1.69e-  3      2.70  0.469 0.308 1   e+  0 38      C15orf61 
390 2.15e-  3      1.86  0.315 0.221 1   e+  0 38      MTMR14   
391 8.63e- 24      1.53  0.976 0.375 1.73e- 20 39      HSPB1    
392 9.40e- 21      4.24  0.976 0.415 1.88e- 17 39      CLDN4    
393 6.72e- 19      4.11  0.905 0.342 1.34e- 15 39      MT1X     
394 7.61e- 14      2.87  0.881 0.284 1.52e- 10 39      GSTO2    
395 1.16e- 13      5.22  0.81  0.261 2.32e- 10 39      AUH      
396 2.53e-  8      7.97  0.762 0.263 5.06e-  5 39      MT1F     
397 5.22e-  8      3.48  0.857 0.452 1.04e-  4 39      CLDN3    
398 8.76e-  8      4.67  0.905 0.493 1.75e-  4 39      KRT18    
399 1.22e-  7      1.65  0.714 0.317 2.45e-  4 39      ADH5     
400 2.52e-  7      2.88  0.643 0.243 5.04e-  4 39      TIMM23 
XpelC commented 1 year ago
  • color UMAP by per-cell mitochondrial content
image
XpelC commented 1 year ago
  • by sample
image
XpelC commented 1 year ago
  • 10x vs smart-seq
image
XpelC commented 1 year ago
  • total RNA
image
stemangiola commented 1 year ago

All looks good, you can proceed.

XpelC commented 1 year ago
  • Check the marker genes of each cluster with the following commands:
    pbmc.markers %>% group_by(cluster) %>% top_n(n = 10, wt = avg_log2FC) -> top10
    DoHeatmap(pbmc, features = top10$gene) + NoLegend()
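
As a dependency-free way to check what the per-cluster selection returns, here is a minimal base R sketch on a made-up marker table. The data frame, its values, and the helper name `top_n_per_cluster` are assumptions for illustration and are not part of the analysis above. Unlike `top_n()`, this version does not keep ties.

```r
# Toy stand-in for the FindAllMarkers() output (values are made up for illustration).
markers <- data.frame(
  cluster    = c(0, 0, 0, 1, 1, 1),
  gene       = c("A", "B", "C", "D", "E", "F"),
  avg_log2FC = c(2.0, 5.0, 3.0, 1.0, 4.0, 6.0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the n rows with the largest avg_log2FC in each cluster.
top_n_per_cluster <- function(markers, n = 10) {
  per_cluster <- lapply(split(markers, markers$cluster), function(df) {
    df <- df[order(df$avg_log2FC, decreasing = TRUE), , drop = FALSE]
    head(df, n)
  })
  out <- do.call(rbind, per_cluster)
  rownames(out) <- NULL
  out
}

top2 <- top_n_per_cluster(markers, n = 2)
print(top2)
```

The resulting `gene` column plays the role of `top10$gene` in the `DoHeatmap()` call.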

Will the graph look better if I merge similar clusters first and then find the variable features?

image
stemangiola commented 1 year ago

Will the graph look better if I merge similar clusters first and then find the variable features?

Export the plot as a PDF with a very large height and label each cluster; some of them will have the same identity. Each gene name should be visible and non-overlapping. You will have to spend some hours doing the annotation. Hopefully you will have finished by tomorrow.
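
One way to get a page tall enough for every gene label is to scale the PDF height with the number of gene rows. The sketch below uses base R graphics on a random toy matrix; the helper `heatmap_pdf_height`, the 0.15 inches per gene, and the output file name are assumptions chosen for illustration, not values from this thread.

```r
# Hypothetical helper: scale the page height (in inches) with the number of gene rows.
heatmap_pdf_height <- function(n_genes, inches_per_gene = 0.15, min_height = 8) {
  max(min_height, n_genes * inches_per_gene)
}

n_genes <- 400                           # e.g. 40 clusters x 10 markers
height  <- heatmap_pdf_height(n_genes)   # about 60 inches

# Random toy matrix standing in for the scaled expression values (genes x cells).
set.seed(1)
mat <- matrix(rnorm(n_genes * 20), nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

out_file <- file.path(tempdir(), "tall_heatmap.pdf")
pdf(out_file, width = 10, height = height)
par(mar = c(2, 6, 1, 1))
# image() expects z with dimensions length(x) by length(y), hence the transpose.
image(x = seq_len(ncol(mat)), y = seq_len(nrow(mat)), z = t(mat),
      axes = FALSE, xlab = "", ylab = "")
# One small horizontal label per gene row.
axis(2, at = seq_len(nrow(mat)), labels = rownames(mat), las = 1, cex.axis = 0.5)
invisible(dev.off())

# For a ggplot object `p` such as the DoHeatmap() result (not run here), the same
# height could be passed to ggsave; limitsize = FALSE is needed above 50 inches:
# ggplot2::ggsave("heatmap.pdf", plot = p, width = 10, height = height, limitsize = FALSE)
```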

XpelC commented 1 year ago

Export the plot as a PDF with a very large height and label each cluster; some of them will have the same identity. Each gene name should be visible and non-overlapping. You will have to spend some hours doing the annotation. Hopefully you will have finished by tomorrow.

So can I merge different clusters before finding markers?

stemangiola commented 1 year ago

Export the plot as a PDF with a very large height and label each cluster; some of them will have the same identity. Each gene name should be visible and non-overlapping. You will have to spend some hours doing the annotation. Hopefully you will have finished by tomorrow.

So can I merge different clusters before finding markers?

But what would you base the merge on? If you are confident about the cluster identities before merging, you can merge. But the heatmap is useful precisely for checking cluster identity.

XpelC commented 1 year ago
  • DoHeatmap(pbmc, features = top10$gene) + NoLegend()

There are too many features to display when there are 40 clusters, so I think dividing them into 4 groups might work. I will let it run overnight.

cluster 0-9 image

cluster 10-19 image

cluster 20-29 image

cluster 30-39 image

stemangiola commented 1 year ago

Were these subdivisions based on different macro clusters, or on something else? If they were based only on cluster order, this is not the right way to do it.

In this case you should:

  1. group the 40 clusters into bigger clusters based on the consensus identity of the original annotation (e.g. cluster 1 is T memory CD8 and cluster 6 is T memory CD4, so clusters 1 and 6 both get the label T memory)
  2. with far fewer clusters, whose rough identity you now know, calculate the markers

Before all that, could you please paste here a table with the best label for each cluster, assigned by you just by looking at the original annotation?
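
A table of this kind can be sketched as a majority vote of the original annotation within each cluster. The example below uses a made-up per-cell table; the column names `cluster` and `original_label`, the cell counts, and the coarse mapping are all assumptions for illustration.

```r
# Toy per-cell metadata: cluster id and the label from the original annotation
# (both columns are hypothetical stand-ins for the real metadata).
cells <- data.frame(
  cluster        = c(1, 1, 1, 6, 6, 6, 9, 9),
  original_label = c("t memory cd8", "t memory cd8", "t memory cd4",
                     "t memory cd4", "t memory cd4", "t memory cd8",
                     "epithelial", "epithelial"),
  stringsAsFactors = FALSE
)

# Cross-tabulate clusters against original labels, then take the majority label.
counts <- table(cells$cluster, cells$original_label)
best   <- colnames(counts)[apply(counts, 1, which.max)]
cluster_labels <- data.frame(
  cluster    = rownames(counts),
  best_label = best,
  fraction   = as.numeric(apply(counts, 1, max) / rowSums(counts)),
  stringsAsFactors = FALSE
)

# Collapse fine labels into a coarser consensus label (the mapping is illustrative).
coarse_map <- c("t memory cd8" = "t memory",
                "t memory cd4" = "t memory",
                "epithelial"   = "epithelial")
cluster_labels$consensus <- unname(coarse_map[cluster_labels$best_label])
print(cluster_labels)
```

The `fraction` column shows how dominant the winning label is, which helps flag clusters whose identity is ambiguous and should be checked by hand.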

Thanks.

XpelC commented 1 year ago

Were these subdivisions based on different macro clusters, or on something else? If they were based only on cluster order, this is not the right way to do it.

Only because 40 clusters have too many features to show in one clear heatmap, so I plotted them 10 clusters at a time. I'll send you the file with the best cluster labels.

stemangiola commented 1 year ago

Only because 40 clusters have too many features to show in one clear heatmap, so I plotted them 10 clusters at a time.

This is the wrong way to do it.

In this case you should

  1. group the 40 clusters into bigger clusters based on the consensus identity of the original annotation (e.g. cluster 1 is T memory CD8 and cluster 6 is T memory CD4, so clusters 1 and 6 both get the label T memory)
  2. with far fewer clusters, whose rough identity you now know, calculate the markers

Or divide the 40 clusters into 5 or 6 macroclusters (epithelial, T cells, B cells, fibroblasts, etc.) and make one heatmap for each.
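
The macrocluster split can be sketched as a lookup from cluster to macrocluster, followed by one gene list per macrocluster. The assignment, the gene names, and the object name `obj` below are placeholders for illustration; the Seurat calls are left as comments because they need the real object.

```r
# Hypothetical assignment of clusters to macroclusters (illustrative only).
macro_of_cluster <- c("0" = "epithelial", "1" = "t cell", "2" = "epithelial",
                      "3" = "fibroblast", "4" = "t cell")

# Toy top-marker table with the columns used in the thread (cluster, gene);
# gene names are placeholders.
top10 <- data.frame(
  cluster = c("0", "0", "1", "2", "3", "4", "4"),
  gene    = c("g1", "g2", "g3", "g4", "g5", "g6", "g3"),
  stringsAsFactors = FALSE
)

# For each macrocluster, collect its clusters and the unique marker genes to plot.
top10$macro <- unname(macro_of_cluster[top10$cluster])
heatmap_plan <- lapply(split(top10, top10$macro), function(df) {
  list(clusters = unique(df$cluster), genes = unique(df$gene))
})
str(heatmap_plan)

# With a Seurat object `obj` (not run here), each entry could drive one heatmap:
# for (m in names(heatmap_plan)) {
#   p <- DoHeatmap(subset(obj, idents = heatmap_plan[[m]]$clusters),
#                  features = heatmap_plan[[m]]$genes) + NoLegend()
# }
```

Grouping by identity rather than by cluster number keeps related clusters side by side, so that shared and distinguishing markers are visible in the same heatmap.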

XpelC commented 1 year ago

Or divide the 40 clusters into 5 or 6 macroclusters (epithelial, T cells, B cells, fibroblasts, etc.) and make one heatmap for each.

T cell

image

epithelial

image

endothelial

image

fibroblast

image

other cells (B, monocyte, mast, adipocyte)

image
stemangiola commented 1 year ago

Amazing, the only thing left is to give the specific cluster identities. Tomorrow, after you have done that, we should meet.

I suggest using SingleR on your clusters for a final confirmation.
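
A cluster-level SingleR check could look roughly like the sketch below. It assumes the Bioconductor packages SingleR and celldex are installed, that `expr` is a log-normalised genes x cells matrix, and that `clusters` holds one cluster id per cell. The wrapper name `annotate_clusters` and the choice of the Human Primary Cell Atlas reference are assumptions; another reference may suit this tissue better.

```r
# Sketch only: requires the Bioconductor packages SingleR and celldex.
# `expr` is assumed to be a log-normalised genes x cells matrix and
# `clusters` a vector with one cluster id per cell (per column of `expr`).
annotate_clusters <- function(expr, clusters) {
  stopifnot(ncol(expr) == length(clusters))
  if (!requireNamespace("SingleR", quietly = TRUE) ||
      !requireNamespace("celldex", quietly = TRUE)) {
    stop("Please install SingleR and celldex from Bioconductor first.")
  }
  # The choice of reference is an assumption; pick one that matches the tissue.
  ref  <- celldex::HumanPrimaryCellAtlasData()
  # Passing `clusters` makes SingleR annotate cluster-level profiles, not single cells.
  pred <- SingleR::SingleR(test = expr, ref = ref,
                           labels = ref$label.main, clusters = clusters)
  # One row per cluster: predicted label next to the cluster id.
  data.frame(cluster = rownames(pred), singler_label = pred$labels,
             stringsAsFactors = FALSE)
}
```

The returned table can be placed next to the manual labels; disagreements point to the clusters that deserve a second look.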