Open swiri021 opened 3 years ago
@swiri021 I don't think RR can be considered a middle stage. As far as I know, in the clinic, CIS just means the patient has experienced neurodegeneration for the first time. If the patient has another attack, it is considered the RR stage. After the CIS stage, a patient may have more attacks or none at all. That's why I think the data is good for studying disease mechanisms but may not be good for diagnostic biomarkers.
Wow... it's very interesting... Is the state a single gene or a gene set? Is there a list of the genes?
The activation score is calculated from gene sets (gene signatures), while the DEG model uses a single list of DEGs, as you know. So here the DEG model uses one gene as one feature, and the activation score model uses one pathway score as one feature.
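To make the difference between the two feature types concrete, here is a minimal sketch. The thread does not say how the activation score is actually computed, so the mean z-score used below is only a stand-in, and the names `zscore_rows` and `pathway_scores` and the dictionary layout are assumptions for illustration.

```python
from statistics import mean, pstdev

def zscore_rows(expr):
    """Z-score each gene across samples.

    expr maps gene -> list of per-sample expression values (hypothetical layout).
    Genes with zero variance get a z-score of 0 for every sample.
    """
    out = {}
    for gene, values in expr.items():
        mu = mean(values)
        sd = pstdev(values)
        out[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return out

def pathway_scores(expr, gene_sets):
    """Collapse gene-level features into one score per gene set and sample.

    The score is the mean z-score of the set's genes (an assumed stand-in,
    not the activation score used in this project).
    """
    z = zscore_rows(expr)
    n_samples = len(next(iter(expr.values())))
    scores = {}
    for name, genes in gene_sets.items():
        present = [g for g in genes if g in z]
        if not present:
            continue  # skip gene sets with no measured genes
        scores[name] = [mean(z[g][i] for g in present) for i in range(n_samples)]
    return scores
```

In the DEG model each key of `expr` would be one feature, while in the activation score model each key of the `pathway_scores` result would be one feature, so the second model usually has far fewer features.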
I think this may be due to the number of features. I lowered the DEG fold-change threshold to a log2 fold change of 0.58 (1.5-fold), and performance is better than with the 2-fold threshold. Still, the activation score model shows a narrower AUC gap between the validation set and the test set. Anyway, I am going to dig deeper into the pathway features.
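As a rough illustration of the two quantities discussed here, the sketch below filters a DEG table by a fold-change cutoff expressed on the log2 scale (log2(1.5) is about 0.585) and computes an AUC from labels and scores. The function names, the `(log2 fold change, adjusted p-value)` layout, and the 0.05 adjusted p-value cutoff are assumptions, not settings taken from this project.

```python
import math

def select_degs(results, min_fold=1.5, max_padj=0.05):
    """Keep genes whose absolute log2 fold change reaches log2(min_fold).

    results maps gene -> (log2_fold_change, adjusted_p_value); this layout
    and the max_padj default are assumptions for illustration.
    """
    cutoff = math.log2(min_fold)
    return [
        gene
        for gene, (lfc, padj) in results.items()
        if abs(lfc) >= cutoff and padj <= max_padj
    ]

def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. labels are 0 or 1.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Comparing `roc_auc` on the validation set with `roc_auc` on the test set for each model gives the gap mentioned above. A looser fold-change cutoff passes more genes through `select_degs`, which changes the number of features the DEG model sees.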
So, the activation score is based on pathways, and the gene set is from DEG (DESeq2 results)? Do you have a list of pathways and/or gene sets? I'm curious what pathways/genes are included.
Yes, we have a list of pathways, and the activation score was calculated by using MSigDB. Additionally, I will let you know if we have more interesting points here.
@kicheolkim @lacuss I got a weird signal in the data: Notebook link. That signal is related to the 'Sex' of the patients and, unfortunately, it is also significantly associated with the RR and CIS categories... maybe another source of noise?
So… I looked up MS on the Mayo Clinic website. It says: “Sex. Women are more than two to three times as likely as men are to have relapsing-remitting MS.” Maybe this is the reason? I think we should dig more into the correlation.
Yeah, I have seen a similar review paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3707353/ , but we still can't say that RR and CIS are related to the sex factor, because of the number of cases. Any thoughts? @kicheolkim do we need to go further with the gender information? The top pathways for that clustering are:
- RUNNE_GENDER_EFFECT_UP
- PYEON_CANCER_HEAD_AND_NECK_VS_CERVICAL_DN
- GSE5099_DAY3_VS_DAY7_MCSF_TREATED_MACROPHAGE_DN
- GSE3982_MEMORY_CD4_TCELL_VS_BCELL_UP
- GOMF_HISTONE_DEMETHYLASE_ACTIVITY_H3_K4_SPECIFIC
These genes are the outliers that drive the clustering of males and females. When these genes are removed from the list, the clustering by sex disappears completely. Let me know if these genes are interesting or need more investigation. (Sorry for not converting the Entrez IDs.)
EntrezID | pval | fc |
---|---|---|
6192 | 3.013055e-25 | 5.829828 |
8284 | 3.013055e-25 | 5.683377 |
5616 | 3.013055e-25 | 5.566646 |
8653 | 3.013055e-25 | 5.436171 |
8287 | 3.013055e-25 | 5.414928 |
246126 | 3.013055e-25 | 5.140104 |
7404 | 3.013055e-25 | 4.902533 |
7544 | 3.013055e-25 | 3.396781 |
9086 | 3.013055e-25 | 2.624308 |
9087 | 3.013055e-25 | 1.204376 |
input | name | symbol | alias (first 5) | HGNC |
---|---|---|---|---|
8653 | DEAD-box helicase 3 Y-linked | DDX3Y | DBY | HGNC:2699 |
9086 | eukaryotic translation initiation factor 1A Y-linked | EIF1AY | eIF-4C | HGNC:3252 |
8284 | lysine demethylase 5D | KDM5D | HY, HYA, JARID1D, SMCY | HGNC:11115 |
5616 | protein kinase Y-linked (pseudogene) | PRKY | PRKXP3, PRKYP | HGNC:9444 |
6192 | ribosomal protein S4 Y-linked 1 | RPS4Y1 | RPS4Y, S4 | HGNC:10425 |
9087 | thymosin beta 4 Y-linked | TMSB4Y | TB4Y | HGNC:11882 |
246126 | taxilin gamma pseudogene, Y-linked | TXLNGY | CYorf15A, CYorf15B, TXLNG2P | HGNC:18473 |
8287 | ubiquitin specific peptidase 9 Y-linked | USP9Y | DFFRY, SPGFY2 | HGNC:12633 |
7404 | ubiquitously transcribed tetratricopeptide repeat containing, Y-linked | UTY | KDM6AL, KDM6C, UTY1 | HGNC:12638 |
7544 | zinc finger protein Y-linked | ZFY | ZNF911 | HGNC:1 |
As expected, all of the genes are Y-linked.
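Removing these genes before clustering can be sketched as follows. The Entrez IDs are the ten listed in the tables above. The function name, the dictionary layout, and the toy values (including the arbitrary non-Y-linked ID 1017) are assumptions for illustration only.

```python
# Entrez IDs of the Y-linked genes listed in the tables above.
Y_LINKED_ENTREZ = {6192, 8284, 5616, 8653, 8287, 246126, 7404, 7544, 9086, 9087}

def drop_genes(expr, excluded=Y_LINKED_ENTREZ):
    """Return a copy of expr without the excluded genes.

    expr maps Entrez ID (int) -> list of per-sample expression values
    (a hypothetical layout, not the project's actual data structure).
    """
    return {gene: values for gene, values in expr.items() if gene not in excluded}

# Toy matrix with made-up values: two Y-linked genes and one other gene.
toy = {
    6192: [9.1, 0.2, 8.7],
    8653: [7.5, 0.1, 7.9],
    1017: [5.0, 5.2, 4.9],
}
filtered = drop_genes(toy)
print(sorted(filtered))
```

Dropping the genes removes the sex-driven clustering but does not correct for sex in the remaining genes, so it complements rather than replaces using sex as a covariate.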
Sorry for the late reply. I was busy becoming a father this week :) MS is more prevalent in women, and immune cells are strongly affected by gender. I used gender and age as covariates in my analysis.
@kicheolkim It seems CIS and RR differ significantly in DiseaseDuration (notebook link), so if we find features that are relevant to early and late markers by using DiseaseDuration, those might be differentially expressed genes between RR and CIS. Do you think CIS and RR really correspond to the early and late stages of MS? For example, CIS might really be the early stage of MS and RR the middle stage.
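One way to check a DiseaseDuration difference between two groups without distributional assumptions is a permutation test. The sketch below is not the test used in the linked notebook (the thread does not say which one was used); the function name, the choice of the mean as the statistic, and the defaults are assumptions.

```python
import random
from statistics import mean

def permutation_pvalue(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in group means.

    group_a and group_b are lists of durations (e.g. CIS and RR patients).
    Returns the share of label shuffles whose absolute mean difference is
    at least as large as the observed one, with a +1 correction.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

A small p-value would only say the two groups differ in duration. It would not by itself show that CIS and RR are ordered stages, which is the clinical question raised here.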