Open lishensuo opened 1 year ago
I read further into your ‘DrugScore.R’ code. I don't think it is reasonable to directly take the overlap of genes across all cluster DEGs. Some cell types could show little transcriptomic change between disease and healthy status. That would give 0 overlapping genes, which could lead to a NaN value for Drug.therapeutic.score.
Hi, thanks for the detailed review. The cluster ratio is multiplied by the negative log10-transformed FDR, not summed with it. The resulting values are then summed per drug. They are just intermediate temporary values for calculating the drug score. The final formula for the drug score is at the end of this code, and you will find that all values mentioned in the paper go into the calculation. The multi-step calculation with intermediate temporary values is there to save computational resources. As for the gene overlap step, it does not overlap genes over all clusters, only over the selected clusters that may be correlated with the disease. If a cluster shows no change between disease and healthy status, that implies the cluster may not contribute to the disease, and I would suggest not including it in the calculation of the drug score. Please let me know if you have further questions here, or send me an email at hebinghb@gmail.com. Thanks!
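The multiply-then-sum step described in this reply can be sketched on toy data. This is only an illustration, not the package's code: the `Drug` and `w.size` names come from the `tapply` line quoted in this thread, while the `Cluster`, `cluster.ratio` and `FDR` columns and all values are assumptions made up for the example.

```r
# Toy table: one row per (drug, cluster) pair. All values are invented.
Drug.list <- data.frame(
  Drug          = c("drugA", "drugA", "drugB"),
  Cluster       = c("C1", "C2", "C1"),
  cluster.ratio = c(0.5, 0.25, 0.5),    # hypothetical: fraction of cells in the cluster
  FDR           = c(0.01, 0.1, 0.001),  # hypothetical: drug FDR within that cluster
  stringsAsFactors = FALSE
)

# Intermediate per-cluster weight: cluster ratio multiplied by -log10(FDR)
Drug.list$w.size <- Drug.list$cluster.ratio * -log10(Drug.list$FDR)

# Sum the per-cluster weights for each drug (same pattern as the quoted line 68)
# drugA: 0.5 * 2 + 0.25 * 1 = 1.25; drugB: 0.5 * 3 = 1.5
Drug.coverage <- tapply(Drug.list$w.size, Drug.list$Drug, sum)
print(Drug.coverage)
```

In this sketch the product is taken per row first and only then summed per drug, which matches the reply's statement that the ratio and the transformed FDR are multiplied, not summed.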
Thank you for your response. But I still think there is a problem in your code, which seems inconsistent with the formula in your article. As I said above, you select the overlapping genes rather than each cluster's DEGs. When a cluster has no significant DEGs, the set of intersected genes is empty, which leads to a NaN value for Drug.therapeutic.score. For example, the following is slightly modified from the code in README.md (step 5):
## (1) The normal case: running your code when every cluster has many DEGs
Drug.score<-DrugScore(SC.integrated=SC.integrated,
Gene.data=Gene.list,
Cell.type=NULL,
Drug.data=Drug.ident.res,
FDA.drug.only=TRUE,
Case=Case,
Tissue="breast",
GSE92742.gctx=GSE92742.gctx.path,
GSE70138.gctx=GSE70138.gctx.path)
head(Drug.score)
# Drug.therapeutic.score P.value FDR
# abiraterone 9.056519e-07 0.5554614 1.0000000
# acamprosate 1.175191e-06 0.2618356 0.8065834
# acarbose 7.654915e-07 0.2343568 0.7639510
# acebutolol 1.045812e-06 0.9632485 1.0000000
# aceclidine 1.088939e-06 0.9999216 1.0000000
# aceclofenac 1.056594e-06 0.9958364 1.0000000
## (2) But if one cluster does not have any significant DEGs, the score will be NaN
Gene.list$C1$adj.P.Val=1
Gene.list$C1$P.Value=1
Drug.score2<-DrugScore(SC.integrated=SC.integrated,
Gene.data=Gene.list,
Cell.type=NULL,
Drug.data=Drug.ident.res,
FDA.drug.only=TRUE,
Case=Case,
Tissue="breast",
GSE92742.gctx=GSE92742.gctx.path,
GSE70138.gctx=GSE70138.gctx.path)
head(Drug.score2)
# Drug.therapeutic.score P.value FDR
# abiraterone NaN 0.5554614 1.0000000
# acamprosate NaN 0.2618356 0.8065834
# acarbose NaN 0.2343568 0.7639510
# acebutolol NaN 0.9632485 1.0000000
# aceclidine NaN 0.9999216 1.0000000
# aceclofenac NaN 0.9958364 1.0000000
Hi there, as I replied above, the code is consistent with the formula. Regarding the next question, you may revise the code to skip this step if you want to repurpose a drug for a cluster that doesn't have differential genes between the disease and healthy sample. Please let me know if you need help.
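One way to follow the suggestion of leaving out clusters without differential genes is to filter the gene list before scoring. The sketch below is an assumption-laden illustration: it uses a toy stand-in for `Gene.list` (a list of per-cluster data frames with an `adj.P.Val` column, as in the example above), and the 0.05 cutoff and the `has.deg` and `Gene.list.filtered` names are hypothetical. Whether `DrugScore` applies the same cutoff internally is not confirmed here.

```r
# Toy stand-in for Gene.list: one data frame of DEG statistics per cluster.
Gene.list <- list(
  C1 = data.frame(adj.P.Val = c(1, 1),       P.Value = c(1, 1)),
  C2 = data.frame(adj.P.Val = c(0.001, 0.2), P.Value = c(0.0001, 0.05))
)

fdr.cutoff <- 0.05  # assumed significance threshold, not taken from the package

# TRUE for clusters that have at least one significant DEG
has.deg <- vapply(
  Gene.list,
  function(d) any(d$adj.P.Val < fdr.cutoff, na.rm = TRUE),
  logical(1)
)

# Keep only clusters with significant DEGs before passing the list on
Gene.list.filtered <- Gene.list[has.deg]
print(names(Gene.list.filtered))
```

Other inputs that are organised per cluster, such as the drug identification result, may need the same subsetting so that the clusters still match; that depends on the package and is worth checking in its documentation.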
Thanks, I will study your code carefully. May I ask whether this means that if a conserved cluster without DEGs exists, the combined Drug.therapeutic.score cannot be calculated?
This method is repurposing drugs that reverse the differential expressions in the disease. If there's no differential expression, there is no need for a drug score.
Sorry, I may not have been clear. The combined score is based on the results of several clusters, as your method suggests. However, if one of the clusters is conserved and has no DEGs, will the combined score be NaN, as in the modified code above?
Same thing. The method finds drugs that affect ALL the clusters you added, but it is not a simple sum of drug scores from every selected cluster. If you put in a cluster without DEGs, it will indeed report a NaN drug score, because it finds that no drug can affect ALL the clusters it received. I still suggest not adding a cluster without DEGs.
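The NaN behaviour discussed here can be reproduced with a few lines of base R. This is a minimal sketch of the general mechanism (an empty intersection used as a denominator), not the actual code in DrugScore.R; the gene names and the `deg.sets`, `shared`, `drug.genes`, `reversed` and `ratio` variables are all hypothetical.

```r
# Hypothetical sets of significant DEGs per cluster; C1 has none.
deg.sets <- list(
  C1 = character(0),
  C2 = c("GENE1", "GENE2", "GENE3"),
  C3 = c("GENE2", "GENE3", "GENE4")
)

# Genes shared by ALL clusters: empty as soon as one set is empty
shared <- Reduce(intersect, deg.sets)
print(length(shared))

# Hypothetical set of genes that a drug reverses
drug.genes <- c("GENE2", "GENE5")
reversed <- intersect(shared, drug.genes)

# A ratio with the empty shared set as its denominator is 0/0, which is NaN in R
ratio <- length(reversed) / length(shared)
print(is.nan(ratio))
```

Dropping C1 from `deg.sets` leaves GENE2 and GENE3 in the intersection, so the ratio becomes a finite number again.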
Thanks for your patient answer. I think I understand part of it now. I will read the article and your response again. Thanks again!
Hi, I have learned a lot from your package and the article; it is an awesome method for drug repurposing research. But I am a little confused by the formula for the combined drug score. Here is my question: as the original paper says, for each cell cluster you take the product of (1) the cluster ratio, (2) the drug FDR p-value, and (3) the reversed gene ratio, and then sum these products. However, I looked into the DrugScore.R script in your package. In line 68, Drug.coverage <- tapply(Drug.list$w.size, Drug.list$Drug,sum), you appear to first calculate the product of (1) the cluster ratio and (2) the drug FDR p-value for each cell cluster and then sum them. Next, you find the intersection of genes among all clusters and the corresponding genes interrupted by the drug. This is not consistent with the method description in the paper. I don't know whether I have misunderstood the code or the article. Thank you!