markziemann / Gene-function-imputation

Gene function imputation by coexpression

Cross validation code to check for imputation #26

Open megan-soria opened 4 years ago

megan-soria commented 4 years ago

Add cross-validation code to check the specificity and sensitivity of the imputation.

Use the F1 score as the basis for evaluation.
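For reference, a minimal sketch of how these metrics could be computed from confusion-matrix counts. The helper name `classification_metrics` and the toy counts are assumptions for illustration only and are not taken from the repository.

```r
# Sketch: sensitivity, specificity, precision and F1 from confusion-matrix counts.
# `classification_metrics` is a hypothetical helper name, not from the repository.
classification_metrics <- function(tp, tn, fp, fn) {
  sensitivity <- tp / (tp + fn)        # true positive rate (recall)
  specificity <- tn / (tn + fp)        # true negative rate
  precision   <- tp / (tp + fp)        # positive predictive value
  f1 <- 2 * tp / (2 * tp + fp + fn)    # harmonic mean of precision and recall
  c(sensitivity = sensitivity, specificity = specificity,
    precision = precision, f1 = f1)
}

# Toy counts chosen only for illustration
m <- classification_metrics(tp = 8, tn = 80, fp = 2, fn = 10)
print(round(m, 3))
```

Note that a denominator of zero (for example, no predicted positives) yields `NaN`, so real code would likely need to handle that case explicitly.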

megan-soria commented 3 years ago

commit 0b2316305d65ebcd72421175cf992729528a4b5f

Logic for deriving true and false positives and negatives:

Subtraction: reference data frame - predicted (imputed) data frame. This identifies the false negatives and false positives.

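| Outcome   | Reference | Predicted | Diff | Result         |
|-----------|-----------|-----------|------|----------------|
| Outcome 1 | 1         | 1         | 0    | Inconclusive   |
| Outcome 2 | 0         | 0         | 0    | Inconclusive   |
| Outcome 3 | 1         | 0         | 1    | False negative |
| Outcome 4 | 0         | 1         | -1   | False positive |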

Addition: reference data frame + predicted (imputed) data frame. This identifies the true negatives and true positives.

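| Outcome   | Reference | Predicted | Sum | Result        |
|-----------|-----------|-----------|-----|---------------|
| Outcome 1 | 1         | 1         | 2   | True positive |
| Outcome 2 | 0         | 0         | 0   | True negative |
| Outcome 3 | 1         | 0         | 1   | Inconclusive  |
| Outcome 4 | 0         | 1         | 1   | Inconclusive  |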

Therefore, a combination of the two operations will be used.
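The combined logic can be sketched roughly as follows, assuming both inputs are binary (0/1) numeric matrices of the same shape. The names `confusion_counts`, `reference` and `predicted` are hypothetical; this is not the code from the commit above.

```r
# Sketch: count TP, TN, FP and FN from two binary (0/1) matrices of equal shape.
# `confusion_counts` is a hypothetical helper name, not from the repository.
confusion_counts <- function(reference, predicted) {
  stopifnot(all(dim(reference) == dim(predicted)))
  diff  <- reference - predicted   # 1 = false negative, -1 = false positive
  total <- reference + predicted   # 2 = true positive,   0 = true negative
  c(TP = sum(total == 2),
    TN = sum(total == 0),
    FP = sum(diff == -1),
    FN = sum(diff == 1))
}

# Toy input covering the four outcomes from the tables, one cell each
reference <- matrix(c(1, 0, 1, 0), nrow = 2)
predicted <- matrix(c(1, 0, 0, 1), nrow = 2)
counts <- confusion_counts(reference, predicted)
print(counts)
```

Each cell falls into exactly one of the four categories, so the counts should always add up to the number of cells. Data frames would presumably need converting with `as.matrix()` first.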

megan-soria commented 3 years ago

Checked for consistency:

# Check if the totals are equal

total_count_Cl1 <- 86*317 # Dimensions of Cluster 1
count_statsCl1 <- 47 + 20558 + 6002 + 655 # Sum of TP, TN, FP, and FN respectively
total_count_Cl1 == count_statsCl1 # TRUE

total_count_Cl2 <- 92*132 # Dimensions of Cluster 2
count_statsCl2 <- 174 + 6671 + 4272 + 1027 # Sum of TP, TN, FP, and FN respectively
total_count_Cl2 == count_statsCl2 # TRUE

total_count_Cl50 <- 80*493 # Dimensions of Cluster 50
count_statsCl50 <- 51 + 30044 + 8457 + 888 # Sum of TP, TN, FP, and FN respectively
total_count_Cl50 == count_statsCl50 # TRUE
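The same check could be written once for all clusters by keeping the counts in a data frame. This is only a sketch: the counts and dimensions are copied from the lines above, while the column names (`dim1`, `dim2`, `totals_match`, `f1`) are made up for illustration. It also derives the F1 score per cluster from the same counts.

```r
# Sketch: vectorised consistency check over the three clusters reported above.
# Column names are hypothetical; counts and dimensions are copied from this thread.
stats <- data.frame(
  cluster = c("Cl1", "Cl2", "Cl50"),
  dim1 = c(86, 92, 80),
  dim2 = c(317, 132, 493),
  TP = c(47, 174, 51),
  TN = c(20558, 6671, 30044),
  FP = c(6002, 4272, 8457),
  FN = c(655, 1027, 888)
)

# Total number of cells must equal TP + TN + FP + FN
stats$totals_match <- with(stats, dim1 * dim2 == TP + TN + FP + FN)

# F1 score per cluster, computed directly from the counts
stats$f1 <- with(stats, 2 * TP / (2 * TP + FP + FN))

print(stats)
stopifnot(all(stats$totals_match))
```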

megan-soria commented 3 years ago

commit 1c52f79dffb59e5f7cabfbd38454946958f6b9ba

Added functions for k-fold validation; ready for review.