vahuynh / dynGENIE3

Semi-parametric approach for the inference of gene regulatory networks from time series of expression data

Pipeline for generating an AUROC plot based off of dynGENIE3 Results? #4

Closed: Rohak72 closed this issue 7 months ago

Rohak72 commented 1 year ago

Hi,

I was wondering if there is a way to plot a ROC curve (true positive rate vs. false positive rate) and compute the AUROC for a given gene regulatory network (GRN) constructed by the dynGENIE3 framework. If there isn't a readily available function, are there other ways to do this? Any clarification would be greatly appreciated!

Best, Rohak

vahuynh commented 1 year ago

Hi,

If you are using Python, you can use scikit-learn to do that.

Let's say VIM is the array of edge scores returned by dynGENIE3 (with VIM[i, j] being the score for the edge directed from the i-th gene to the j-th gene), and gold_standard is the array containing the true edges (gold_standard[i, j] = 1 if there is an edge from the i-th gene to the j-th gene, and 0 otherwise).

To visualize the ROC and precision-recall curves:

import numpy as np
from sklearn.metrics import roc_curve, precision_recall_curve
import matplotlib.pyplot as plt

# Flatten both matrices so that each entry is one candidate edge
VIM = np.reshape(VIM, -1)
gold_standard = np.reshape(gold_standard, -1)

# ROC curve
fpr, tpr, thresholds = roc_curve(gold_standard, VIM)
plt.plot(fpr, tpr)
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.show()

# Precision-recall curve
precision, recall, thresholds = precision_recall_curve(gold_standard, VIM)
plt.plot(recall, precision)
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.show()
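
If a single number is more convenient than a plot, the same two arrays can be summarized as an AUROC and an average precision (a step-wise summary of the precision-recall curve). The sketch below is only an illustration, not part of dynGENIE3: `score_network` is a hypothetical helper, the toy matrices are made-up numbers, and dropping the diagonal assumes that self-edges (i -> i) are not part of the evaluation.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score


def score_network(VIM, gold_standard):
    """Return (AUROC, average precision) over all edges except self-edges.

    Hypothetical helper: both inputs are square arrays of the same shape,
    with entry [i, j] describing the edge from gene i to gene j.
    """
    VIM = np.asarray(VIM, dtype=float)
    gold_standard = np.asarray(gold_standard, dtype=int)

    # Boolean mask that is False on the diagonal and True elsewhere.
    # Assumption: self-edges are excluded from the evaluation.
    off_diagonal = ~np.eye(VIM.shape[0], dtype=bool)
    scores = VIM[off_diagonal]
    labels = gold_standard[off_diagonal]

    auroc = roc_auc_score(labels, scores)
    avg_precision = average_precision_score(labels, scores)
    return auroc, avg_precision


# Toy 3-gene example with made-up numbers (not real dynGENIE3 output).
toy_VIM = np.array([[0.0, 0.9, 0.1],
                    [0.4, 0.0, 0.8],
                    [0.2, 0.3, 0.0]])
toy_gold = np.array([[0, 1, 0],
                     [0, 0, 1],
                     [1, 0, 0]])

auroc, avg_precision = score_network(toy_VIM, toy_gold)
print(f"AUROC = {auroc:.3f}, average precision = {avg_precision:.3f}")
```

Both metrics need at least one true edge and one non-edge among the evaluated pairs; otherwise they are undefined.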
Rohak72 commented 1 year ago

Hi,

Thanks so much for your reply, I really appreciate it! This all makes sense to me, but I was wondering: when creating the 'gold_standard' array, what constitutes being designated as a true edge? In other words, what metric would be used to determine if there is indeed an edge connecting a pair of genes? Would this simply involve looking for values greater than 0 in the VIM matrix or does the ranking matrix come into play here?

Thus far, I've defined a true edge as a score greater than 0 in the VIM matrix, but that gave me a perfect ROC curve (a 90-degree angle). If you have any clarification regarding this, it would be a great help!

Thanks again, Rohak

vahuynh commented 1 year ago

Hi,

To compute the ROC curve, you need to know the true network (i.e. the true labels of the edges) independently of the dynGENIE3 output; the gold standard array is the adjacency matrix of this true network.
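
As a rough illustration, and assuming the true network is available as a list of known (regulator, target) pairs, the adjacency matrix could be built as sketched below. The gene names, the edge list and the scores are hypothetical placeholders. The last lines also show why labels obtained by thresholding VIM give a perfect curve: every such "true" edge outscores every "false" one by construction.

```python
import numpy as np

# Hypothetical inputs: gene names in the same order as the rows/columns of
# VIM, and known (regulator, target) pairs taken from an external source.
gene_names = ["G1", "G2", "G3"]
known_edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G1")]

# Map each gene name to its row/column index.
index = {name: k for k, name in enumerate(gene_names)}

# gold_standard[i, j] = 1 if there is a known edge from gene i to gene j.
n_genes = len(gene_names)
gold_standard = np.zeros((n_genes, n_genes), dtype=int)
for regulator, target in known_edges:
    gold_standard[index[regulator], index[target]] = 1

# Made-up scores standing in for the dynGENIE3 output.
VIM = np.array([[0.0, 0.9, 0.1],
                [0.4, 0.0, 0.8],
                [0.2, 0.3, 0.0]])


def perfectly_separated(scores, labels):
    """True if every positive edge scores higher than every negative edge."""
    return bool(scores[labels == 1].min() > scores[labels == 0].max())


# Labels derived from the scores themselves are circular: the ranking is
# trivially perfect, which produces the right-angle ROC curve.
circular_labels = (VIM > 0).astype(int)
print(perfectly_separated(VIM, circular_labels))

# With an independent gold standard, positives and negatives overlap.
print(perfectly_separated(VIM, gold_standard))
```

The order of `gene_names` has to match the order of the genes in the expression data passed to dynGENIE3; otherwise the rows and columns of the two matrices do not refer to the same edges.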

Vân Anh