It would be useful if we could refactor stats.py into a class API so that we could drop in different solvers.
Right now, the NetBFE analytical solver is implemented in mle(). It has several limitations, such as not being able to handle both forward and reverse estimates for the same edge.
As an example, we could have a NetworkSolver base class with methods like:
- mle(): solve for the maximum likelihood estimate of the absolute free energies of all nodes
- statistics(): compute various statistics (RMSE, MUE, etc.) and their uncertainties
- optimal_allocation(method=X): predict edge (and node) weights that would optimally allocate effort to reduce uncertainty

NetBFESolver could then be a derived class that implements this interface.
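To make the proposal concrete, here is a rough sketch of what such a base class might look like. Everything beyond the three method names and the two class names is an assumption for illustration: the constructor argument, the measurement tuple layout, and the return types are hypothetical and do not reflect the current stats.py code.

```python
from abc import ABC, abstractmethod
from typing import Dict, Hashable, List, Tuple

# Hypothetical measurement record: (i, j, estimate of f_j - f_i, standard error)
Measurement = Tuple[Hashable, Hashable, float, float]


class NetworkSolver(ABC):
    """Illustrative base class; signatures are assumptions, not an existing API."""

    def __init__(self, measurements: List[Measurement]):
        # Keep a private copy so solvers never mutate the caller's data.
        self.measurements = list(measurements)

    @abstractmethod
    def mle(self) -> Dict[Hashable, float]:
        """Return the maximum likelihood free energy of every node."""

    @abstractmethod
    def statistics(self) -> Dict[str, Tuple[float, float]]:
        """Return statistics such as RMSE and MUE as (value, uncertainty)."""

    @abstractmethod
    def optimal_allocation(self, method: str) -> Dict[Hashable, float]:
        """Return weights describing where to spend additional effort."""


class NetBFESolver(NetworkSolver):
    """Placeholder subclass; the bodies would wrap the existing analytical code."""

    def mle(self) -> Dict[Hashable, float]:
        raise NotImplementedError("would call the current analytical solver")

    def statistics(self) -> Dict[str, Tuple[float, float]]:
        raise NotImplementedError

    def optimal_allocation(self, method: str) -> Dict[Hashable, float]:
        raise NotImplementedError
```

With this layout, calling code would depend only on NetworkSolver, so swapping NetBFESolver for another implementation would not require changes elsewhere.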
We may also want to clean up the API within stats.py at the same time. Right now, it assumes the graph is labeled in a particular way and generates new labels when mle() is called. The bootstrap() method also generates multivariate normal data from the MLE solution and computes various statistics from it, which may not be what we want.
I'm particularly interested in implementing a drop-in replacement for NetBFESolver that uses the same API, but numerically solves for the optimal free energies. This would provide several additional features:
- A network could include forward and reverse estimates for the same edge, and this information would be used appropriately.
- The user could select heavier-tailed likelihood functions than the Gaussian that NetBFE assumes.
- Absolute free energy measurements on combinations of microstates could be specified, which would enable us to include experimental binding free energy measurements for molecules with multiple protonation/tautomer states.
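As a rough illustration of the first two points, here is a minimal pure-Python sketch of a numerical solve. It is not the NetBFE method and not a proposed implementation: the function name, the measurement tuple layout, and the nu parameter are all hypothetical. It uses coordinate-wise updates, where each node is set to the precision-weighted average of the values its measurements imply; passing nu switches to the standard iteratively reweighted form for a Student-t likelihood, which down-weights outlying measurements. Forward and reverse estimates for the same edge are simply two entries in the list.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical record: (i, j, estimate of f_j - f_i, standard error)
Measurement = Tuple[str, str, float, float]


def solve_free_energies(
    measurements: List[Measurement],
    reference: str,
    nu: Optional[float] = None,
    n_iter: int = 500,
    tol: float = 1e-10,
) -> Dict[str, float]:
    """Estimate node free energies relative to `reference` (fixed at 0).

    nu=None uses a Gaussian likelihood; a positive nu uses Student-t
    reweighting with nu degrees of freedom (an assumption of this sketch).
    """
    nodes = sorted({n for i, j, _, _ in measurements for n in (i, j)})
    f = {n: 0.0 for n in nodes}
    for _ in range(n_iter):
        max_shift = 0.0
        for node in nodes:
            if node == reference:
                continue
            num = den = 0.0
            for i, j, dg, sigma in measurements:
                if node == j:
                    target = f[i] + dg   # value of f[node] implied by i -> node
                elif node == i:
                    target = f[j] - dg   # value of f[node] implied by node -> j
                else:
                    continue
                weight = 1.0 / sigma**2
                if nu is not None:
                    # Student-t reweighting: large residuals get small weights.
                    r2 = ((target - f[node]) / sigma) ** 2
                    weight *= (nu + 1.0) / (nu + r2)
                num += weight * target
                den += weight
            new_value = num / den
            max_shift = max(max_shift, abs(new_value - f[node]))
            f[node] = new_value
        if max_shift < tol:
            break
    return f


example = [
    ("A", "B", 1.0, 0.1),    # forward estimate for edge A-B
    ("B", "A", -1.2, 0.1),   # reverse estimate for the same edge
    ("B", "C", 2.0, 0.2),
]
gaussian = solve_free_energies(example, reference="A")
heavy_tailed = solve_free_energies(example, reference="A", nu=4.0)
```

In the Gaussian case the forward and reverse estimates for A-B have equal uncertainty, so B ends up at their average (about 1.1) and C at about 3.1. A real implementation would more likely hand the negative log-likelihood to a general-purpose optimizer, which would also make it easier to add measurements on combinations of microstates.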