When initialising the optimiser on ml.py line 13, `self.dists_data` is supposedly a non-leaf tensor with `grad_fn=UnbindBackward0`.
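A minimal sketch of the symptom and one way around it (the names below are illustrative, not the actual ml.py code): a tensor produced by `torch.unbind` carries `grad_fn=UnbindBackward0` and is not a leaf, so `torch.optim` will not treat it as a parameter.

```python
import torch

# Illustrative only: a slice taken via unbind is a non-leaf with UnbindBackward0.
pdm = torch.rand(17, 17, requires_grad=True)
row = torch.unbind(pdm, dim=0)[0]
print(row.is_leaf, row.grad_fn)  # False, <UnbindBackward0 ...>

# One fix: detach and clone to get a proper leaf parameter for the optimiser.
dists_data = row.detach().clone().requires_grad_(True)
print(dists_data.is_leaf)        # True
optimiser = torch.optim.Adam([dists_data], lr=0.01)
```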
Implemented in ml.py.
Invoke as e.g. `dodo --infer ml --taxa 17 --epochs 1000 --learn 0.01`.
The likelihood is jumpy. Check the autodiff graph with torchviz.
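A minimal sketch of how such a graph can be rendered with torchviz; the scalar loss below is a toy stand-in for whatever ml.closure() returns:

```python
import torch
from torchviz import make_dot

# Toy stand-in for ml.closure(): any scalar loss built from the optimised
# distances can be visualised the same way.
dists_data = torch.rand(17, 17, requires_grad=True)
loss = dists_data.sum()  # stand-in for the likelihood from soft NJ + likelihood calculation
make_dot(loss, params={"dists_data": dists_data}).render("torchviz", format="pdf")
```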
Here's the resulting gradient graph on ml.closure(): torchviz.gv.pdf. Most of it is the peeler.nj() algorithm, with the likelihood calculation at the end.
The likelihood is jumpy because soft neighbour joining, peeler.nj(pdm, 0.0001), sometimes outputs an incomplete tree (as seen in the saved .tree file). Further testing of test_soft_nj() is required.
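One possible completeness check on the saved tree, as a rough sketch (the file path and taxon count are placeholders): a complete unrooted binary tree on n taxa has exactly n leaves.

```python
import dendropy

# Placeholder path and taxon count; adjust to the actual run's output.
n_taxa = 17
tree = dendropy.Tree.get(path="dodonaphy_ml.tree", schema="newick")
assert len(tree.leaf_nodes()) == n_taxa, "soft NJ produced an incomplete tree"
```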
Soft neighbour joining now outputs a complete tree and the likelihood is less jumpy. Questions:
- Typically, how much noise can we add to the dissimilarity matrix before we jump basins? Noise vs. maximum likelihood.
- Does using hydra with the true distances get into the basin of the global optimum?
- Should we introduce basin hopping (a global algorithm with local maximisation)? See the sketch below.
- For benchmarks, how does it compare to IQ-TREE or RAxML (as well as MrBayes)?
- How does it compare to using pairwise similarities or Dasgupta's cost with Chami's geodesic connection?
- Alternatives to NJ: family joining, ...
Try annealing the likelihood.
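On the basin-hopping question, a rough sketch with scipy.optimize.basinhopping; the objective below is a toy surrogate for a wrapper that would rebuild the pairwise distance matrix, run peeler.nj(), and return the negative log likelihood:

```python
import numpy as np
from scipy.optimize import basinhopping

def neg_log_likelihood(flat_dists):
    # Toy multi-modal surrogate; a real wrapper would rebuild the pdm,
    # run peeler.nj() and return -log likelihood as a float.
    return float(np.sum(flat_dists ** 2 + 0.5 * np.cos(10 * flat_dists)))

n_pairs = 17 * 16 // 2        # pairwise distances for 17 taxa
x0 = np.random.rand(n_pairs)  # could instead start from hydra/true distances
result = basinhopping(
    neg_log_likelihood,
    x0,
    niter=50,                                 # number of hops between basins
    stepsize=0.1,                             # scale of the random perturbation
    minimizer_kwargs={"method": "L-BFGS-B"},  # local optimisation inside each basin
)
print(result.fun)
```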
Use soft NJ to optimise the distance matrix with a minimum evolution loss. Then sample nearby trees: use a Laplace approximation from torch.autograd.functional.hessian to get a covariance for sampling.
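A rough sketch of the Laplace step, assuming the loss is available as a function of the flattened distances; the quadratic loss here is a stand-in to keep the example runnable:

```python
import torch
from torch.autograd.functional import hessian

def min_evolution_loss(dists):
    # Stand-in for the soft-NJ minimum evolution loss over flattened distances.
    return torch.sum((dists - 1.0) ** 2)

dists_opt = torch.ones(10)                        # optimised flat distances
H = hessian(min_evolution_loss, dists_opt)        # Hessian at the optimum
cov = torch.linalg.inv(H + 1e-6 * torch.eye(10))  # Laplace covariance ~ inverse Hessian
laplace = torch.distributions.MultivariateNormal(dists_opt, covariance_matrix=cov)
nearby_dists = laplace.sample((100,))             # feed each sample back through soft NJ
```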
This is not differentiable; better to use an embedding.
As a subproblem of Dodonaphy: