[x] in place of ordinary gradient descent, try Levenberg–Marquardt to see if better convergence is possible (fewer examples of parameters going out of bounds)
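A minimal sketch of what the Levenberg–Marquardt swap could look like, using SciPy's `least_squares` with `method="lm"`. The exponential model, parameter names, and data here are illustrative assumptions, not the project's actual models:

```python
# Sketch: fit a hypothetical exponential growth model with
# Levenberg-Marquardt instead of ordinary gradient descent.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0, 20)
true_params = np.array([1.5, 0.8])  # (v0, growth rate) -- assumed names
y = true_params[0] * np.exp(true_params[1] * t) + rng.normal(0, 0.01, t.size)

def residuals(p):
    # residual vector, as LM expects (not a scalar loss)
    return p[0] * np.exp(p[1] * t) - y

# method="lm" is SciPy's Levenberg-Marquardt; note it does NOT support
# box bounds, so any out-of-bounds check would still happen afterwards
# (method="trf" would be the bounded alternative).
fit = least_squares(residuals, x0=[1.0, 0.5], method="lm")
```

One caveat worth noting up front: plain LM is unconstrained, so it answers the convergence question but not the bounds question by itself.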
see if global optimisation methods can replace current initial-guess heuristics (but neural ODEs may have too many params for this)
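As a sketch of the global-optimisation idea, SciPy's `differential_evolution` searches a bounded parameter box without needing an initial guess at all. The bounds and the toy model below are assumptions for illustration:

```python
# Sketch: use a global optimiser (differential evolution) over a bounded
# parameter box in place of a hand-rolled initial-guess heuristic.
import numpy as np
from scipy.optimize import differential_evolution

t = np.linspace(0.0, 2.0, 20)
y = 1.5 * np.exp(0.8 * t)  # noiseless synthetic data for illustration

def sse(p):
    # scalar sum-of-squares loss, as differential_evolution expects
    return np.sum((p[0] * np.exp(p[1] * t) - y) ** 2)

bounds = [(0.1, 5.0), (0.0, 2.0)]  # assumed plausible parameter ranges
result = differential_evolution(sse, bounds, seed=0, tol=1e-8)
```

For a low-dimensional classical model this is cheap; for a neural ODE with hundreds of weights, the dimensionality concern in the item above applies.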
repeat model comparisons (see examples/04_model_battle) under different conditions:
a single holdout instead of 2, or 3 holdouts as in the Laleh study
restrict to different study arms, or normalise errors within study arms, as in the Laleh study
determine if increasing n_iterations changes outcomes
restrict to patient records with "fluctuating" response (but interpret very carefully, as this categorisation uses the whole sequence, including holdout, unless we re-categorise ourselves).
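The holdout variants above all reduce to holding out the last k measurements of each patient series. A minimal sketch, assuming one list of (time, volume) points per patient (the data layout is an assumption):

```python
# Sketch: split one patient series into fit points and the last-k holdout
# points (k = 1, 2, or 3 for the variants listed above).
def split_holdout(series, k):
    """Return (fit_points, holdout_points) for one patient series."""
    if k >= len(series):
        raise ValueError("need more points than holdouts")
    return series[:-k], series[-k:]

# hypothetical (time, volume) measurements for one patient
patient = [(0, 1.0), (1, 1.3), (2, 1.1), (3, 1.6), (4, 1.9)]
fit_pts, held_out = split_holdout(patient, k=2)
```

The re-categorisation caveat in the last item matters here: any "fluctuating" label computed on the full series leaks information from `held_out` into the selection step.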
investigate why parameters leave the boundary in some cases, despite the loss penalty
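One way to instrument this would be to log, after each fit, how far every parameter sits outside its box alongside the penalty it incurred. The quadratic penalty form and the names below are assumptions, not the project's actual penalty:

```python
# Sketch: measure per-parameter boundary violation and the corresponding
# (assumed quadratic) loss penalty, for post-fit diagnostics.
import numpy as np

def bound_violation(params, lower, upper):
    """Per-parameter distance outside [lower, upper] (0 inside the box)."""
    params, lower, upper = map(np.asarray, (params, lower, upper))
    return np.maximum(lower - params, 0.0) + np.maximum(params - upper, 0.0)

def penalty(params, lower, upper, weight=10.0):
    # hypothetical quadratic penalty; a weight too small relative to the
    # data loss would be one candidate explanation for escapes
    return weight * np.sum(bound_violation(params, lower, upper) ** 2)

v = bound_violation([0.5, 3.2], lower=[0.0, 0.0], upper=[1.0, 3.0])
p = penalty([0.5, 3.2], lower=[0.0, 0.0], upper=[1.0, 3.0])
```

Comparing the logged penalty against the data-loss magnitude at the same point would show whether the penalty weight is simply being outweighed.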
try more or less complex neural network architectures (more complex seems doubtful given the risk of overfitting)
investigate why the 1D neural network is not beaten by Bertalanffy and yet draws with all the classical models. This suggests the 1D neural network and Bertalanffy may perform better on different parts of the population.
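A cheap first check on the "different subpopulations" hypothesis: compare the two models' per-patient errors directly and count wins on each side. The error arrays below are illustrative placeholders, not real results:

```python
# Sketch: per-patient win counts for two models whose aggregate scores draw.
import numpy as np

err_nn = np.array([0.10, 0.30, 0.05, 0.40])    # hypothetical per-patient MAE
err_bert = np.array([0.20, 0.10, 0.15, 0.35])  # hypothetical per-patient MAE

nn_wins = int(np.sum(err_nn < err_bert))
bert_wins = int(np.sum(err_bert < err_nn))
# A draw in the aggregate can hide complementary strengths: here each
# model wins on half the patients.
```

Cross-tabulating the winners against patient-level covariates (e.g. the response category mentioned above) would then say *which* parts of the population each model handles better.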