Right now, to pick a value of kappa, users have to rerun fitting manually with different kappas until they achieve the target syllable durations. We should automate this by making a kappa_scan function. There are two ways to approach this:
1) We could use a simulated annealing-like approach that algorithmically generates kappa proposals and homes in on the target value (as a human would do)
2) We could use a simpler parameter-scan approach that systematically tests log-spaced kappa values between some user-specified min and max. The function could then plot the results and print the kappa value that was closest to the target, along with the median duration associated with this kappa.
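To make option (2) concrete, here is a rough sketch of the grid and the selection step only, not the fitting itself. The helper names (log_spaced_kappas, closest_kappa) and the example durations are made up for illustration; in the real function the durations would come from fitting at each kappa.

```python
import math


def log_spaced_kappas(kappa_min, kappa_max, num):
    """Return `num` log-spaced values from kappa_min to kappa_max (inclusive)."""
    if num < 2:
        return [float(kappa_min)]
    lo, hi = math.log10(kappa_min), math.log10(kappa_max)
    step = (hi - lo) / (num - 1)
    return [10 ** (lo + i * step) for i in range(num)]


def closest_kappa(kappas, median_durations, target_duration):
    """Return (kappa, median_duration) for the scan entry nearest the target."""
    idx = min(
        range(len(kappas)),
        key=lambda i: abs(median_durations[i] - target_duration),
    )
    return kappas[idx], median_durations[idx]


# Hypothetical scan results: one median duration per tested kappa.
kappas = log_spaced_kappas(1e3, 1e7, 5)
durations = [3.0, 6.0, 12.0, 25.0, 50.0]  # made-up numbers, not real fits
best_kappa, best_duration = closest_kappa(kappas, durations, target_duration=12.0)
print(f"closest kappa: {best_kappa:.3g} (median duration {best_duration})")
```

Picking the nearest grid point keeps the output easy to interpret; interpolating between grid points would be a possible refinement but is not assumed here.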
I lean toward (2) since it sounds simpler to implement and understand. If we go that route, here are some additional considerations:
Should the min, max and grid-spacing be set in the config or specified at runtime? I had originally included the relevant variables in the config (they are currently commented out).
The number of iterations for each kappa could be fixed ahead of time, or could be determined dynamically based on when the median duration stabilizes. Maybe we could use an API similar to scikit-learn where there's a tol and a max_iter.
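As a sketch of what the tol / max_iter idea could look like: the loop below stops once the median duration has varied by no more than tol over the last few iterations, or gives up after max_iter. Everything here is an assumption for illustration, including the fit_step callable (a stand-in for one fitting iteration that returns the current median duration), the window parameter, and the specific stopping rule.

```python
def run_until_stable(fit_step, tol=0.5, max_iter=50, window=5):
    """Call fit_step() until the median duration stops changing.

    fit_step is a hypothetical callable that runs one fitting iteration
    and returns the current median duration. Iteration stops when the
    spread (max - min) of the last `window` values is <= tol, or after
    max_iter iterations. Returns (history, converged).
    """
    history = []
    for _ in range(max_iter):
        history.append(fit_step())
        recent = history[-window:]
        if len(recent) == window and max(recent) - min(recent) <= tol:
            return history, True
    return history, False


def make_fake_step(limit=12.0, start=2.0, rate=0.5):
    """Toy stand-in for fitting: the duration moves halfway to `limit` each call."""
    state = {"value": start}

    def step():
        state["value"] += rate * (limit - state["value"])
        return state["value"]

    return step


history, converged = run_until_stable(make_fake_step(), tol=0.5, max_iter=50, window=3)
print(f"converged={converged} after {len(history)} iterations")
```

Returning the full history alongside the converged flag means the same data can feed the convergence plot and the stabilization warning.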
The scanning function should produce a figure with two subplots:
one showing the median duration (y-axis) vs. fitting iteration (x-axis) for each tested kappa value (perhaps coloring the lines and including a colorbar). This will show whether the durations converged within the specified number of iterations.
a second plot showing median duration (y-axis) vs. kappa (x-axis), along with the target duration (hline) and selected kappa value (vline). This will show whether the min, max, and spacing were sufficient to hit the target.
Should we also save the results to disk? I'm ambivalent. Maybe it's enough to just make the plot and save that to disk.
To make the function foolproof, we could include the following checks and warnings:
if the target duration falls outside the range of durations achieved through the scan, we could issue a warning that prompts the user to try again and perhaps suggests a new min/max.
check whether the median durations stabilize for each kappa value. If not, issue a warning telling the user to try again with more iterations.
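The two checks above could be sketched roughly as follows. The function name check_scan, its arguments, and the stabilization rule (spread of the last `window` values within tol) are all hypothetical. The min/max suggestion also assumes that durations increase with kappa, which would need to hold for the hint to make sense.

```python
import warnings


def check_scan(kappas, histories, target_duration, tol=0.5, window=5):
    """Warn about scan problems and return the warning messages.

    histories[i] is the list of median durations, one per fitting
    iteration, for kappas[i] (hypothetical data layout).
    """
    messages = []
    finals = [history[-1] for history in histories]
    lo, hi = min(finals), max(finals)
    if not lo <= target_duration <= hi:
        # Assumption: durations increase with kappa, so a target above the
        # scanned range calls for a larger max, and below for a smaller min.
        hint = "larger kappa max" if target_duration > hi else "smaller kappa min"
        messages.append(
            f"Target duration {target_duration} is outside the scanned range "
            f"[{lo}, {hi}]; try again with a {hint}."
        )
    for kappa, history in zip(kappas, histories):
        recent = history[-window:]
        if len(recent) < window or max(recent) - min(recent) > tol:
            messages.append(
                f"Median duration did not stabilize for kappa={kappa:g}; "
                "try again with more iterations."
            )
    for message in messages:
        warnings.warn(message)
    return messages


# Made-up histories: the first kappa has settled, the second is still climbing.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    messages = check_scan(
        kappas=[1e3, 1e4],
        histories=[[2.0, 3.0, 3.1, 3.1, 3.1], [4.0, 6.0, 8.0, 10.0, 12.0]],
        target_duration=20.0,
        window=3,
    )
for message in messages:
    print(message)
```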
Once this function is written, we will need to update the tutorial notebook and colab notebook accordingly.