MilesCranmer / SymbolicRegression.jl

Distributed High-Performance Symbolic Regression in Julia
https://ai.damtp.cam.ac.uk/symbolicregression/
Apache License 2.0

Q : recording # of function calls #74

Closed · ocramz closed this issue 2 years ago

ocramz commented 2 years ago

I'd like to benchmark this library against others, and one of the natural metrics would be the total number of evaluations (summed over iterations, cores, etc.). Is there a flag (or combination of flags) that makes the library report this?

MilesCranmer commented 2 years ago

You can fix the number of mutations (= populations × niterations × ncyclesperiteration × (npop / topn)), but not the number of evaluations. This is for a couple of reasons: (1) BFGS is used internally for constant optimization, and it takes an adaptive number of steps; (2) evaluations are cached, so a simplified or unchanged expression will reuse the same loss (should these be counted every time the cache is hit, or just once?).
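For concreteness, the mutation-count formula above can be spelled out numerically. The values below are hypothetical, chosen only to illustrate the arithmetic, and are not the library's defaults:

```julia
# Hypothetical settings for the parameters named in the formula above.
populations         = 15    # number of populations
niterations         = 40    # outer iterations
ncyclesperiteration = 550   # evolution cycles per iteration
npop                = 33    # individuals per population
topn                = 12    # members migrated/selected per cycle

# Total mutations = populations * niterations * ncyclesperiteration * (npop / topn)
total_mutations = populations * niterations * ncyclesperiteration * (npop / topn)
```

With these made-up values, `total_mutations` comes out to 907500.0; the point is that the mutation budget is fully determined by the options, unlike the evaluation count.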

What you can do is fix the runtime using timeout_in_seconds; the search will return once that many seconds have passed. My subjective view is that runtime is a better comparison point anyway, since it (1) is what a user actually cares about, (2) rewards algorithms that are intrinsically parallelizable, and (3) isn't artificially biased toward algorithms that spend a long time "thinking" about which mutation to apply next. But again, this is subjective!
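A minimal sketch of the timeout approach, assuming the API of that era (EquationSearch and Options; exact names, keyword arguments, and defaults may differ between versions, and the data here is made up):

```julia
using SymbolicRegression

# Toy dataset (hypothetical): 5 features, 100 samples.
X = randn(Float32, 5, 100)
y = 2 .* cos.(X[4, :]) .+ X[1, :] .^ 2

options = SymbolicRegression.Options(
    binary_operators=(+, *, -, /),
    unary_operators=(cos,),
    timeout_in_seconds=60.0,  # bound the search by wall-clock time (~60 s)
)

# With niterations set very high, the timeout is what actually ends the search,
# so runs are compared on equal wall-clock budgets rather than iteration counts.
hall_of_fame = EquationSearch(X, y; niterations=1_000_000, options=options)
```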

ocramz commented 2 years ago

Thank you Miles, I suspected the internal setup would be something like this. In my specific case I need to compare sample efficiencies/learning curves, so this complicates things. Besides, wall-clock benchmarks are affected by variance one cannot control.


cobac commented 2 years ago

See also https://github.com/MilesCranmer/SymbolicRegression.jl/issues/33 for a similar discussion. There is an undocumented recorder option that allows access to all equations and losses throughout runtime.

MilesCranmer commented 2 years ago

@ocramz - the pull request #104 fixes this, and adds a max_evals parameter.

Internally in the SymbolicRegression search, you can now access the number of evals via the variable num_evals, which is a vector of vectors of floats (the first axis is the output dimension, and the second axis is per population). Therefore sum(sum, num_evals) is the total number of evals.
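The reduction over num_evals can be sketched in plain Julia. The numbers below are made up purely to show the shape of the data (vector of vectors of floats) and the nested sum; in a real run the values come from the search itself, bounded by the new max_evals option:

```julia
# Hypothetical num_evals, shaped as described above:
# first axis = output dimension, second axis = population.
num_evals = [[1200.0, 950.5, 1100.0],   # output dimension 1
             [800.0, 1025.5]]           # output dimension 2

# Inner sum collapses populations; outer sum collapses output dimensions.
total_evals = sum(sum, num_evals)
```

With these placeholder values, `total_evals` is 5076.0.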