MilesCranmer / PySR

High-Performance Symbolic Regression in Python and Julia
https://astroautomata.com/PySR
Apache License 2.0
2.19k stars, 207 forks

[Code cleanup] make options more hierarchical #213

Open MilesCranmer opened 1 year ago

MilesCranmer commented 1 year ago

The current list of options is far too long for a user to take in. I think it should be refactored so that new objects define the parameters hierarchically.

For example, rather than passing 8 parameters flatly for the mutation weightings, you could pass a single MutationWeights object, which would also carry additional documentation on what the mutation weightings are.
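A minimal sketch of what such an object might look like, written as a Python dataclass. The field names mirror the eight flat `weight_*` options; the class layout, the `to_flat_kwargs` helper, and the default values are assumptions made for illustration, not PySR's actual API or defaults.

```python
from dataclasses import asdict, dataclass


@dataclass
class MutationWeights:
    """Hypothetical grouping of the eight mutation weights.

    The defaults are placeholders for illustration only,
    not the values PySR actually uses.
    """

    add_node: float = 1.0
    insert_node: float = 1.0
    delete_node: float = 1.0
    do_nothing: float = 1.0
    mutate_constant: float = 1.0
    mutate_operator: float = 1.0
    randomize: float = 1.0
    simplify: float = 1.0

    def to_flat_kwargs(self) -> dict:
        """Expand back into the flat `weight_*` names (hypothetical helper)."""
        return {f"weight_{name}": value for name, value in asdict(self).items()}


# Override two weights and keep the placeholder defaults for the rest.
weights = MutationWeights(mutate_constant=5.0, do_nothing=0.0)
flat = weights.to_flat_kwargs()
```

Keeping a conversion back to the flat names would let the existing flat interface stay in place while the grouped object is introduced.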

I think it would make sense to start by writing down a hierarchical parameter grouping, and go from there.

MilesCranmer commented 1 year ago

Draft of parameter hierarchy:

algorithm:
  search_space:
    - binary_operators
    - unary_operators
    - maxsize
    - maxdepth
  search_size:
    - niterations
    - populations
    - population_size
    - ncyclesperiteration
  mutations:
    - weight_add_node
    - weight_insert_node
    - weight_delete_node
    - weight_do_nothing
    - weight_mutate_constant
    - weight_mutate_operator
    - weight_randomize
    - weight_simplify
    - crossover_probability
    - annealing
    - alpha
    - perturbation_factor
    - skip_mutation_failures
  tournament:
    - tournament_selection_n
    - tournament_selection_p
  optimization:
    - optimizer_algorithm
    - optimizer_nrestarts
    - optimize_probability
    - optimizer_iterations
    - should_optimize_constants
  migration:
    - fraction_replaced
    - fraction_replaced_hof
    - migration
    - hof_migration
    - topn
  objective:
    - loss
    - model_selection
    complexity:
      - parsimony
      - constraints
      - nested_constraints
      - complexity_of_operators
      - complexity_of_constants
      - complexity_of_variables
      - warmup_maxsize_by
      - use_frequency
      - use_frequency_in_tournament

preprocessing:
  - denoise
  - select_k_features

stop_criteria:
  - max_evals
  - timeout_in_seconds
  - early_stop_condition

performance:
  - procs
  - multithreading
  - cluster_manager
  - batching
  - batch_size
  - precision
  - fast_cycle
  - turbo
  - random_state
  - deterministic
  - warm_start

visualization:
  - verbosity
  - update_verbosity
  - progress

environment:
  - temp_equation_file
  - tempdir
  - delete_tempfiles
  - julia_project
  - update

exporting:
  - equation_file
  - output_jax_format
  - output_torch_format
  - extra_sympy_mappings
  - extra_torch_mappings
  - extra_jax_mappings
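
One way to experiment with a grouping like the draft above is to express it as nested dictionaries and flatten them back into the existing flat option names. The sketch below is only an illustration: `flatten_options` is a hypothetical helper, the option values are placeholders, and it assumes that no leaf option takes a dict as its value (a dict-valued option would be mistaken for a group).

```python
def flatten_options(tree: dict) -> dict:
    """Collect the leaf options of a nested grouping into one flat dict.

    Group names (e.g. "algorithm", "search_space") are discarded; only
    the leaf option names survive. Assumes leaf values are never dicts.
    """
    flat = {}
    for key, value in tree.items():
        # A dict is treated as a group and flattened recursively;
        # anything else is a leaf option.
        leaves = flatten_options(value) if isinstance(value, dict) else {key: value}
        for name, leaf in leaves.items():
            if name in flat:
                raise ValueError(f"duplicate option name: {name}")
            flat[name] = leaf
    return flat


# Placeholder values, chosen only to illustrate the shape of the grouping.
nested = {
    "algorithm": {
        "search_space": {"binary_operators": ["+", "*"], "maxsize": 20},
        "search_size": {"niterations": 40, "populations": 15},
    },
    "stop_criteria": {"timeout_in_seconds": 60},
}
flat = flatten_options(nested)
```

The duplicate check matters because group names are dropped during flattening, so the same option name appearing in two groups would otherwise be silently overwritten.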

MilesCranmer commented 1 year ago

#216 will fix this.