Enhances the gplearn package to support genetic programming (GP) over three-dimensional structured data, with a particular focus on cross-sectional factor analysis.
Code Optimization
Removed Unnecessary Code:
Eliminated the unused classifier and regressor components.
Removed the show_program function.
Enhanced Function Handling:
Added checks that prevent a function such as func(x, y, z) from receiving the same variable as more than one argument.
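The duplicate-argument check described above could look roughly like the sketch below. It is not the fork's actual code: the helper names has_repeated_variable and sample_distinct_args are hypothetical, and the sketch assumes gplearn's convention that an int terminal is a variable (column) index while a float terminal is a constant.

```python
import random


def has_repeated_variable(args):
    """Return True if any variable index occurs more than once in args.

    Assumption: ints are variable indices, floats are constants (ignored).
    """
    variables = [a for a in args if isinstance(a, int) and not isinstance(a, bool)]
    return len(variables) != len(set(variables))


def sample_distinct_args(arity, n_features, rng):
    """Draw `arity` different variable indices for one function call."""
    if arity > n_features:
        raise ValueError("not enough features to fill every argument uniquely")
    return rng.sample(range(n_features), arity)


rng = random.Random(0)
args = sample_distinct_args(3, 5, rng)  # three distinct indices from 0..4
```

A generator could either sample arguments without replacement, as sample_distinct_args does, or build a candidate and reject it when has_repeated_variable returns True.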
New Features
Factor Depth Statistics:
Added functionality to perform in-depth statistical analysis of factors.
Mutation Probability Mechanism Improvements
Revised Mutation Probability Mechanism:
Redefined how the mutation probabilities are specified and interpreted.
Previous Version: point_replace controlled the likelihood of a point mutation at each point. The probability of no change was implicitly defined as 1 - (sum of the four mutation probabilities).
Upgrade: The probability of no change is now defined explicitly. The probabilities may therefore sum to more than 1, so each value expresses relative importance rather than an absolute probability.
This change makes it possible to adjust the mutation probabilities dynamically.
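One way to read the revised mechanism is as weighted sampling over the operations, sketched below. The key names loosely follow gplearn's crossover, subtree, hoist and point operations, and "no_change" stands for the explicit no-change entry. The names, the example values and the normalization step are assumptions made for illustration, not the fork's confirmed implementation.

```python
import random


def normalize(weights):
    """Convert relative weights into probabilities that sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {name: w / total for name, w in weights.items()}


def choose_operation(weights, rng):
    """Pick one operation; random.choices accepts unnormalized weights."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Illustrative values only; they sum to 1.3, which is allowed here.
weights = {"crossover": 0.6, "subtree": 0.2, "hoist": 0.2,
           "point": 0.2, "no_change": 0.1}
probabilities = normalize(weights)
operation = choose_operation(weights, random.Random(42))
```

Because only the ratios matter, a single weight can be raised or lowered at run time without rebalancing the others.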
BaseIC and Factor Correlation Adjustment
BaseIC Optimization:
Collected satisfied_factors into a DataFrame of final-day values and computed the correlation of each candidate factor with every factor in that set.
If the maximum correlation exceeds 0.8, the candidate's parsimony-adjusted fitness is multiplied by (1 - corr + 0.6) before it competes in the next generation.
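The penalty rule can be sketched as follows. To stay dependency-free the sketch uses plain lists in place of the DataFrame mentioned above. The function names are hypothetical, and two choices are assumptions: the signed correlation is used as written (the fork may use the absolute value), and corr is taken to be the maximum correlation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def penalized_fitness(fitness, candidate, accepted, threshold=0.8, offset=0.6):
    """Scale fitness by (1 - corr + offset) when the max correlation is high."""
    if not accepted:
        return fitness
    max_corr = max(pearson(candidate, values) for values in accepted.values())
    if max_corr > threshold:
        return fitness * (1 - max_corr + offset)
    return fitness


accepted = {"f1": [2.0, 4.0, 6.0, 8.0]}  # final-day values of accepted factors
candidate = [1.0, 2.0, 3.0, 4.0]         # perfectly correlated with f1
adjusted = penalized_fitness(1.0, candidate, accepted)
```

With these constants the multiplier falls from just under 0.8 when the correlation is slightly above the threshold to 0.6 at a correlation of 1.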
Dynamic Mutation Probability Adjustment
Adaptive Mutation Probability:
The probabilities of the four mutation types are adjusted dynamically based on the structure of _program.program.
Point Mutation Probability: Increased if the same terminal appears too often.
Hoist Mutation Probability: Increased if the program is too deep.
Point and Hoist Mutation Probability: Both increased if the program is too long.
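The three rules above can be sketched as below. The program is simplified to a prefix-ordered list in which a function is a (name, arity) tuple and a terminal is a string; the fork works on _program.program with gplearn's own node types. The thresholds, the boost factor and the function names are hypothetical, chosen only to make the sketch runnable.

```python
from collections import Counter


def program_depth(program):
    """Depth of a prefix-ordered program; a lone terminal has depth 0."""
    remaining = []  # unfilled argument slots for each open function
    max_depth = 0
    for node in program:
        if isinstance(node, tuple):  # function node: (name, arity)
            remaining.append(node[1])
            max_depth = max(max_depth, len(remaining))
        else:
            # A terminal fills one slot; close every function it completes.
            while remaining:
                remaining[-1] -= 1
                if remaining[-1] > 0:
                    break
                remaining.pop()
    return max_depth


def adjust_weights(program, weights, max_length=20, max_depth=4,
                   max_terminal_share=0.5, boost=2.0):
    """Return a copy of weights with point/hoist boosted per the rules."""
    adjusted = dict(weights)
    terminals = [n for n in program if not isinstance(n, tuple)]
    if terminals:
        top_count = Counter(terminals).most_common(1)[0][1]
        if top_count / len(terminals) > max_terminal_share:
            adjusted["point"] *= boost  # one terminal dominates
    if program_depth(program) > max_depth:
        adjusted["hoist"] *= boost      # tree is too deep
    if len(program) > max_length:
        adjusted["point"] *= boost      # program is too long
        adjusted["hoist"] *= boost
    return adjusted


base = {"crossover": 0.6, "subtree": 0.2, "hoist": 0.2,
        "point": 0.2, "no_change": 0.1}
prog = [("add", 2), "x0", ("add", 2), "x0", "x0"]  # add(x0, add(x0, x0))
new_weights = adjust_weights(prog, base)
```

Here only the repeated-terminal rule fires, so the point weight doubles while the other entries keep their values. The boosted weights stay meaningful because the values are relative and need not sum to 1.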
Version 2024-08-20