Open charishma13 opened 2 weeks ago
This feature does not yet exist, but it would certainly be nice to add it or simplify existing alternatives. The current strategy is basically to initialise the state manually. Alternatively you could run a search for 1 iteration, and then manipulate the saved state to specify individual members of the population. On the PySR discussions page there are some threads about this too.
Thank you for the suggestion, @MilesCranmer. I will check the documentation and make the corresponding changes. I would also like to know which Julia file performs the actual initialization of the population for every PySR iteration.
The initialisation function is here: https://github.com/MilesCranmer/SymbolicRegression.jl/blob/master/src/Population.jl#L36-L62
which gets called here: https://github.com/MilesCranmer/SymbolicRegression.jl/blob/cd23a6e25c64d00565c3ae3905d06dc3c63033ed/src/SymbolicRegression.jl#L775
I am having trouble creating a custom saved_state. The saved_state is a tuple consisting of a population and a hall of fame object, and I am building custom versions of both. So far I have created the PopMember instances by following the discussion at https://github.com/MilesCranmer/PySR/discussions/443. I am now trying to build a population from these PopMember instances by calling the Population struct directly, but I am not sure this approach works as intended. The code below fails on the line marked with >>.
using .SymbolicRegression: Node, Options, equation_search, Dataset, PopMember, HallOfFame, Population
using CSV
using DataFrames
val = Node{Float64}(val=162.0)
xsi = Node{Float64}(val=1.224f0)
options = Options(binary_operators=[+, -, *, /])
csv_file_path = "water_water.csv"
data = CSV.File(csv_file_path) |> DataFrame
X1 = reshape(data."Angle", 1, :)
X2 = reshape(data."OH1", 1, :)
X3 = reshape(data."OH2", 1, :)
X4 = reshape(data."H1H2", 1, :)
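# Stack the 1 × n rows vertically so that X is 4 × n (features × samples).
# [X1 X2 X3 X4] followed by reshape(X, 4, :) would interleave the features,
# because reshape fills the new matrix in column-major order.
X = [X1; X2; X3; X4]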
y = data."Energy"
# Assuming y is your target variable
y_min = minimum(y)
y_scaled = (y .- y_min) * 2625.5002
dataset = Dataset(X, y_scaled)
# Format to PopMember:
member = PopMember(dataset, val, options; deterministic=false)
member1 = PopMember(dataset, xsi, options; deterministic=false)
>> population = Population{Float32, Float64, Node{Float32}}([member, member1], 2)
ERROR
ERROR: LoadError: TypeError: in Population, in L, expected L<:Real, got a value of type Float64
Stacktrace:
 [1] top-level scope
   @ ~/LU_Exp/popmembers_hof.jl:77
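For readers hitting the same message: the wording "expected L<:Real, got a value of type Float64" is the form Julia uses when a type parameter receives a value (such as `1.0`) where a type (such as `Float64`) is expected. The sketch below reproduces that class of error in plain Julia. `ToyPopulation` is a hypothetical stand-in invented for illustration and is not the SymbolicRegression.jl `Population` type, so this only shows the language behaviour and is not a diagnosis of line 77 of the script above.

```julia
# Hypothetical toy struct with bounded type parameters.
# This is NOT SymbolicRegression.jl's Population; it only mimics the
# `L<:Real` bound that appears in the error message.
struct ToyPopulation{T<:Real,L<:Real}
    members::Vector{T}
    n::Int
end

# Supplying types for both parameters constructs the object normally.
ok = ToyPopulation{Float32,Float32}(Float32[1.0, 2.0], 2)

# Supplying a value (the number 1.0) in place of the type parameter L
# raises an error as soon as the parametric type is applied.
err = try
    ToyPopulation{Float32,1.0}
catch e
    e
end

println(typeof(ok))
println(err isa TypeError)
```

If the real call behaves the same way, checking that every entry inside the curly braces is a type, and that those types agree with the element types of the members being passed in, may be a reasonable first step.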
Hello @MilesCranmer,
I have managed to populate the Population using the following code: Population{Float32, Float32, Node{Float32}}([member, member1], 2)
I would like to inquire about where the initialization begins within the SymbolicRegression.jl framework, particularly with respect to functions such as _main_search_loop, _warmup_search, _initialize_search, and _create_workers.
Our intention is to modify the process starting from the initial population, so that PySR searches for equations starting from a predefined expression. Could you clarify which function calls the Population struct and initializes it? Additionally, is it possible to adjust the complexity so that the search begins at a higher value, for instance 7 or 9, instead of the default starting point of 1 (a float value)?
I would like to know how to initialize my population with n members that have a pre-specified structure. For example, I want my initial population to have 15 members that all share the same expression, e.g. 1 + x. Is there a PySR option for this, or does it require a change to the code? Thank you.