cavalab / ellyn

python-wrapped version of ellen, a linear genetic programming system for symbolic regression and classification.
http://cavalab.org/ellyn

Example of the seeds parameter #4

Closed dzubo closed 6 years ago

dzubo commented 6 years ago

Thank you for the work, it's amazing!

Could you please elaborate on the seeds parameter? Its description reads:

"Seed GP initialization with partial solutions, e.g. (x+y). Each partial solution must be enclosed in parentheses."

But when I use code like this:

learner = ellyn(g=2, popsize=200, verbosity=2, num_islands=2,
                scoring_function='r2', max_len=20, seeds='(x+y)')

or this

learner = ellyn(g=2, popsize=200, verbosity=2, num_islands=2,
                scoring_function='r2', max_len=20, seeds='((x_4)*(x_67))')

I get this error message:

Traceback (most recent call last):
  File "train_ellyn.py", line 18, in <module>
    learner.fit(X_train, y_train)
  File "/opt/notebooks/ellyn.py", line 213, in fit
    print('best model:',self.stack_2_eqn(self.best_estimator_))
  File "/opt/notebooks/ellyn.py", line 383, in stack_2_eqn
    return stack_eqn[-1]
IndexError: list index out of range

lacava commented 6 years ago

Thank you for the work, it's amazing!

Thanks!

Hmmm, I have not done much work with the seeding feature in Python. Could you post a minimal working example that I can debug to figure out what the problem is? Thanks for reporting this bug.

dzubo commented 6 years ago

Here is the example:

import numpy as np

from sklearn.datasets import make_regression
from ellyn import ellyn

X, y = make_regression(n_samples=100, n_features=10)

learner = ellyn(g=10, popsize=100, verbosity=2, seeds='(x_1+x_2)')
learner.fit(X, y)

The output:

==========
params
==========
g : 10
seeds : (x_1+x_2)
popsize : 100
verbosity : 2
classification : False
scoring_function : <function mean_squared_error at 0x7fd04a3acd08>
random_state : 0
selection : tournament
best_estimator_ : []
hof : []
return_pop : False
class_m4gp : False
sel : 1
{'g': 10, 'seeds': '(x_1+x_2)', 'popsize': 100, 'verbosity': 2, 'classification': False, 'scoring_function': <function mean_squared_error at 0x7fd04a3acd08>, 'random_state': 0, 'selection': 'tournament', 'best_estimator_': [], 'hof': [], 'return_pop': False, 'class_m4gp': False, 'sel': 1}
op2node failed.
stoffinal model(s):
Traceback (most recent call last):
  File "train_ellyn_seeds.py", line 9, in <module>
    learner.fit(X, y)
  File "/opt/notebooks/ellyn.py", line 214, in fit
    print('best model:',self.stack_2_eqn(self.best_estimator_))
  File "/opt/notebooks/ellyn.py", line 384, in stack_2_eqn
    return stack_eqn[-1]
IndexError: list index out of range
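
For readers following the traceback: the failing statement is `return stack_eqn[-1]`, and indexing an empty Python list with -1 raises exactly this IndexError. The sketch below only reproduces that behavior and shows one possible guard. The helper name `last_or_none` is hypothetical and is not part of ellyn.py; it is an illustration, not the fix that was applied.

```python
def last_or_none(stack_eqn):
    """Return the last entry of stack_eqn, or None when the list is empty.

    Hypothetical guard for illustration only; not taken from ellyn.py.
    """
    return stack_eqn[-1] if stack_eqn else None


# Indexing an empty list with -1 raises IndexError, as in the traceback above.
try:
    [][-1]
    raised = False
except IndexError:
    raised = True
```

A guard like this would only hide the symptom. The empty result itself still needs an explanation, and the `op2node failed.` line in the output suggests the seed was not parsed.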

lacava commented 6 years ago

Check out https://github.com/EpistasisLab/ellyn/commit/ad1692ca7b409cd001047550089fef9fbfdedacd

The issue was that the seeds needed to be converted to a list in ellyn.py. With that change, the example ran fine for me. Let me know if this works for you.
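
As a rough illustration of what "converted to a list" can mean, the sketch below wraps a single seed string in a list and passes an existing sequence through unchanged. The helper name `normalize_seeds` is hypothetical and the actual change in the linked commit may look different; this is only an assumption about the general idea.

```python
def normalize_seeds(seeds):
    """Return seeds as a list of strings.

    Hypothetical helper for illustration; the real code in ellyn.py may differ.
    """
    if seeds is None:
        return []
    if isinstance(seeds, str):
        # A bare string such as '(x_1+x_2)' becomes a one-element list.
        return [seeds]
    # Lists, tuples and other iterables are copied into a new list.
    return list(seeds)


single = normalize_seeds('(x_1+x_2)')
several = normalize_seeds(('(x_1+x_2)', '(x_3)'))
```

The string check matters because calling `list()` directly on a string would split it into individual characters instead of keeping the seed expression whole.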

lacava commented 6 years ago

Closing this based on the example fix.