trevorstephens / gplearn

Genetic Programming in Python, with a scikit-learn inspired API
http://gplearn.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

const_range error #283

Closed geli666 closed 1 year ago

geli666 commented 1 year ago

Describe the bug

If I set the const_range parameter to (-1.0, 1.0) and then call mdl.fit(X_train, y_train.values), an error appears (AttributeError: 'list' object has no attribute 'shape'):

"""
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 436, in _process_worker
    r = call_item()
  File "/opt/anaconda3/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 288, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/opt/anaconda3/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 595, in __call__
    return self.func(*args, **kwargs)
  File "/opt/anaconda3/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
    return [func(*args, **kwargs)
  File "/opt/anaconda3/lib/python3.8/site-packages/joblib/parallel.py", line 262, in <listcomp>
    return [func(*args, **kwargs)
  File "/data/userdata/gp_feature_engineering-master/gplearn/genetic.py", line 147, in _parallel_evolve
    program.raw_fitness_ = program.raw_fitness(X, y, curr_sample_weight)
  File "/data/userdata/gp_feature_engineering-master/gplearn/_program.py", line 465, in raw_fitness
    y_pred = self.execute(X)
  File "/data/userdata/gp_feature_engineering-master/gplearn/_program.py", line 362, in execute
    return np.repeat(node*np.ones(X.shape))
AttributeError: 'list' object has no attribute 'shape'
"""

If I set const_range to None, it runs normally. What is the reason for this error?

X_train is a list with a shape of (5, 1461, 1942).
y_train is a DataFrame with a shape of (1461, 1942).

If I add X_train = np.array(X_train), a new error appears: TypeError: _repeat_dispatcher() missing 1 required positional argument: 'repeats'.

X_train[0] output:

array([[ nan, 366.74, nan, ..., nan, nan, nan],
       [ nan, 308.82, nan, ..., nan, nan, nan],
       [ nan, 321.94, nan, ..., nan, nan, nan],
       ...,
       [ nan, nan, nan, ..., nan, nan, nan],
       [ nan, nan, nan, ..., nan, nan, nan],
       [ nan, nan, nan, ..., nan, nan, nan]])

Expected behavior

Actual behavior

Steps to reproduce the behavior

System information

Linux-5.14.15-1.el8.elrepo.x86_64-x86_64-with-glibc2.10
Python 3.8.3 (default, Jul 2 2020, 16:21:59) [GCC 7.3.0]
NumPy 1.19.3
SciPy 1.6.3
Scikit-Learn 0.24.2
Joblib 1.1.0
gplearn 0.4.1
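
Both error messages in the report can be reproduced with NumPy alone, which may help to separate them from gplearn itself. The snippet below is a minimal sketch: the array sizes are small hypothetical stand-ins for the reporter's data, and the failing expression is copied from the last frame of the traceback (the reporter's own copy of `_program.py`).

```python
import numpy as np

# Hypothetical stand-in for the reported data: a list of 5 two-dimensional arrays
X_list = [np.zeros((4, 3)) for _ in range(5)]

# 1) A plain Python list has no .shape attribute, hence the AttributeError
try:
    X_list.shape
except AttributeError as err:
    first_error = err
    print(type(err).__name__, "-", err)

# 2) After converting to an array, .shape works, but np.repeat still
#    requires a second argument (`repeats`), hence the TypeError
X_arr = np.array(X_list)
try:
    np.repeat(0.5 * np.ones(X_arr.shape))
except TypeError as err:
    second_error = err
    print(type(err).__name__, "-", err)
```

The exact wording of the second message can differ between NumPy versions, but in each case it is a `TypeError` caused by the missing `repeats` argument.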

trevorstephens commented 1 year ago

What data type is your X_train? Can you recreate the issue with something small and self contained?

geli666 commented 1 year ago

> What data type is your X_train? Can you recreate the issue with something small and self contained?

Thank you for your help! I have recreated the issue.

trevorstephens commented 1 year ago

So your X is just a list of arrays? That won't work. You need y to be a one-dimensional array and X to be two-dimensional.
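
One way to get data of this kind into the shape described above is to treat every cell of the 2-D arrays as one sample. The following is only a sketch under that assumption: the sizes are small hypothetical stand-ins, the names `X_fit` and `y_fit` are illustrative, and dropping samples that contain NaN is one possible choice, not something the thread prescribes.

```python
import numpy as np

# Hypothetical stand-in: 5 feature arrays and a target, all of shape (n_rows, n_cols)
n_features, n_rows, n_cols = 5, 4, 3
rng = np.random.default_rng(0)
X_list = [rng.normal(size=(n_rows, n_cols)) for _ in range(n_features)]
y_2d = rng.normal(size=(n_rows, n_cols))
X_list[0][0, 0] = np.nan  # mimic the missing values seen in the report

# Stack to (n_features, n_rows, n_cols), then flatten each feature so that
# every (row, col) cell becomes one sample with n_features columns
X_3d = np.stack(X_list)
X_2d = X_3d.reshape(n_features, -1).T  # shape (n_rows * n_cols, n_features)
y_1d = y_2d.ravel()                    # shape (n_rows * n_cols,)

# Assumption: samples with any NaN in X or y are simply dropped
mask = ~np.isnan(X_2d).any(axis=1) & ~np.isnan(y_1d)
X_fit, y_fit = X_2d[mask], y_1d[mask]

print(X_fit.shape, y_fit.shape)
```

`X_fit` is then two-dimensional and `y_fit` one-dimensional, which is the layout the estimator's `fit` method expects.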

geli666 commented 1 year ago

Thank you for your help! I redefined the operators so that the model could accept 3-dimensional data by adding a loop. So, for the first error I ran into, could I change the data type handling in gplearn/_program.py directly?

trevorstephens commented 1 year ago

It's not something that's supported out of the box, so yes, you would have to change the source code.
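
For a modified copy that works on 3-D input, the constant-only case from the traceback could be handled along the following lines. This is a sketch, not gplearn code: `constant_output` is a hypothetical helper, and it assumes X is stacked as (n_features, n_rows, n_cols) with a target of shape (n_rows, n_cols), as described in the report.

```python
import numpy as np

def constant_output(node, X):
    """Hypothetical helper: output of a program that consists of a single constant.

    Assumes X has shape (n_features, n_rows, n_cols), so the prediction
    must have shape (n_rows, n_cols) to match the target.
    """
    X = np.asarray(X)  # a plain list has no .shape, so convert first
    return np.full(X.shape[1:], node, dtype=float)

# Usage with a list of 5 arrays, as in the report (sizes are stand-ins)
X_list = [np.zeros((4, 3)) for _ in range(5)]
out = constant_output(0.5, X_list)
print(out.shape)
```

`np.full` builds the constant array directly, which avoids calling `np.repeat` without its required `repeats` argument.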

geli666 commented 1 year ago

> It's not something that's supported out of the box, so yes, you would have to change the source code.

Thank you for your assistance with this matter!