cjekel / piecewise_linear_fit_py

fit piecewise linear data for a specified number of line segments
MIT License
300 stars 60 forks

Issue using .fit() #94

Open aaronnkang opened 2 years ago

aaronnkang commented 2 years ago

Every time I try to use .fit(n) to fit a pwlf model with n segments, I get the following error:

The map-like callable must be of the form f(func, iterable), returning a sequence of numbers the same length as 'iterable'

Any ideas?

cjekel commented 2 years ago

I haven't seen this before. It's possible something I use from scipy has changed.

Can you post or send me a full stack trace of the error? Just so I can see which line in pwlf is triggering this issue.

cjekel commented 2 years ago

Can you tell me which versions of Python and SciPy you are using?

camjay99 commented 1 year ago

I am getting the same issue. It appears the DifferentialEvolutionSolver is generating infs/NaNs that cause the error. I am running Python 3.9.13 and SciPy 1.9.1.


ValueError                                Traceback (most recent call last)
File ~\anaconda3\envs\gee\lib\site-packages\scipy\optimize\_differentialevolution.py:1116, in DifferentialEvolutionSolver._calculate_population_energies(self, population)
   1115 try:
-> 1116     calc_energies = list(
   1117         self._mapwrapper(self.func, parameters_pop[0:S])
   1118     )
   1119     calc_energies = np.squeeze(calc_energies)

File ~\anaconda3\envs\gee\lib\site-packages\scipy\_lib\_util.py:407, in _FunctionWrapper.__call__(self, x)
    406 def __call__(self, x):
--> 407     return self.f(x, *self.args)

File ~\anaconda3\envs\gee\lib\site-packages\pwlf\pwlf.py:590, in PiecewiseLinFit.fit_with_breaks_opt(self, var)
    588 try:
    589     # least squares solver
--> 590     ssr = self.lstsq(A, calc_slopes=False)
    592 except linalg.LinAlgError:
    593     # the computation could not converge!
    594     # on an error, return ssr = np.inf
    595     # You might have a singular Matrix!!!

File ~\anaconda3\envs\gee\lib\site-packages\pwlf\pwlf.py:1491, in PiecewiseLinFit.lstsq(self, A, calc_slopes)
   1490 if self.weights is None:
-> 1491     beta, ssr, _, _ = linalg.lstsq(A, self.y_data,
   1492                                    lapack_driver=self.lapack_driver)
   1493     # ssr is only calculated if self.n_data > self.n_parameters
   1494     # in this case I'll need to calculate ssr manually
   1495     # where ssr = sum of square of residuals

File ~\anaconda3\envs\gee\lib\site-packages\scipy\linalg\_basic.py:1135, in lstsq(a, b, cond, overwrite_a, overwrite_b, check_finite, lapack_driver)
   1134 a1 = _asarray_validated(a, check_finite=check_finite)
-> 1135 b1 = _asarray_validated(b, check_finite=check_finite)
   1136 if len(a1.shape) != 2:

File ~\anaconda3\envs\gee\lib\site-packages\scipy\_lib\_util.py:287, in _asarray_validated(a, check_finite, sparse_ok, objects_ok, mask_ok, as_inexact)
    286 toarray = np.asarray_chkfinite if check_finite else np.asarray
--> 287 a = toarray(a)
    288 if not objects_ok:

File ~\anaconda3\envs\gee\lib\site-packages\numpy\lib\function_base.py:627, in asarray_chkfinite(a, dtype, order)
    626 if a.dtype.char in typecodes['AllFloat'] and not np.isfinite(a).all():
--> 627     raise ValueError(
    628         "array must not contain infs or NaNs")
    629 return a

ValueError: array must not contain infs or NaNs

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
Cell In [31], line 1
----> 1 evi = process_point(all_data, 54, 1)

Cell In [25], line 30, in process_point(dataset, latitude, longitude)
     27 print(df)
     29 # Fits a piecewise linear regression to data
---> 30 model = fit_pwlf(df, break_guesses)
     32 # Calculate anomaly
     33 df['EVI_pred'] = model.predict(df['doy_nrml'])

Cell In [27], line 60, in fit_pwlf(df, break_guesses)
     58 def fit_pwlf(df, break_guesses):
     59     model = pwlf.PiecewiseLinFit(df['doy_nrml'], df['EVI_nrml'])
---> 60     model.fit(8)#fit_guess(break_guesses)
     61     return model

File ~\anaconda3\envs\gee\lib\site-packages\pwlf\pwlf.py:771, in PiecewiseLinFit.fit(self, n_segments, x_c, y_c, bounds, **kwargs)
    769 # run the optimization
    770 if len(kwargs) == 0:
--> 771     res = differential_evolution(min_function, bounds,
    772                                  strategy='best1bin', maxiter=1000,
    773                                  popsize=50, tol=1e-3,
    774                                  mutation=(0.5, 1), recombination=0.7,
    775                                  seed=None, callback=None, disp=False,
    776                                  polish=True, init='latinhypercube',
    777                                  atol=1e-4)
    778 else:
    779     res = differential_evolution(min_function,
    780                                  bounds, **kwargs)

File ~\anaconda3\envs\gee\lib\site-packages\scipy\optimize\_differentialevolution.py:392, in differential_evolution(func, bounds, args, strategy, maxiter, popsize, tol, mutation, recombination, seed, callback, disp, polish, init, atol, updating, workers, constraints, x0, integrality, vectorized)
    375 # using a context manager means that any created Pool objects are
    376 # cleared up.
    377 with DifferentialEvolutionSolver(func, bounds, args=args,
    378                                  strategy=strategy,
    379                                  maxiter=maxiter,
   (...)
    390                                  integrality=integrality,
    391                                  vectorized=vectorized) as solver:
--> 392     ret = solver.solve()
    394 return ret

File ~\anaconda3\envs\gee\lib\site-packages\scipy\optimize\_differentialevolution.py:984, in DifferentialEvolutionSolver.solve(self)
    979 self.feasible, self.constraint_violation = (
    980     self._calculate_population_feasibilities(self.population))
    982 # only work out population energies for feasible solutions
    983 self.population_energies[self.feasible] = (
--> 984     self._calculate_population_energies(
    985         self.population[self.feasible]))
    987 self._promote_lowest_energy()
    989 # do the optimization.

File ~\anaconda3\envs\gee\lib\site-packages\scipy\optimize\_differentialevolution.py:1123, in DifferentialEvolutionSolver._calculate_population_energies(self, population)
   1119     calc_energies = np.squeeze(calc_energies)
   1120 except (TypeError, ValueError) as e:
   1121     # wrong number of arguments for _mapwrapper
   1122     # or wrong length returned from the mapper
-> 1123     raise RuntimeError(
   1124         "The map-like callable must be of the form f(func, iterable), "
   1125         "returning a sequence of numbers the same length as 'iterable'"
   1126     ) from e
   1128 if calc_energies.size != S:
   1129     if self.vectorized:

RuntimeError: The map-like callable must be of the form f(func, iterable), returning a sequence of numbers the same length as 'iterable'
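
For anyone reading the traceback: the innermost failure is the finite check inside scipy.linalg.lstsq, and the RuntimeError about the "map-like callable" is only how differential_evolution re-raises it. A minimal sketch that reproduces just the inner ValueError, assuming NumPy and SciPy are installed (the arrays are made-up illustration data, not the data from this report):

```python
import numpy as np
from scipy import linalg

# Small design matrix for a straight line: a column of ones and x = 0..3.
A = np.column_stack([np.ones(4), np.arange(4.0)])

# Made-up response values with one NaN, standing in for bad preprocessing output.
y = np.array([0.0, 1.0, np.nan, 3.0])

message = None
try:
    # check_finite defaults to True, so the NaN in y is rejected up front.
    linalg.lstsq(A, y)
except ValueError as exc:
    message = str(exc)

print(message)
```

If this prints the same "infs or NaNs" message, the input data is the likely culprit rather than the optimizer.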

cjekel commented 1 year ago

Thanks for the full traceback. I think I know what the issue is: it's this block of code in the lstsq fit.

        try:
            ssr = self.lstsq(A)

        except linalg.LinAlgError:
            # the computation could not converge!
            # on an error, return ssr = np.inf
            # You might have a singular Matrix!!!
            ssr = np.inf
        if ssr is None:
            ssr = np.inf
            # something went wrong...
        self.ssr = ssr

The new version of differential evolution (DE) seems unable to handle np.inf. I have not been following how DE has been changing in scipy, and it has gone through several changes over the years. If np.inf used to be supported, this might be a bug in scipy's DE.

As a hotfix, can you try using a very very big number instead of np.inf?
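
A rough sketch of what that hotfix could look like, written as a standalone function rather than a patch to pwlf. The names `penalized_ssr` and `BIG_PENALTY` and the value 1e100 are assumptions made for illustration; none of them come from pwlf.

```python
import numpy as np
from scipy import linalg

# Hypothetical stand-in for np.inf: very large, but still a finite float.
BIG_PENALTY = 1e100


def penalized_ssr(A, y, penalty=BIG_PENALTY):
    """Return the sum of squared residuals, or a finite penalty on failure."""
    try:
        beta, _, _, _ = linalg.lstsq(A, y)
    except linalg.LinAlgError:
        # The solver did not converge (for example, a singular matrix).
        return penalty
    residuals = y - A @ beta
    ssr = float(residuals @ residuals)
    # Guard against overflow producing inf or NaN in the result itself.
    return ssr if np.isfinite(ssr) else penalty
```

Note that this only covers LinAlgError. A NaN in the data still raises ValueError from the finite check, as in the traceback above, so it would not change that failure.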

camjay99 commented 1 year ago

Sorry for the false alarm. After further exploration, I discovered that my preprocessing was generating a NaN, which caused the issue. The code appears to work fine as is.
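
For others who land here with the same error, one way to rule out this cause is to check the inputs for non-finite values before fitting. A minimal sketch using only NumPy; `drop_non_finite` is a hypothetical helper, not part of pwlf, and the sample arrays are made up:

```python
import numpy as np


def drop_non_finite(x, y):
    """Return x and y with any NaN/inf pairs removed, plus the number dropped."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Keep a point only if both coordinates are finite.
    mask = np.isfinite(x) & np.isfinite(y)
    return x[mask], y[mask], int((~mask).sum())


# Made-up data with one NaN and one inf in y.
x_raw = [0.0, 1.0, 2.0, 3.0]
y_raw = [0.5, np.nan, 2.5, np.inf]

x_clean, y_clean, n_dropped = drop_non_finite(x_raw, y_raw)
print(n_dropped)
```

The cleaned arrays can then be passed to pwlf.PiecewiseLinFit in place of the raw columns. A nonzero count is a hint to look at the preprocessing step first.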