predict-idlab / powershap

A power-full Shapley feature selection method.

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (339,) + inhomogeneous part #46

Open jeremyhermann opened 6 months ago

jeremyhermann commented 6 months ago

Do you know why I'd get this error when running PowerShap?

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[95], line 16
      5 # X, y = ...  # your classification dataset
      6
      7 # selector = PowerShap(
      8 #     model=LGBMRegressor(n_estimators=1000, verbose=1)
      9 # )
     11 selector = PowerShap(
     12     model=CatBoostRegressor(n_estimators=250, verbose=0, use_best_model=True),
     13     power_iterations=2
     14 )
---> 16 selector.fit(X, y)  # Fit the PowerShap feature selector

File ~/mambaforge/envs/delphina-beta-env/lib/python3.10/site-packages/powershap/powershap.py:392, in PowerShap.fit(self, X, y, stratify, groups, **kwargs)
    375     self._print(
    376         "Automatic mode enabled: Finding the minimal required powershap",
    377         f"iterations for significance of {self.power_alpha}.",
    378     )
    380 shaps_df = self._explainer.explain(
    381     X=X,
    382     y=y,
   (...)
    389     **kwargs,
    390 )
--> 392 processed_shaps_df = powerSHAP_statistical_analysis(
    393     shaps_df, self.power_alpha, self.power_req_iterations, include_all=self.include_all
    394 )
    396 if self.automatic:
    398     processed_shaps_df = self._automatic_fit(
    399         X=X,
    400         y=y,
   (...)
    406         **kwargs,
    407     )

File ~/mambaforge/envs/delphina-beta-env/lib/python3.10/site-packages/powershap/utils.py:72, in powerSHAP_statistical_analysis(shaps_df, power_alpha, power_req_iterations, include_all)
     62         effect_size.append(0)
     63         power_list.append(0)
     65 processed_shaps_df = pd.DataFrame(
     66     data=np.hstack(
     67         [
     68             np.reshape(shaps_df.mean().values, (-1, 1)),
     69             np.reshape(np.array(p_values), (len(p_values), 1)),
     70             np.reshape(np.array(effect_size), (len(effect_size), 1)),
     71             np.reshape(np.array(power_list), (len(power_list), 1)),
---> 72             np.reshape(np.array(required_iterations), (len(required_iterations), 1)),
     73         ]
     74     ),
     75     columns=[
     76         "impact",
     77         "p_value",
     78         "effect_size",
     79         "power_" + str(power_alpha) + "_alpha",
     80         str(power_req_iterations) + "_power_its_req",
     81     ],
     82     index=shaps_df.mean().index,
     83 )
     84 processed_shaps_df = processed_shaps_df.reindex(
     85     processed_shaps_df.impact.abs().sort_values(ascending=False).index
     86 )
     88 return processed_shaps_df

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (339,) + inhomogeneous part.
jvdd commented 6 months ago

Hi @jeremyhermann,

Thanks for submitting this issue. If you can provide a (minimal) reproducible example, I'll gladly further analyze why this error occurs.

mrzdev commented 6 months ago

Hi,

Chiming in: I observed the same issue a while ago when using PowerShap with Optuna.

I tracked it down to the TTestPower().solve_power call in utils.py, which issues the warning "site-packages/statsmodels/stats/power.py:525: ConvergenceWarning: Failed to converge on a solution." and returns a list instead of a scalar here:

https://github.com/predict-idlab/powershap/blob/4a60fbe79d67d311693e4c9a8616f81d652f2bb4/powershap/utils.py?plain=1#L51

In that case, after the result is appended to the required_iterations list, the list has an inhomogeneous shape, which leads to an error when required_iterations is converted to a NumPy array here: https://github.com/predict-idlab/powershap/blob/4a60fbe79d67d311693e4c9a8616f81d652f2bb4/powershap/utils.py?plain=1#L72

Logging the solve_power return value in each iteration:

[10.]
2.4867261850681004
2.6408546509597652
...

Here's a repro from a trial that errored out for me. One possible fix is to unpack the first value regardless of whether the result is a scalar or a list, i.e. np.asarray(solved_power).flatten()[0], before appending. Let me know what you think.
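
The unpacking idea above can be tried in isolation. This is only a minimal sketch, not the library's code: the helper name first_scalar and the raw_returns list are hypothetical, and the values are copied from the logged output above, with the first entry assumed to be a one-element NumPy array.

```python
import numpy as np


def first_scalar(value):
    """Return the first element of `value` as a float.

    Hypothetical helper: accepts a plain scalar, a list, or an array.
    """
    return float(np.asarray(value).flatten()[0])


# Values shaped like the logged returns: one 1-element array, then plain floats.
raw_returns = [np.array([10.0]), 2.4867261850681004, 2.6408546509597652]

# Normalise every entry before it is appended to the list.
required_iterations = [first_scalar(v) for v in raw_returns]

# The same reshape used in the traceback now receives a homogeneous list.
column = np.reshape(np.array(required_iterations), (len(required_iterations), 1))
print(column.shape)  # (3, 1)
```

Because every entry is converted to a float before it is collected, the later np.array call no longer sees a mix of scalars and sequences.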

jonasvdd commented 5 months ago

@jvdd @JarneVerhaeghe - this appears highly similar to the error that I obtained when updating the dependencies.

hentt30 commented 4 months ago

I'm having the same error since upgrading to NumPy 1.24.1. Looks like it was caused by a breaking change in NumPy.

https://numpy.org/neps/nep-0034-infer-dtype-is-object.html
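
The NumPy behaviour described in that NEP can be reproduced without PowerShap. This is a small sketch under the assumption that the list mixes a one-element array with plain floats, as in the logged output earlier in this thread; the variable names are illustrative only.

```python
import numpy as np

# A ragged sequence: one 1-element array followed by plain floats.
ragged = [np.array([10.0]), 2.4867261850681004, 2.6408546509597652]

try:
    # Recent NumPy releases (1.24 and later) refuse to guess dtype=object here.
    np.array(ragged)
    print("implicit conversion succeeded (older NumPy)")
except ValueError as exc:
    print("ValueError:", exc)

# Requesting an object array explicitly is still accepted.
explicit = np.array(ragged, dtype=object)
print(explicit.shape, explicit.dtype)
```

An object array would only hide the mismatch, so normalising each value to a scalar before building the array is probably the cleaner route for this issue.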

ppawlo97 commented 1 month ago

I'm still experiencing the same error. Would it be possible to adjust the code to make it compatible with later versions of NumPy?