amphibian-dev / toad

ESC Team's credit scorecard tools.
https://toad.readthedocs.io
MIT License

There is a problem about toad.quality #120

Open ivring2 opened 1 year ago

ivring2 commented 1 year ago

My code:

import toad
iv_info = toad.quality(df_train,'isDefault', iv_only=True)

I would be grateful if you could help me solve this problem. The full traceback is below.


_RemoteTraceback                          Traceback (most recent call last)
_RemoteTraceback:
"""
Traceback (most recent call last):
  File "/data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 428, in _process_worker
    r = call_item()
  File "/data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 275, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 620, in __call__
    return self.func(*args, **kwargs)
  File "/data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/joblib/parallel.py", line 288, in __call__
    return [func(*args, **kwargs)
  File "/data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/joblib/parallel.py", line 288, in <listcomp>
    return [func(*args, **kwargs)
  File "/data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/toad/stats.py", line 325, in column_quality
    res[func.name] = func(bin_feature, target)
  File "/data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/toad/utils/decorator.py", line 46, in __call__
    return self.wrapper(*args, **kwargs)
  File "/data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/toad/stats.py", line 278, in wrapper
    return self.call(*args, **kwargs)
  File "/data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/toad/utils/decorator.py", line 75, in call
    return self.fn(*args, **kwargs)
  File "/data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/toad/utils/decorator.py", line 46, in __call__
    return self.wrapper(*args, **kwargs)
  File "/data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/toad/utils/decorator.py", line 140, in wrapper
    return self.call(frame, *args, **kwargs)
  File "/data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/toad/utils/decorator.py", line 75, in call
    return self.fn(*args, **kwargs)
  File "/data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/toad/stats.py", line 215, in IV
    iv, sub = _IV(feature, target)
  File "/data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/toad/stats.py", line 191, in _IV
    y_prob, n_prob = probability(target, mask = (feature == v))
  File "/data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/toad/stats.py", line 147, in probability
    counts_0 = np_count(target, 0, default = 1)
  File "/data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/toad/utils/func.py", line 57, in np_count
    c = (arr == value).sum()
AttributeError: 'bool' object has no attribute 'sum'
"""

The above exception was the direct cause of the following exception:

AttributeError                            Traceback (most recent call last)
Cell In[198], line 2
      1 import toad
----> 2 iv_info = toad.quality(df_train,'isDefault', iv_only=True)

File /data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/toad/stats.py:399, in quality(dataframe, target, cpu_cores, iv_only, indicators, **kwargs)
    389 for name, series in frame.iteritems():
    390     jobs.append(delayed(column_quality)(
    391         series,
    392         target,
   (...)
    396         **kwargs
    397     ))
--> 399 rows = pool(jobs)
    402 return pd.DataFrame(rows).sort_values(
    403     by = indicators[0].name,
    404     ascending = False,
    405 )

File /data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/joblib/parallel.py:1098, in Parallel.__call__(self, iterable)
   1095     self._iterating = False
   1097 with self._backend.retrieval_context():
-> 1098     self.retrieve()
   1099 # Make sure that we get a last message telling us we are done
   1100 elapsed_time = time.time() - self._start_time

File /data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/joblib/parallel.py:975, in Parallel.retrieve(self)
    973 try:
    974     if getattr(self._backend, 'supports_timeout', False):
--> 975         self._output.extend(job.get(timeout=self.timeout))
    976     else:
    977         self._output.extend(job.get())

File /data/anaconda3/envs/bxfqz/lib/python3.8/site-packages/joblib/_parallel_backends.py:567, in LokyBackend.wrap_future_result(future, timeout)
    564 """Wrapper for Future.result to implement the same behaviour as
    565 AsyncResults.get from multiprocessing."""
    566 try:
--> 567     return future.result(timeout=timeout)
    568 except CfTimeoutError as e:
    569     raise TimeoutError from e

File /data/anaconda3/envs/bxfqz/lib/python3.8/concurrent/futures/_base.py:439, in Future.result(self, timeout)
    437     raise CancelledError()
    438 elif self._state == FINISHED:
--> 439     return self.__get_result()
    440 else:
    441     raise TimeoutError()

File /data/anaconda3/envs/bxfqz/lib/python3.8/concurrent/futures/_base.py:388, in Future.__get_result(self)
    386 def __get_result(self):
    387     if self._exception:
--> 388         raise self._exception
    389     else:
    390         return self._result

AttributeError: 'bool' object has no attribute 'sum'

Secbone commented 1 year ago

@ivring2 did you call the select function before quality?

yangjlx commented 1 year ago

I ran into the same error on version 0.1.1, when calling the fit function. The error is raised in toad/utils/func.py:

    def np_count(arr, value, default = None):
        c = (arr == value).sum()
        ...

AttributeError: 'bool' object has no attribute 'sum'
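
For anyone trying to narrow this down: the message means that `arr == value` evaluated to a single Python `bool` instead of a boolean array. That happens whenever `arr` is not a NumPy array or pandas Series (a plain list, for instance), and older NumPy releases could also fall back to a scalar result with a FutureWarning when an elementwise comparison failed, for example a string array compared with an integer. The thread does not confirm which case applies here. The sketch below reproduces the error with a plain list and shows one defensive way to count matches. `np_count_safe` is a hypothetical helper, not toad's code or an official fix, and its `default` handling is an assumption because the rest of `np_count` is elided above.

```python
import numpy as np


def np_count_safe(arr, value, default=None):
    """Count entries of `arr` equal to `value` (hypothetical helper).

    Assumption: `default` is returned when nothing matches. The thread
    elides the rest of toad's np_count, so this is only a guess at intent.
    """
    # np.asarray turns lists/Series into arrays so `==` is elementwise.
    mask = np.asarray(arr) == value
    # np.count_nonzero accepts an array or a scalar bool, so it does not
    # fail the way `.sum()` does when the comparison collapses to a bool.
    count = int(np.count_nonzero(mask))
    if default is not None and count == 0:
        return default
    return count


# Reproduce the AttributeError: list == int is a single bool, not an array.
plain_list = [1, 0, 1, 0, 0]
try:
    (plain_list == 0).sum()
    failed = False
except AttributeError:
    failed = True

zeros_in_list = np_count_safe(plain_list, 0)
fallback = np_count_safe(np.array([1, 1]), 0, default=1)
```

If the defensive version returns an unexpected count for one of your columns, that column's dtype (for example, mixed strings and numbers) is a reasonable place to look next.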