PacktPublishing / Interpretable-Machine-Learning-with-Python

Interpretable Machine Learning with Python, published by Packt

Chapter 03 - Flight delays #5

Closed. Eraseri closed this issue 3 years ago.

Eraseri commented 3 years ago

In Chapter 03 I have the following problem. The scikit-learn version I have installed is 0.22.2.post1.

I printed out the model name where it stops (logistic regression):

```python
for model_name in class_models.keys():
    print(model_name)
    fitted_model = class_models[model_name]['model'].fit(X_train, y_train_class)
    y_train_pred = fitted_model.predict(X_train.values)
    if model_name == 'ridge':
        y_test_pred = fitted_model.predict(X_test.values)
    else:
        y_test_prob = fitted_model.predict_proba(X_test.values)[:,1]
        y_test_pred = np.where(y_test_prob > 0.5, 1, 0)
    class_models[model_name]['fitted'] = fitted_model
    class_models[model_name]['probs'] = y_test_prob
    class_models[model_name]['preds'] = y_test_pred
    class_models[model_name]['Accuracy_train'] = metrics.accuracy_score(y_train_class, y_train_pred)
    class_models[model_name]['Accuracy_test'] = metrics.accuracy_score(y_test_class, y_test_pred)
    class_models[model_name]['Recall_train'] = metrics.recall_score(y_train_class, y_train_pred)
    class_models[model_name]['Recall_test'] = metrics.recall_score(y_test_class, y_test_pred)
    if model_name != 'ridge':
        class_models[model_name]['ROC_AUC_test'] = metrics.roc_auc_score(y_test_class, y_test_prob)
    else:
        class_models[model_name]['ROC_AUC_test'] = 0
    class_models[model_name]['F1_test'] = metrics.f1_score(y_test_class, y_test_pred)
    class_models[model_name]['MCC_test'] = metrics.matthews_corrcoef(y_test_class, y_test_pred)
```

It prints `logistic` and then fails with the following traceback:

```
AttributeError                            Traceback (most recent call last)
<ipython-input-...> in <module>
      1 for model_name in class_models.keys():
      2     print(model_name)
----> 3     fitted_model = class_models[model_name]['model'].fit(X_train, y_train_class)
      4     y_train_pred = fitted_model.predict(X_train.values)
      5     if model_name == 'ridge':

~\miniconda3\envs\tensorflow\lib\site-packages\sklearn\linear_model\_logistic.py in fit(self, X, y, sample_weight)
   1589         else:
   1590             prefer = 'processes'
-> 1591         fold_coefs_ = Parallel(n_jobs=self.n_jobs, verbose=self.verbose,
   1592                                **_joblib_parallel_args(prefer=prefer))(
   1593             path_func(X, y, pos_class=class_, Cs=[C_],

~\miniconda3\envs\tensorflow\lib\site-packages\joblib\parallel.py in __call__(self, iterable)
   1039             # remaining jobs.
   1040             self._iterating = False
-> 1041             if self.dispatch_one_batch(iterator):
   1042                 self._iterating = self._original_iterator is not None
   1043

~\miniconda3\envs\tensorflow\lib\site-packages\joblib\parallel.py in dispatch_one_batch(self, iterator)
    857                 return False
    858             else:
--> 859                 self._dispatch(tasks)
    860                 return True
    861

~\miniconda3\envs\tensorflow\lib\site-packages\joblib\parallel.py in _dispatch(self, batch)
    775         with self._lock:
    776             job_idx = len(self._jobs)
--> 777             job = self._backend.apply_async(batch, callback=cb)
    778             # A job can complete so quickly than its callback is
    779             # called before we get here, causing self._jobs to

~\miniconda3\envs\tensorflow\lib\site-packages\joblib\_parallel_backends.py in apply_async(self, func, callback)
    206     def apply_async(self, func, callback=None):
    207         """Schedule a func to be run"""
--> 208         result = ImmediateResult(func)
    209         if callback:
    210             callback(result)

~\miniconda3\envs\tensorflow\lib\site-packages\joblib\_parallel_backends.py in __init__(self, batch)
    570         # Don't delay the application, to avoid keeping the input
    571         # arguments in memory
--> 572         self.results = batch()
    573
    574     def get(self):

~\miniconda3\envs\tensorflow\lib\site-packages\joblib\parallel.py in __call__(self)
    260         # change the default number of processes to -1
    261         with parallel_backend(self._backend, n_jobs=self._n_jobs):
--> 262             return [func(*args, **kwargs)
    263                     for func, args, kwargs in self.items]
    264

~\miniconda3\envs\tensorflow\lib\site-packages\joblib\parallel.py in <listcomp>(.0)
    260         # change the default number of processes to -1
    261         with parallel_backend(self._backend, n_jobs=self._n_jobs):
--> 262             return [func(*args, **kwargs)
    263                     for func, args, kwargs in self.items]
    264

~\miniconda3\envs\tensorflow\lib\site-packages\sklearn\linear_model\_logistic.py in _logistic_regression_path(X, y, pos_class, Cs, fit_intercept, max_iter, tol, verbose, solver, coef, class_weight, dual, penalty, intercept_scaling, multi_class, random_state, check_input, max_squared_sum, sample_weight, l1_ratio)
    936             options={"iprint": iprint, "gtol": tol, "maxiter": max_iter}
    937         )
--> 938         n_iter_i = _check_optimize_result(
    939             solver, opt_res, max_iter,
    940             extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)

~\miniconda3\envs\tensorflow\lib\site-packages\sklearn\utils\optimize.py in _check_optimize_result(solver, result, max_iter, extra_warning_msg)
    241                 " https://scikit-learn.org/stable/modules/"
    242                 "preprocessing.html"
--> 243             ).format(solver, result.status, result.message.decode("latin1"))
    244     if extra_warning_msg is not None:
    245         warning_msg += "\n" + extra_warning_msg

AttributeError: 'str' object has no attribute 'decode'
```
Eraseri commented 3 years ago

Leaving the list of installed packages here:


```
Package Version
---------------
absl-py 0.10.0
aif360 0.3.0
aiohttp 3.6.3
alibi 0.5.8
argon2-cffi 20.1.0
astunparse 1.6.3
async-generator 1.10
async-timeout 3.0.1
attrs 20.3.0
backcall 0.2.0
bleach 3.3.0
blinker 1.4
blis 0.7.4
Brotli 1.0.9
brotlipy 0.7.0
cachetools 4.1.1
catalogue 2.0.4
certifi 2021.5.30
cffi 1.14.5
chardet 3.0.4
click 7.1.2
cloudpickle 1.6.0
colorama 0.4.4
cryptography 3.1.1
cvae 0.0.3
cycler 0.10.0
cymem 2.0.5
dash 1.20.0
dash-core-components 1.16.0
dash-cytoscape 0.3.0
dash-html-components 1.1.3
dash-renderer 1.9.1
dash-table 4.11.3
deap 1.3.1
decorator 4.4.2
defusedxml 0.7.1
dill 0.3.4
entrypoints 0.3
Flask 2.0.1
Flask-Compress 1.10.1
flatbuffers 1.12
future 0.18.2
gast 0.3.3
gevent 21.1.2
google-auth 1.22.1
google-auth-oauthlib 0.4.1
google-pasta 0.2.0
greenlet 1.1.0
grpcio 1.32.0
h5py 2.10.0
idna 2.10
imageio 2.9.0
importlib-metadata 3.10.0
interpret 0.2.5
interpret-core 0.2.5
ipykernel 5.3.4
ipython 7.22.0
ipython-genutils 0.2.0
itsdangerous 2.0.1
jedi 0.17.0
Jinja2 3.0.1
joblib 1.0.1
jsonschema 3.2.0
jupyter-client 6.1.12
jupyter-core 4.7.1
jupyterlab-pygments 0.1.2
Keras-Applications 1.0.8
keras-nightly 2.5.0.dev2021032900
Keras-Preprocessing 1.1.2
kiwisolver 1.3.1
lime 0.2.0.1
llvmlite 0.36.0
machine-learning-datasets 0.1.16.4
Markdown 3.3.2
MarkupSafe 2.0.1
matplotlib 3.2.2
mistune 0.8.4
mkl-fft 1.3.0
mkl-random 1.2.1
mkl-service 2.3.0
mlxtend 0.14.0
multidict 4.7.6
multiprocess 0.70.12.2
murmurhash 1.0.5
nb-conda 2.2.1
nb-conda-kernels 2.3.1
nbclient 0.5.3
nbconvert 6.1.0
nbformat 5.1.3
nest-asyncio 1.5.1
networkx 2.5.1
notebook 6.4.0
numba 0.53.1
numpy 1.20.2
oauthlib 3.1.0
opencv-python 4.5.2.54
opt-einsum 3.3.0
packaging 20.9
pandas 1.2.5
pandocfilters 1.4.3
parso 0.8.2
pathlib2 2.3.5
pathos 0.2.8
pathy 0.6.0
patsy 0.5.1
pickleshare 0.7.5
Pillow 8.3.0
pip 21.1.2
plotly 5.1.0
pox 0.3.0
ppft 1.6.6.4
preshed 3.0.5
prometheus-client 0.11.0
prompt-toolkit 3.0.17
protobuf 3.13.0
psutil 5.8.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycebox 0.0.1
pycparser 2.20
pydantic 1.7.4
Pygments 2.9.0
PyJWT 1.7.1
pyOpenSSL 19.1.0
pyparsing 2.4.7
pyreadline 2.1
pyrsistent 0.17.3
PySocks 1.7.1
python-dateutil 2.8.1
pytz 2021.1
PyWavelets 1.1.1
pywin32 227
pywinpty 0.5.7
pyzmq 20.0.0
requests 2.24.0
requests-oauthlib 1.3.0
rsa 4.6
rulefit 0.3.1
SALib 1.4.0.2
scikit-image 0.18.2
scikit-learn 0.22.2.post1
scipy 1.6.2
seaborn 0.11.1
Send2Trash 1.5.0
setuptools 52.0.0.post20210125
shap 0.39.0
six 1.16.0
sklearn-genetic 0.3.0
skope-rules 1.0.1
slicer 0.0.7
smart-open 5.1.0
spacy 3.0.6
spacy-legacy 3.0.6
spacy-lookups-data 1.0.2
srsly 2.4.1
statsmodels 0.10.2
tenacity 7.0.0
tensorboard 2.5.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.6.0
tensorflow 2.4.1
tensorflow-docs 0.0.02d270bdf7da20b5ccd08e15812e57a174522b3cf-
tensorflow-estimator 2.4.0
tensorflow-lattice 2.0.7
termcolor 1.1.0
terminado 0.9.4
testpath 0.5.0
tf-explain 0.2.1
tf-keras-vis 0.5.5
thinc 8.0.7
threadpoolctl 2.1.0
tifffile 2021.6.14
tornado 6.1
tqdm 4.41.1
traitlets 5.0.5
treeinterpreter 0.2.3
typer 0.3.2
typing-extensions 3.7.4.3
urllib3 1.25.11
wasabi 0.8.2
wcwidth 0.2.5
webencodings 0.5.1
Werkzeug 2.0.1
wheel 0.36.2
win-inet-pton 1.1.0
wincertstore 0.2
witwidget 1.7.0
wrapt 1.12.1
xai 0.0.4
xgboost 0.90
yarl 1.6.2
yolk3k 0.9
zipp 3.4.1
zope.event 4.5.0
zope.interface 5.4.0
```

smasis001 commented 3 years ago

You have different versions of sklearn and scipy than the ones originally used to write and test the repository, and this appears to be why the code doesn't work. Often, a version update to one dependency such as scipy breaks features of another library like sklearn. In this case, the error `'str' object has no attribute 'decode'` typically comes from running an older scikit-learn (such as 0.22) against SciPy 1.6+, where the optimizer's result message is already a `str` and no longer needs decoding. You can either adapt the code so that it works with your versions or downgrade the dependencies so that they don't cause trouble.
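For illustration, here is a minimal sketch (assuming SciPy 1.6+ is installed) of the underlying change that makes the `.decode()` call in older scikit-learn versions fail:

```python
# Minimal sketch, assuming SciPy >= 1.6: the optimizer result's `message`
# attribute is a str there, while scikit-learn 0.22 still calls
# result.message.decode("latin1") on it, hence the AttributeError above.
import scipy
from scipy.optimize import minimize

res = minimize(lambda w: (w[0] - 1.0) ** 2, x0=[0.0], method="L-BFGS-B")
print(scipy.__version__, type(res.message))  # <class 'str'> on SciPy >= 1.6
```

Upgrading scikit-learn to a release that handles both `str` and `bytes` messages, or downgrading SciPy below 1.6, are the two usual ways out.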

For reference, see the following comparison between the library versions in requirements.txt on the left and the corresponding libraries you have installed on the right.

[Screenshot: Requirements-vs-Eraseri, comparing the requirements.txt versions with the installed versions]
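If it helps, a quick way to produce such a comparison yourself is sketched below; the `requirements.txt` path and the simple `name==version` pin format are assumptions about how the file is laid out:

```python
# Sketch: report packages whose installed version differs from the pin in
# requirements.txt (assumes plain `name==version` lines; Python 3.8+).
from importlib.metadata import version, PackageNotFoundError

with open("requirements.txt") as f:
    pins = [line.strip() for line in f if "==" in line]

for pin in pins:
    name, wanted = pin.split("==")
    try:
        installed = version(name)
    except PackageNotFoundError:
        installed = "not installed"
    if installed != wanted:
        print(f"{name}: required {wanted}, found {installed}")
```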

Incidentally, you also don't have beautifulsoup4 installed at all, and you have a different version of requests than the one stated in requirements.txt. No wonder the Chapter 1 code didn't work for you; therefore, I'm closing issue 3.

I kindly ask that you make sure you have the exact library versions installed before you post another issue, since I can't guarantee that the code will work with any other versions, nor can I help debug problems arising from them. You can always use the Google Colab implementations of each notebook instead.
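If you want to do that from within a notebook cell, something like the following sketch could work; it assumes requirements.txt sits in the repository root and that you are working in a fresh conda or virtualenv environment:

```python
# Sketch: reinstall the pinned dependency versions into the current environment.
# Assumes requirements.txt is in the working directory; best run in a fresh env.
import subprocess
import sys

subprocess.run(
    [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"],
    check=True,
)
```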