Closed: abdul0807 closed this issue 4 years ago
Hi Abdul: Thanks for using Auto_NLP on your data set. I have found the problem and fixed it on GitHub. You should see the message "Modified Sparse array to dense array after vectorizer since it was erroring" in the Auto_NLP code on GitHub, which means you have the modified version. To use it, you must pip install from the Git version as follows in your Jupyter Notebook or wherever you run it:
!python3 -m pip install git+https://github.com/AutoViML/Auto_ViML.git
If you have any further errors, please let me know. Thanks
@AutoViML Thanks a lot for the prompt reply. It looks like the issue still exists, and this time the error is raised at a different line number. Please check the error below for details.
I looked at the code, and it seems the issue is caused by using the LassoLars model. We could use other models such as Lasso or Ridge instead, or add an extra step to the pipeline that converts the sparse array to a dense array.
Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-11-34b02f496bd4> in <module>
4 nlp_column, train, test, target, score_type='mean_squared_error',
5 modeltype='Regression',top_num_features=50, verbose=2,
----> 6 build_model=True)
/opt/conda/lib/python3.7/site-packages/autoviml/Auto_NLP.py in Auto_NLP(nlp_column, train, test, target, score_type, modeltype, top_num_features, verbose, build_model)
1217 ##### Now AFTER TRAINING, make predictions on the given test data set!
1218 start_time = time.time()
-> 1219 pipe.fit(X,y)
1220 print('Training completed. Time taken for Auto_NLP = %0.1f minutes' %((time.time()-start_time4)/60))
1221 print('######### A U T O N L P C O M P L E T E D ###############################')
/opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py in fit(self, X, y, **fit_params)
352 self._log_message(len(self.steps) - 1)):
353 if self._final_estimator != 'passthrough':
--> 354 self._final_estimator.fit(Xt, y, **fit_params)
355 return self
356
/opt/conda/lib/python3.7/site-packages/sklearn/linear_model/_least_angle.py in fit(self, X, y, Xy)
955 returns an instance of self.
956 """
--> 957 X, y = check_X_y(X, y, y_numeric=True, multi_output=True)
958
959 alpha = getattr(self, 'alpha', 0.)
/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py in check_X_y(X, y, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, warn_on_dtype, estimator)
753 ensure_min_features=ensure_min_features,
754 warn_on_dtype=warn_on_dtype,
--> 755 estimator=estimator)
756 if multi_output:
757 y = check_array(y, 'csr', force_all_finite=True, ensure_2d=False,
/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
509 dtype=dtype, copy=copy,
510 force_all_finite=force_all_finite,
--> 511 accept_large_sparse=accept_large_sparse)
512 else:
513 # If np.array(..) gives ComplexWarning, then we convert the warning
/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py in _ensure_sparse_format(spmatrix, accept_sparse, dtype, copy, force_all_finite, accept_large_sparse)
304
305 if accept_sparse is False:
--> 306 raise TypeError('A sparse matrix was passed, but dense '
307 'data is required. Use X.toarray() to '
308 'convert to a dense numpy array.')
TypeError: A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.
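The extra densifying pipeline step suggested above could look like the following sketch. This is not the Auto_NLP code, just a minimal toy example: a `FunctionTransformer` that calls `.toarray()` on the sparse TF-IDF output before it reaches LassoLars, which only accepts dense input. The documents, target values, and `alpha` are made up for illustration.

```python
from scipy import sparse
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LassoLars
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer

# LassoLars rejects scipy sparse matrices, so insert a densifying
# step between the vectorizer and the final estimator.
to_dense = FunctionTransformer(
    lambda X: X.toarray() if sparse.issparse(X) else X
)

pipe = Pipeline([
    ('tfidf', TfidfVectorizer()),
    ('to_dense', to_dense),       # sparse -> dense before LassoLars
    ('model', LassoLars(alpha=0.1)),
])

# Toy regression data, for illustration only.
docs = ["good product", "bad product", "great value", "poor value"]
y = [4.0, 1.0, 5.0, 2.0]

pipe.fit(docs, y)                 # no TypeError with the dense step
print(pipe.predict(["good value"]))
```

The same pipeline without the `to_dense` step reproduces the `TypeError` above, since the vectorizer's CSR matrix would be passed straight into `LassoLars.fit`.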
Ok, you are right. I changed the model from LassoLars to LinearSVR. It works faster and better. It should be good to go. Test it and let me know; I will keep the issue open until you confirm it works.
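A quick sketch of why the swap works, with toy data rather than the Auto_NLP pipeline: `LinearSVR.fit` accepts scipy sparse input natively, so the vectorizer's CSR matrix can flow into it directly with no densifying step.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVR
from sklearn.pipeline import Pipeline

# LinearSVR handles sparse matrices natively, so the TF-IDF
# output can be passed to it without conversion.
pipe = Pipeline([
    ('tfidf', TfidfVectorizer()),
    ('model', LinearSVR(C=1.0, max_iter=5000)),
])

# Toy regression data, for illustration only.
docs = ["good product", "bad product", "great value", "poor value"]
y = [4.0, 1.0, 5.0, 2.0]

pipe.fit(docs, y)                 # sparse input is fine here
print(pipe.predict(["good value"]))
```

This avoids the memory cost of densifying a large TF-IDF matrix, which is why it can also run faster on text data.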
Thanks! This is working. Please close the issue.
Ok great all the best!
Hi AutoViML community,
Thank you for providing this amazing package.
I am trying my hand at a regression problem and ended up with a TypeError. Please note that the same error can be replicated in a Kaggle kernel. The code and error are shared below for your reference.
Thanks -