Closed: desilinguist closed this pull request 5 years ago
Merging #559 into master will not change coverage. The diff coverage is 100%.
@@           Coverage Diff           @@
##           master     #559   +/-   ##
=======================================
  Coverage   95.02%   95.02%
=======================================
  Files          20       20
  Lines        2992     2992
=======================================
  Hits         2843     2843
  Misses        149      149
| Impacted Files | Coverage Δ | |
|---|---|---|
| skll/data/dict_vectorizer.py | 100% <100%> (ø) | :arrow_up: |
| skll/learner.py | 95.96% <100%> (ø) | :arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4cfa233...3f0c1ad. Read the comment docs.
Changes in this PR:

- Update the `scikit-learn` version to 0.21.3 in both `requirements.txt` and `conda_requirements.txt`.
- Add a check in `DictVectorizer.__eq__()` that we are actually comparing to another `DictVectorizer` instance; we were never checking this before. This means that comparing to a number or a string actually raised an exception rather than simply evaluating as unequal. This came out in this PR since 0.21.X explicitly adds a check for whether a pipeline component is the string `'passthrough'`, which fails for our vectorizers. This instance check should be made redundant once we add type hinting to the SKLL codebase. (A minimal sketch of such a check follows after this list.)
- […] `DictVectorizer.__eq__()`.
- Remove the `non_negative` keyword argument, which has been removed in 0.21.X (see the note after this list).
- Remove the train-score entries (`splitX_train_score`, `mean_train_score`, etc.) from grid search CV results. Starting with 0.21, `GridSearchCV` no longer returns train scores by default unless `return_train_score` is `True` when calling `GridSearchCV`. I considered adding this parameter to our `GridSearchCV` call in `Learner.train()` but decided against it since it would make things slower and we shouldn't really need to look at the train-split scores, only the test-split ones. (A short example of this behavior follows after this list.)
- Update the parameters used with `SGDRegressor` as the base estimator for `AdaBoostRegressor`, since the old parameters aren't appropriate anymore (a toy example follows after this list).
- […] `SGDClassifier`, `SGDRegressor`, and `SVC`.
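
For the `DictVectorizer.__eq__()` item above, here is a minimal sketch of the kind of instance check being described. It is illustrative only: the subclass name and the attributes compared (`vocabulary_`, `feature_names_`) are assumptions, not SKLL's actual implementation.

```python
from sklearn.feature_extraction import DictVectorizer


class CheckedDictVectorizer(DictVectorizer):
    """Illustrative subclass: __eq__ guards against non-DictVectorizer operands."""

    def __eq__(self, other):
        # Return NotImplemented for anything that is not a DictVectorizer, so
        # that a comparison against, say, the string 'passthrough' (which the
        # 0.21 pipeline checks perform) simply evaluates as unequal instead of
        # raising when we touch attributes the other object does not have.
        if not isinstance(other, DictVectorizer):
            return NotImplemented
        return (self.vocabulary_ == other.vocabulary_
                and self.feature_names_ == other.feature_names_)


vec = CheckedDictVectorizer()
vec.fit([{"a": 1.0, "b": 2.0}])
print(vec == "passthrough")  # False rather than an exception
print(vec == vec)            # True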
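
On the `non_negative` item: assuming this refers to the `non_negative` keyword of `sklearn.feature_extraction.FeatureHasher` (removed in 0.21.X), the fix is simply to stop passing it; the `n_features` value below is just an illustrative choice.

```python
from sklearn.feature_extraction import FeatureHasher

# Pre-0.21 code could pass non_negative, e.g. FeatureHasher(n_features=1024,
# non_negative=True); on 0.21.X that raises TypeError because the keyword is gone.
hasher = FeatureHasher(n_features=1024)
features = hasher.transform([{"word": 3, "another": 1}])
print(features.shape)  # (1, 1024)
```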
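
On the grid search item: a short, self-contained example of the `return_train_score` behavior described above. The toy estimator, data, and parameter grid are illustrative and not what SKLL uses in `Learner.train()`.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, random_state=42)
param_grid = {"C": [0.1, 1.0, 10.0]}

# Since 0.21, return_train_score defaults to False, so cv_results_ only
# contains test-split scores (mean_test_score, split0_test_score, ...).
search = GridSearchCV(LogisticRegression(solver="liblinear"), param_grid, cv=3)
search.fit(X, y)
print(any(key.endswith("train_score") for key in search.cv_results_))  # False

# Opting back in works but costs extra time, since every fold is also
# scored on its training split.
search_with_train = GridSearchCV(LogisticRegression(solver="liblinear"),
                                 param_grid, cv=3, return_train_score=True)
search_with_train.fit(X, y)
print("mean_train_score" in search_with_train.cv_results_)  # True
```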
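
Finally, for the `AdaBoostRegressor` item: a toy example of wiring `SGDRegressor` in as the base estimator under 0.21-era defaults. The hyperparameter values shown are placeholders, not the parameters this PR actually settled on.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.linear_model import SGDRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=42)

# Placeholder settings: SGDRegressor's defaults (e.g. max_iter, tol) changed
# around 0.21, which is why parameter choices tuned for older releases may no
# longer be suitable. `base_estimator` is the 0.21-era parameter name (it was
# renamed to `estimator` in much later scikit-learn releases).
base = SGDRegressor(max_iter=1000, tol=1e-3, random_state=42)
model = AdaBoostRegressor(base_estimator=base, n_estimators=10, random_state=42)
model.fit(X, y)
print(len(model.estimators_))  # number of boosting stages actually fitted
```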