As a follow-up to #701, I suggest that:
- We replace the notebook currently named "Beyond linear separation in classification" with a new notebook named "Non-linear feature engineering for Logistic Regression".
- In this notebook we reuse the same 2D synthetic moons and Gaussian quantiles datasets.
- We start with a logistic regression and show that it underfits (see the sketch right below).
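Something along these lines for the opening cells, assuming `make_moons` and `make_gaussian_quantiles` as the dataset generators (sample sizes, noise level and seeds are only illustrative):

```python
from sklearn.datasets import make_gaussian_quantiles, make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Two 2D synthetic datasets; sizes, noise and seeds are illustrative.
X_moons, y_moons = make_moons(n_samples=500, noise=0.13, random_state=42)
X_gauss, y_gauss = make_gaussian_quantiles(
    n_samples=500, n_features=2, n_classes=2, random_state=42
)

# A plain logistic regression underfits both datasets: its linear decision
# boundary cannot separate interleaved moons or concentric classes.
for name, (X, y) in {
    "moons": (X_moons, y_moons),
    "gaussian quantiles": (X_gauss, y_gauss),
}.items():
    scores = cross_val_score(LogisticRegression(), X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```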
- Then we build more and more complex pipelines with different preprocessors (sketched below):
  - `KBinsDiscretizer`
  - `SplineTransformer`
- We observe that those transformers apply axis-aligned non-linear transformations, which lead to axis-aligned classification decision boundaries.
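For example (the `n_bins`, `degree` and `n_knots` values are placeholders to tune in the actual notebook):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import KBinsDiscretizer, SplineTransformer

# Each preprocessor expands each input feature independently (per axis),
# so the decision boundary is built from axis-aligned pieces.
binned_lr = make_pipeline(
    KBinsDiscretizer(n_bins=8, encode="onehot"),  # n_bins is illustrative
    LogisticRegression(),
)
spline_lr = make_pipeline(
    SplineTransformer(degree=3, n_knots=5),  # defaults, shown for clarity
    LogisticRegression(),
)
```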
- We explore modeling multiplicative interactions between the derived features with:
  - `KBinsDiscretizer` with sparse output followed by `PolynomialFeatures(degree=2, interaction_only=True)`
  - `SplineTransformer` followed by `Nystroem` (either with `kernel="rbf"` and a good value of `gamma`, or with `kernel="poly"` and `degree=2`), as in the sketch below
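A possible sketch of both variants (here too, `gamma`, `n_components` and the other hyperparameter values are guesses to be tuned):

```python
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import KBinsDiscretizer, PolynomialFeatures, SplineTransformer

# Variant 1: one-hot bins, then products of pairs of bin indicators;
# the sparse one-hot output keeps the pairwise expansion tractable.
binned_interactions_lr = make_pipeline(
    KBinsDiscretizer(n_bins=8, encode="onehot"),
    PolynomialFeatures(degree=2, interaction_only=True, include_bias=False),
    LogisticRegression(),
)

# Variant 2: splines, then an approximate kernel expansion that combines
# the spline features across the two axes.
spline_kernel_lr = make_pipeline(
    SplineTransformer(degree=3, n_knots=5),
    Nystroem(kernel="rbf", gamma=1.0, n_components=100),
    LogisticRegression(),
)
```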
- Then we add a new exercise with:
  - the half moons dataset only
  - `SVC(kernel="linear")` (this should give similar underfitting results as the logistic regression from the previous notebook)
  - `make_pipeline(Nystroem(kernel="rbf", gamma=some_gamma, n_components=300), SVC(kernel="linear"))`
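Concretely, the exercise could contrast the two models (`some_gamma` is the placeholder from this proposal, left for the learner to tune):

```python
from sklearn.kernel_approximation import Nystroem
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# On the raw 2D features, a linear SVC should underfit the half moons
# in the same way the plain logistic regression did.
linear_svc = SVC(kernel="linear")

# The same linear SVC becomes expressive after an approximate RBF kernel
# expansion; some_gamma is a value to be tuned.
some_gamma = 1.0
kernelized_svc = make_pipeline(
    Nystroem(kernel="rbf", gamma=some_gamma, n_components=300),
    SVC(kernel="linear"),
)
```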
- Then we can optionally suggest trying `MLPClassifier` on this dataset to get somewhat similar results, e.g. as sketched below.
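For instance, with the same moons data as above (the hidden layer sizes and iteration budget are purely illustrative):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=500, noise=0.13, random_state=42)

# A small MLP learns a non-linear boundary directly from the raw 2D
# features, without any explicit feature engineering.
mlp = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
scores = cross_val_score(mlp, X, y, cv=5)
print(f"MLP on moons: {scores.mean():.3f} +/- {scores.std():.3f}")
```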