Text vectorizers clean-up / TfidfVectorizer deprecation

As mentioned by @jnothman in https://github.com/scikit-learn/scikit-learn/pull/14748#issuecomment-530160881,

> I'm also happy to slowly deprecate TfidfVectorizer because it provides no benefit over a pipeline and creates much confusion. I've seen users do weird stuff like compare TfidfVectorizer to CountVectorizer but use different params.

+1 on my side to deprecate it.
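For context, here is a minimal sketch of the equivalence the quote refers to: TfidfVectorizer versus a CountVectorizer followed by a TfidfTransformer. The toy corpus and the `params` dict are made up for illustration; sharing one dict between both variants avoids the mismatched-parameters comparison mentioned above.

```python
import numpy as np
from sklearn.feature_extraction.text import (
    CountVectorizer,
    TfidfTransformer,
    TfidfVectorizer,
)
from sklearn.pipeline import make_pipeline

# Toy corpus, for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats are pets",
]

# Shared tokenization/counting parameters for both variants.
params = dict(ngram_range=(1, 2), min_df=1)

# Single-estimator variant.
X_single = TfidfVectorizer(**params).fit_transform(docs)

# Explicit pipeline: count first, then apply TF-IDF weighting.
pipeline = make_pipeline(CountVectorizer(**params), TfidfTransformer())
X_pipe = pipeline.fit_transform(docs)

# Both routes should produce the same matrix.
same = np.allclose(X_single.toarray(), X_pipe.toarray())
print(same)
```

With the pipeline form, swapping TF-IDF in or out of an experiment only changes the second step, so the counting parameters stay identical by construction.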

I would also deprecate the norm parameter (for removal) in HashingVectorizer, as its combination with TfidfTransformer using default parameters currently produces nonsense results due to norm='l2': the hashed counts are already L2-normalized per document before the IDF weighting is applied. That would also resolve https://github.com/scikit-learn/scikit-learn/issues/6972
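To illustrate the problem, a small sketch under some assumptions: the two-document corpus and `n_features=2**10` are chosen only to keep the example small, and `alternate_sign=False` is set so that the hashed values are plain non-negative counts. It shows that the default HashingVectorizer output is already row-normalized, and how passing `norm=None` today lets TfidfTransformer see raw counts.

```python
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer, TfidfTransformer
from sklearn.pipeline import make_pipeline

# Toy corpus, for illustration only.
docs = ["the cat sat on the mat", "the dog"]

# Default settings: each row is L2-normalized before any IDF weighting.
X_default = HashingVectorizer(n_features=2**10).transform(docs)
default_row_norms = np.linalg.norm(X_default.toarray(), axis=1)

# Workaround with the current API: disable normalization (and sign
# alternation) so the hashed features are raw term counts.
raw_hasher = HashingVectorizer(n_features=2**10, norm=None, alternate_sign=False)
X_counts = raw_hasher.transform(docs)
tokens_per_doc = X_counts.toarray().sum(axis=1)

# TF-IDF on top of the raw hashed counts; normalization happens once, at the end.
hashing_tfidf = make_pipeline(raw_hasher, TfidfTransformer())
X_tfidf = hashing_tfidf.fit_transform(docs)
tfidf_row_norms = np.linalg.norm(X_tfidf.toarray(), axis=1)
```

Removing the norm parameter would make the second configuration the only one, with normalization left to TfidfTransformer or an explicit Normalizer step.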

Generally, I have also seen experienced Python users do very weird things with text vectorizers (particularly as soon as customization is involved). Taking some time to think about how we would ideally like their API to look for 1.0, and whether we can get partially there without major disruption, would, I think, be useful.

In particular, I find the way we currently suggest customizing the behavior, by subclassing CountVectorizer to update the analyzer, really awkward. I am pondering some way to separate the pipeline that processes a single document and returns n-grams from the CountVectorizer class. Basically, making it easier to pass analyzers while re-using part of the existing processing.
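As a rough sketch of what is already possible without subclassing: the per-document analyzer can be obtained from `build_analyzer()` and wrapped in a plain function that is then passed as `analyzer=`. The digit filter, the corpus, and the name `custom_analyzer` are hypothetical and only serve the illustration; this is not the API being proposed here.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Re-use the existing preprocessing, tokenization and n-gram generation.
base_analyzer = CountVectorizer(ngram_range=(1, 2)).build_analyzer()


def custom_analyzer(doc):
    """Return the default n-grams of `doc`, minus any that contain a digit."""
    return [
        ngram
        for ngram in base_analyzer(doc)
        if not any(ch.isdigit() for ch in ngram)
    ]


# Toy corpus, for illustration only.
docs = [
    "Released in 2019 with the new API",
    "The new API replaces the old one",
]

# A callable analyzer bypasses the built-in processing, so the
# vectorizer only counts whatever the function returns.
vectorizer = CountVectorizer(analyzer=custom_analyzer)
X = vectorizer.fit_transform(docs)
vocabulary = sorted(vectorizer.vocabulary_)
```

The awkward part remains that the n-gram settings live on a throwaway CountVectorizer instance, which is the kind of coupling a separate document-processing pipeline would remove.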

Anyway, we can start with easy deprecations.

Text vectorizers clean-up / TfidfVectorizer deprecation #14951