Call _validate_steps in test_validate_steps

@bbengfort Will we need to upgrade to scikit-learn v1.3.0? I worry about calling ._validate_steps().

The _validate_steps method of a scikit-learn Pipeline is a private method that checks whether the pipeline's steps are defined correctly. Every step before the final one must be a transformer (i.e., it must have fit and transform methods), and the final step must be an estimator (i.e., it must have a fit method).
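To make the rule concrete, here is a minimal pure-Python sketch of the check as described above. It is a simplified stand-in, not scikit-learn's actual implementation: validate_steps, raises_type_error, Scaler, Model and Broken are all hypothetical names made up for illustration, and the real method handles more cases than this.

```python
def validate_steps(steps):
    """Simplified stand-in for the rule described above.

    `steps` is a list of (name, object) pairs. This is NOT scikit-learn's
    code, only an illustration of the transformer/estimator requirement.
    """
    intermediate, (final_name, final_obj) = steps[:-1], steps[-1]

    # Every step before the last must look like a transformer.
    for name, obj in intermediate:
        if not (hasattr(obj, "fit") and hasattr(obj, "transform")):
            raise TypeError(
                f"Intermediate step {name!r} must implement fit and transform"
            )

    # The last step only needs to look like an estimator.
    if not hasattr(final_obj, "fit"):
        raise TypeError(f"Final step {final_name!r} must implement fit")


def raises_type_error(steps):
    """Return True if validate_steps rejects `steps` with a TypeError."""
    try:
        validate_steps(steps)
    except TypeError:
        return True
    return False


class Scaler:  # hypothetical transformer: has fit and transform
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


class Model:  # hypothetical estimator: has fit only
    def fit(self, X, y=None):
        return self


class Broken:  # hypothetical object with neither method
    pass


good = [("scale", Scaler()), ("model", Model())]
bad_middle = [("broken", Broken()), ("model", Model())]
bad_final = [("scale", Scaler()), ("broken", Broken())]

print(raises_type_error(good))
print(raises_type_error(bad_middle))
print(raises_type_error(bad_final))
```

Note that an estimator without transform is also rejected in an intermediate position, which is the kind of mistake the test in this PR is meant to catch.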

Calling _validate_steps() explicitly in your test cases will make sure that this validation is performed at the moment you define the pipeline, rather than later when you try to fit or transform data with the pipeline.

In @danilobellini's code, adding _validate_steps() after the Pipeline or VisualPipeline instantiation causes the validation to happen immediately. If there is a problem with the steps (e.g., a non-transformer object in an intermediate step, or a non-estimator object as the final step), a TypeError is raised right away, rather than later on when you try to use the pipeline.
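A rough sketch of what that looks like against a plain scikit-learn Pipeline follows. It assumes scikit-learn is installed and that the private _validate_steps method is present in the installed release. NotATransformer and validation_error are hypothetical names for illustration, and this is not the test code from the PR itself.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class NotATransformer:
    """Hypothetical object with neither fit nor transform."""


def validation_error(steps):
    """Return the TypeError raised while validating `steps`, or None."""
    try:
        # Depending on the scikit-learn release, the constructor itself may
        # already validate, so construction sits inside the try block too.
        pipe = Pipeline(steps)
        # Private API: not guaranteed to exist or behave the same everywhere.
        pipe._validate_steps()
    except TypeError as exc:
        return exc
    return None


good = [("scale", StandardScaler()), ("clf", LogisticRegression())]
bad = [("oops", NotATransformer()), ("clf", LogisticRegression())]

print(validation_error(good))
print(type(validation_error(bad)).__name__)
```

No data is fitted here: the check only inspects the step objects, which is why it can run directly after instantiation.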

This could make the tests clearer and more direct: the test specifically targets the validation of the pipeline steps, so it's useful to have that validation happen as explicitly and immediately as possible. However, I'm aware that _validate_steps is a private method (indicated by the leading underscore), which means that it's not part of the public API of the Pipeline class and could change in future versions of scikit-learn. Relying on private methods can make code less stable, as they're not guaranteed to stay the same in the way that public methods are.

DistrictDataLabs / yellowbrick

Call _validate_steps in test_validate_steps #1307