DistrictDataLabs / yellowbrick

Visual analysis and diagnostic tools to facilitate machine learning model selection.
http://www.scikit-yb.org/
Apache License 2.0

Adjusting markersize in `prediction_error` #1298

Closed: LSYS closed this issue 1 year ago

LSYS commented 1 year ago

Thank you for the fantastic library for visualizing diagnostics. How do I adjust the marker size in prediction_error()?

I am using this as a reference:

from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

from yellowbrick.datasets import load_concrete
from yellowbrick.regressor import prediction_error

# Load a regression dataset
X, y = load_concrete()

# Create the train and test data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Instantiate the linear model and visualizer
model = Lasso()
visualizer = prediction_error(model, X_train, y_train, X_test, y_test)

but my sample is larger, so the resulting plot just looks like a cloud. [image]

Smaller markers would help with visualization. Is that adjustment possible?

I've tried passing the "s" size kwarg:

visualizer = prediction_error(model, X_train, y_train, X_test, y_test, scatter_kwargs={"s": 50})

but that did not work.

rebeccabilbro commented 1 year ago

Hello @LSYS and thank you for using Yellowbrick!

This is an excellent question. Full response below, but the tl;dr is that the behavior you're requesting isn't currently possible. That said, if you are willing to work on this and open a PR, it probably wouldn't be too hard to add support for changing marker size in PredictionError.

Ok, the details:

For most YB plots that draw points or scatterplots, the marker size and shape are left at Matplotlib's defaults. The one exception is ValidationCurve, which allows the user to pass in a markers param that gets used inside the draw method to influence the markers. This is a fairly recent change implemented by @lwgray, who may have some additional thoughts here. It should be possible to make similar changes to the PredictionError constructor so that it takes a user-provided param that gets used inside the draw method.
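To make the suggested change concrete, here is a minimal sketch of the pattern: a constructor stores a user-provided marker size and the draw method forwards it to Matplotlib's `Axes.scatter` through the `s` argument. The class `MiniPredictionError` and the parameter name `marker_size` are hypothetical stand-ins for illustration. This is not Yellowbrick's PredictionError, and an actual PR might choose a different name or design.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


class MiniPredictionError:
    """Hypothetical stand-in for a visualizer; NOT Yellowbrick's PredictionError."""

    def __init__(self, ax=None, alpha=0.75, marker_size=None):
        # marker_size is an assumed parameter name; None keeps Matplotlib's default
        self.ax = ax if ax is not None else plt.gca()
        self.alpha = alpha
        self.marker_size = marker_size

    def draw(self, y, y_pred):
        # Forward the stored size to Axes.scatter via its `s` argument (points**2)
        self.ax.scatter(y, y_pred, s=self.marker_size, alpha=self.alpha)
        self.ax.set_xlabel("y")
        self.ax.set_ylabel("y predicted")
        return self.ax


# Synthetic data, large enough that default-sized markers would overlap heavily
rng = np.random.default_rng(42)
y = rng.normal(size=5000)
y_pred = y + rng.normal(scale=0.5, size=5000)

fig, ax = plt.subplots()
viz = MiniPredictionError(ax=ax, marker_size=4)
viz.draw(y, y_pred)
```

Calling `fig.savefig(...)` or `plt.show()` afterwards would render the plot. Defaulting the parameter to `None` means existing callers would see no change in behavior, which is one reason this design might be a low-risk addition.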

If you're interested in moving forward with contributing, check out the contributor docs and let us know; I'd be happy to help coach you through and review your PR.
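Until such a parameter exists, one possible workaround (not confirmed in this thread) is to resize the markers on the Matplotlib axes after the points have been drawn. The sketch below uses plain Matplotlib: `Axes.scatter` returns a `PathCollection`, and `PathCollection.set_sizes` changes its marker sizes in place. The helper `shrink_markers` is a hypothetical name. Applying it to Yellowbrick assumes that the visualizer exposes its axes as `visualizer.ax` and that the quick method is called with `show=False` so the figure can be adjusted before it is displayed.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import PathCollection


def shrink_markers(ax, size):
    """Set every scatter collection on `ax` to a single marker size (points**2).

    Returns the number of collections that were changed.
    """
    changed = 0
    for coll in ax.collections:
        if isinstance(coll, PathCollection):
            coll.set_sizes([size])
            changed += 1
    return changed


rng = np.random.default_rng(0)
y = rng.normal(size=5000)
y_pred = y + rng.normal(scale=0.5, size=5000)

fig, ax = plt.subplots()
ax.scatter(y, y_pred, alpha=0.75)  # stand-in for the points a visualizer draws
# Assumption: with Yellowbrick, `ax` here would be `visualizer.ax`
n_changed = shrink_markers(ax, 4)
```

Lines such as a best-fit or identity line are not `PathCollection` objects, so this helper leaves them untouched.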

LSYS commented 1 year ago

@rebeccabilbro Thanks so much for the fast and detailed response. When I have time I'll look in a bit more detail at the API design of yellowbrick to see if I can contribute. Perhaps learning how yellowbrick interfaces with matplotlib would also be helpful for my own work.

Thanks again! Closing this issue (for now).