DistrictDataLabs / yellowbrick

Visual analysis and diagnostic tools to facilitate machine learning model selection.
http://www.scikit-yb.org/
Apache License 2.0

problem with auto plotting data #1183

Closed maghaali closed 3 years ago

maghaali commented 3 years ago
from sklearn.cluster import KMeans
from yellowbrick.cluster import KElbowVisualizer

model = KElbowVisualizer(KMeans(init='k-means++', max_iter=10000, n_init=10), k=(4,12))
model.fit(X)
model.elbow_value_

Since I did not call model.show(), I expected the code above not to display a plot, but it does.

jhgrey-su commented 3 years ago

YES. We need a way to output the elbow values without the plot itself showing.

rebeccabilbro commented 3 years ago

Hi @spacelover92 @jhgrey-su, and thanks for using Yellowbrick!

The auto-plotting you're observing is likely related to Jupyter notebook and not Yellowbrick. Jupyter notebook is setting the inline backend as the default. In older versions of Jupyter notebook, you would have to turn on inline plotting manually with %matplotlib inline; nowadays it is set as turned on by default. It's similar to the reason why doing print(viz.elbow_value_) is the same as viz.elbow_value_ when you're inside a Jupyter notebook.

If you run the same Yellowbrick code in the command line, you won't get a plot without the call to show(). You can also explicitly pass show=False to a Visualizer to ensure the plot isn't shown (note that the plot may still show inside a Jupyter notebook where inline is turned on):

model = KMeans(
    init='k-means++', 
    max_iter=10000, 
    n_init=10
)
viz = KElbowVisualizer(model, k=(4,12), show=False)
viz.fit(X)
print(viz.elbow_value_)

It appears you can circumvent plotting in a Jupyter notebook if that solution is more appropriate to you.

Thanks again for being Yellowbrick users!

jhgrey-su commented 3 years ago

Hey @rebeccabilbro, thanks so much for the quick response. I am loving using Yellowbrick. Can show=False be used in KElbowVisualizer, or only in the quick method? I did not see a show parameter in the API. Also, I am still getting a plot displayed when running it in the PyCharm IDE. Thanks again for all your help!

jhgrey-su commented 3 years ago

@rebeccabilbro To follow up, I have now tried both elbow visualizers, and regardless of whether show=False is set, a plot is still displayed in PyCharm. Could this be similar to the issue with Jupyter notebook's inline settings? Thanks again.

rebeccabilbro commented 3 years ago

@jhgrey-su, thank you for providing the additional details! Yes, it is possible that this behavior is also built into the PyCharm IDE. Are you using Scientific Mode? If so, you may be able to update your preferences locally to suppress the plots.

Would you mind sharing the code you are running? I am a VSCode user, but if you share the code snippets, I can download the Community Edition of PyCharm and attempt to replicate the behavior.

maghaali commented 3 years ago

Thank you for your answer. Adding

plt.close("all")

at the end of the code cell worked for me in Jupyter notebook.

jhgrey-su commented 3 years ago

Hey @rebeccabilbro, I do not think I am using Scientific Mode. I could be mistaken, but it might be available only in the Professional edition of PyCharm. Below is the code I am trying to run; when the function is called, a plot is displayed. I added the plt.close("all") call and that did seem to help, but I still saw 3 pop-up windows appear before they closed on their own. Thanks so much again.

def optimal_k(self):
    self.optimal = pd.DataFrame(columns=["Metric", "Elbow at k="])
    for m in ["silhouette", "distortion", "calinski-harabasz"]:
        plot = kelbow_visualizer(model=KMeans(), X=file_scaled, k=11, metric=m, show=False)
        k_value = plot.elbow_value_
        new_row = {"Metric": m, "Elbow at k=": k_value}
        self.optimal = self.optimal.append(new_row, ignore_index=True)
    return self.optimal

bbengfort commented 3 years ago

@jhgrey-su glad that you figured it out!