juliasilge / juliasilge.com

My blog, built with blogdown and Hugo
https://juliasilge.com/

Predict #TidyTuesday giant pumpkin weights with workflowsets | Julia Silge #54

Open utterances-bot opened 2 years ago

utterances-bot commented 2 years ago

Predict #TidyTuesday giant pumpkin weights with workflowsets | Julia Silge

Get started with tidymodels workflowsets to handle and evaluate multiple preprocessing and modeling approaches simultaneously, using pumpkin competitions.

https://juliasilge.com/blog/giant-pumpkins/

gunnergalactico commented 2 years ago

Hi Dr. Silge, could I pass in my own metrics, e.g., metric_set(mae) or metric_set(mn_log_loss), instead of going with the defaults?

Thanks.

juliasilge commented 2 years ago

@gunnergalactico Yes, you do it like this, although there is currently a bug in yardstick for a few of the non-default metrics (shown in that issue).
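
A minimal sketch of passing a custom metric set through workflow_map(), assuming the tidymodels packages are installed. It uses the built-in mtcars data rather than the pumpkin data from the post, and the object names (folds, wf_set, wf_rs) are hypothetical. Note that mn_log_loss() is a class-probability metric, so it applies to classification models and not to a regression problem like this one.

```r
library(tidymodels)
library(workflowsets)

set.seed(123)
# Hypothetical resamples on a built-in dataset (not the pumpkin data)
folds <- vfold_cv(mtcars, v = 5)

# A one-workflow set: a simple recipe paired with a linear regression
wf_set <- workflow_set(
  preproc = list(base = recipe(mpg ~ wt + hp, data = mtcars)),
  models = list(lm = linear_reg())
)

# Arguments in `...` are passed on to fit_resamples(),
# so `metrics` replaces the default regression metrics
wf_rs <- workflow_map(
  wf_set,
  "fit_resamples",
  resamples = folds,
  metrics = metric_set(mae, rmse)
)

collect_metrics(wf_rs)
```

The same metrics argument should work for the other functions workflow_map() can call, such as "tune_grid".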

wdkeyzer commented 2 years ago

Hi Julia, I learn a lot from your blog posts and videos. Thank you for sharing this! I also like your RStudio theme. Which one do you use?

juliasilge commented 2 years ago

@wdkeyzer I use the rsthemes package, and I believe I am using Oceanic Plus right now.

mkrasmus commented 2 years ago

Dear Dr. Silge, I really like how linear regression models (including logistic regression) convey the strength of predictors via coefficient estimates, and how intuitive they are even for lay stakeholders who want to understand how specific factors contribute to the model and its predictions. I have used LIME and variable importance in the past for other models like random forest, and I feel that in terms of transparency (and therefore responsible AI/ML) they are a bit less interpretable. Is there a general consensus, or a resource you are aware of, that ranks these methods in terms of interpretability, model transparency, etc.?

juliasilge commented 2 years ago

@mkrasmus Hmmmm, I'm not aware of an official ranking of different kinds of models but I do think most practitioners consider linear models more interpretable than more complex models + interpretability methods on top of them. People typically use those less interpretable models because they need better model performance.

ajay333a commented 9 months ago

Hi @juliasilge, I'm trying to

tidy(final_fit) %>%
  arrange(-abs(estimate))

but it's giving me the error "Error: No tidy method for objects of class ranger". I can't seem to figure out why. I tried using broom::tidy(), but it's not working. Is there any other way?

juliasilge commented 9 months ago

@ajay333a You can tidy() a linear model because it has coefficients, but a random forest model is made up of a whole aggregation of trees, so there isn't really anything to show if you tidy(). In my blog post/video, I choose the "recipe_3_linear_reg" workflow so I can tidy it, but if you choose one of the random forest models, you won't be able to use tidy(). Instead, you might consider computing the variable importance like in this post (https://juliasilge.com/blog/water-sources/).
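
A minimal sketch of the variable importance route suggested above, assuming the ranger and vip packages are installed. It uses the built-in mtcars data instead of the pumpkin data, and rf_spec and rf_fit are hypothetical names. Permutation importance is one choice among several that ranger supports.

```r
library(tidymodels)
library(vip)

set.seed(123)

# Ask ranger to compute permutation importance while fitting;
# without an `importance` setting there is nothing to extract later
rf_spec <- rand_forest(trees = 500) %>%
  set_mode("regression") %>%
  set_engine("ranger", importance = "permutation")

rf_fit <- workflow() %>%
  add_formula(mpg ~ .) %>%
  add_model(rf_spec) %>%
  fit(data = mtcars)

# vi() returns a tibble with Variable and Importance columns,
# which plays a role similar to tidy() for a linear model
rf_importance <- rf_fit %>%
  extract_fit_parsnip() %>%
  vi()

rf_importance
```

Calling vip() instead of vi() on the extracted fit draws the same information as a plot.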

ajay333a commented 9 months ago

Thank you very much for your reply
