lsinks / blog-comments

A repo for comments on my blog
posts/2023-04-10-tidymodels/tidymodels_tutorial #2

Open utterances-bot opened 1 year ago

utterances-bot commented 1 year ago

Louise E. Sinks - A Tidymodels Tutorial: A Structural Approach

Exploring the different steps for modeling

https://lsinks.github.io/posts/2023-04-10-tidymodels/tidymodels_tutorial

mintorcia commented 1 year ago

Hi Louise, Thanks for making your tutorial on Tidymodels available. I would like to replicate it in R, but I do not know how to load the dataset. Would you please let me know how or where I can download or access it? Many thanks in advance, Michele

lsinks commented 1 year ago

Michele, Sorry for not being clear! You can find the data in the GitHub repo for this website.
https://github.com/lsinks/lsinks.github.io/tree/main/posts/2023-04-10-tidymodels

You can also download the qmd file, which is the Quarto document for this page. It is like R Markdown, if you know that better: a mix of text and executable code blocks. (You can always delete all the text and save it as an R file if you prefer.)

Let me know if this doesn't work for some reason. Louise

lsinks commented 1 year ago

Michele, I wanted to thank you for your question about the data source. I updated the blog post to include the information also. Louise

mintorcia commented 1 year ago

Dear Louise, Thanks for your email and apologies for my delay. I want to thank you for your tutorial and help in running the code. I am following it through. All the best, Michele

MealdyTech commented 1 year ago

Loved this, thanks very much Louise

lsinks commented 1 year ago

Glad you liked it! There are so many great resources out there, but I just had a hard time putting it all together.

spdrnl commented 1 year ago

Hi Louise,

Thank you for taking the effort to make this public.

Tidymodels, and the objects going in and out, do indeed seem opaque. Sklearn is more transparent in its approach, certainly from an object and programming perspective.

I think that Tidymodels is after a configuration-over-programming approach, which in itself is a good thing. Configurable approaches are great if you want to crank out, or automate, a larger number of models. The configuration is done in code though, and that makes it a bit confusing.

Thank you for your insights.

Cheers,

Sanne

lsinks commented 1 year ago

Sorry for the late reply. I've been thinking about your comment, especially the contrast between tidymodels and sklearn. I think the tidyverse, in general, tries to be a bit more black-box about the programming. (And maybe that's true of R in general, but I think the tidy approach is an even higher level of abstraction than R.) The tidyverse packages are a set of tools that allow you to do the tasks without worrying about lower-level details. I think this is really apparent in tidymodels: the objects aren't clearly explained because you shouldn't need to worry about them. There is great documentation about how to move the objects through the process because getting the job done is the point. But I wanted to see what these objects were, because I wanted to make sure I was understanding properly, and I couldn't find easy explanations. (The answers are in the documentation, but often as an aside.)

sklearn and Python are clearer about what the objects are, but the downside is you have to pay way more attention to them. I feel like I'm always dealing with low-level details in Python that are abstracted away in R. Part of this may be because I'm not as proficient in Python as I am in R, which generally increases the frustration level as I'm trying to do something cool and instead get bogged down.

spdrnl commented 1 year ago

For me the difference between Python and R is a cultural difference. Python has more of an engineering culture, so there is a lot of emphasis on code and software, and on generating them. So much so that it seems to be the goal. Perhaps being a general-purpose language has its downsides. R has more of a scientific culture, so it is more focused on the empirical and statistical side of things, and is in general more sophisticated that way.

Currently I am cracking tidymodels by way of Applied Predictive Modeling; both are by Max Kuhn. The APM book is excellent from a statistical and machine learning perspective, and has a tidymodels repo from Max under his username topepo.

Although I did not start that way, I think that cracking tidymodels is best done top down, meaning using workflow sets to create boxplot diagrams of, say, ROC confidence intervals and test set bootstrapping intervals: focusing on the beef of decision making around models. (I attached an example plot, presented on page 508 of the book.) Piecing the details of tidymodels together is hard work. This brings me back to your observation: R/tidymodels is indeed more about understanding how to create a workflow around all the information contained in a machine learning project, and how to balance statistics and machine learning, without focusing on the implementation details.

[image: attached example plot]

drkamarul commented 9 months ago

Thank you for the tutorial

lsinks commented 9 months ago

I hope it was helpful!
