microsoft / ML-For-Beginners

12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
https://microsoft.github.io/ML-For-Beginners/
MIT License
69.7k stars 14.47k forks

R if in for loop error - how can I save selected model? #539

Closed TomJang closed 2 years ago

TomJang commented 2 years ago

I am new to R and am having difficulty using `if` inside a `for` loop. Sorry if this is a duplicate.

As you can see in the chunk of code below, I am trying to fit 100 lm models and save a model only when its R (correlation) is greater than 0.7.

However, the code saves all 100 lm models.

I suspect the statement (!is.na(lm.cv.r[i]) > 0.70) is wrong, but I cannot figure out why.

Let's use the USArrests data as an example:

```r
data("USArrests")
head(USArrests)
df.norm <- USArrests

set.seed(100)
lm.cv.mse <- NULL
lm.cv.r <- NULL
k <- 100

for (i in 1:k) {

  index.cv <- sample(1:nrow(df.norm), round(0.8 * nrow(df.norm)))
  df.cv.train <- df.norm[index.cv, ]
  df.cv.test <- df.norm[-index.cv, ]

  lm.cv <- glm(Rape ~ ., data = df.cv.train)

  lm.cv.predicted <- predict(lm.cv, df.cv.test)

  lm.cv.mse[i] <- sum((df.cv.test$Rape - lm.cv.predicted)^2) / nrow(df.cv.test)
  lm.cv.r[i] <- as.numeric(round(cor(lm.cv.predicted, df.cv.test$Rape, method = "pearson"), digits = 3))

  if (!is.na(lm.cv.r[i]) > 0.70) {
    saveRDS(lm.cv, file = paste("lm.cv", lm.cv.r[i], ".rds", sep = ''))
  }

}
```

jlooper commented 2 years ago

hi @R-icntay could you lend your expertise here please?

R-icntay commented 2 years ago

Hello @TomJang, @jlooper

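Firstly, thank you for providing a reproducible example. You were almost there, so good job! The only thing missing was a second condition, i.e. `lm.cv.r[i] > 0.70`. As written, `!is.na(lm.cv.r[i]) > 0.70` never compares the correlation itself with 0.70: R evaluates it as `!(is.na(lm.cv.r[i]) > 0.70)`, which is TRUE for every non-missing value, so every model was saved. I have modified your example so that it works as expected, and added the iteration number `i` to the file name so that R does not accidentally overwrite an earlier model with the same rounded correlation.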
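Before the full example, the difference between the two conditions can be checked in isolation. This is only a minimal sketch: `r_values` is a made-up vector standing in for the correlations computed in the loop, and `original` and `corrected` are illustrative names.

```r
# Stand-in correlation values (hypothetical, for illustration only).
r_values <- c(0.35, 0.65, 0.92, NA)

# Original condition: R parses this as !(is.na(x) > 0.70), because
# comparison operators bind more tightly than `!`.
original <- sapply(r_values, function(x) !is.na(x) > 0.70)

# Corrected condition: check for NA first, then compare the value itself.
corrected <- sapply(r_values, function(x) !is.na(x) && x > 0.70)

print(original)   # TRUE for every non-missing value
print(corrected)  # TRUE only for 0.92
```

The original condition is therefore just a test for "not missing", which is why it let all 100 models through.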

```r
data("USArrests")
head(USArrests)
df.norm <- USArrests

set.seed(100)
lm.cv.mse <- NULL
lm.cv.r <- NULL
k <- 100

for (i in 1:k) {
  # Random 80/20 train/test split
  index.cv <- sample(1:nrow(df.norm), round(0.8 * nrow(df.norm)))
  df.cv.train <- df.norm[index.cv, ]
  df.cv.test <- df.norm[-index.cv, ]

  lm.cv <- glm(Rape ~ ., data = df.cv.train)

  lm.cv.predicted <- predict(lm.cv, df.cv.test)

  # Test-set MSE and Pearson correlation between predictions and observed values
  lm.cv.mse[i] <- sum((df.cv.test$Rape - lm.cv.predicted)^2) / nrow(df.cv.test)
  lm.cv.r[i] <- as.numeric(round(cor(lm.cv.predicted, df.cv.test$Rape, method = "pearson"), digits = 3))

  # Save only when the correlation is available and above the threshold
  if (!is.na(lm.cv.r[i]) && lm.cv.r[i] > 0.70) {
    saveRDS(lm.cv, file = paste0("lm.cv_", i, "_", lm.cv.r[i], ".rds"))
  }
}
```
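Once a model has been saved, it can be read back with `readRDS()`. The following is a rough sketch rather than part of the original example: it fits a single model on the full data as a stand-in for one selected fold, uses made-up values for `i` and `r`, and writes to `tempdir()` only so that nothing lands in the working directory.

```r
data("USArrests")

# One model on the full data, purely as a stand-in for a selected fold.
fit <- glm(Rape ~ ., data = USArrests)

# Hypothetical iteration number and rounded correlation.
i <- 1
r <- 0.85

# paste0() joins the pieces exactly as written, so underscores appear
# only where they are typed.
model_path <- file.path(tempdir(), paste0("lm.cv_", i, "_", r, ".rds"))
saveRDS(fit, file = model_path)

# Read the model back and check that it predicts the same values.
restored <- readRDS(model_path)
same_predictions <- all.equal(predict(fit, USArrests), predict(restored, USArrests))

print(basename(model_path))
print(same_predictions)
```

Keeping the iteration number in the file name is what lets two models with the same rounded correlation coexist on disk.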

We invite you to check out our R lessons, which show you how to build machine learning models using the Tidymodels framework: https://github.com/microsoft/ML-For-Beginners.

Do enjoy the ride and feel free to reach out in case of any difficulty. Happy leaRning!

TomJang commented 2 years ago

Thanks Eric,

Apologies for the hassle with a tiny mistake. It took me a lot of time to figure it out!

Kind regards, Tom

(Sent by email in reply to Eric's message of Saturday, 26 February 2022, 8:20 PM.)


jlooper commented 2 years ago

all set? should I close this? thanks everyone!

R-icntay commented 2 years ago

Yes yes Jen.

All good here!