Open Oakwilde opened 1 month ago
Thanks for opening! Have you also opened an issue on the pdp repo? If so, can you link to it here?
Would you be able to share the first draft of your plotting code?
Thanks for getting back to me. The idea is to bring to Keras3 at least some of the tools (as imperfect as they are) for making neural net models more interpretable. You may already be on top of this, but if not, a very accessible introduction can be found in Molnar’s book “Interpretable Machine Learning…”
I am happy to share the code. Three caveats. First, there are some bugs. For example, the order in which the predicted probabilities are stored and accessed seems to be wrong. I want P(yhat=1) and I get P(yhat=0) (I think). So I have a trivial patch that fixes the problem, assuming I have diagnosed it properly. Second, this is a full collaboration with ChatGPT. We went back and forth for quite a while swapping suggestions for fixes and code. I mostly was the diagnostician; ChatGPT wrote most of the code. If nothing else, ChatGPT can write code far more quickly than I can and without typos! Third, I just learned that you can write a translator for R’s pdp package to get from Keras to inputs pdp can use. If that’s right and you like what pdp does, that may be a safer and easier way than writing some version of pdp from scratch. I am not a real coder. I write stuff in R for the kinds of data I use, and it often runs by brute force.
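For what it’s worth, the “trivial patch” is nothing more than a one-line flip in a wrapper around the predictions, assuming my diagnosis (that the single sigmoid output holds P(yhat=0) rather than P(yhat=1)) is right. A sketch (the function name is mine, just for illustration):

```r
# Hypothetical wrapper: flip predicted probabilities if they turn out to be
# P(yhat = 0) rather than the expected P(yhat = 1). Only sensible for a
# single sigmoid output on a binary 0/1 outcome.
flip_to_p1 <- function(p0) {
  as.vector(1 - p0)
}
```

Then instead of using the raw output of `predict()`, you would pass it through `flip_to_p1()` first.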
So… I will send the code I have later today and also let you know if I am able to make progress with the keras3 to pdp translator. And I will reach out to the pdp coders. Thanks for that link.
Richard
This is not a first draft of the code. It runs, thanks to help I got from ChatGPT, but it took a lot of time, in part because I had not used Keras much before, nor ggplot2, and ChatGPT sometimes gave me bad advice. There were two difficult issues for me near the end. I think that in keras3, the predict function returns P(Y=0), not P(Y=1), when the outcome is a binary variable coded 1 or 0; I was expecting P(Y=1). Of course, the error may be mine. Then, after standardizing my predictors to make the fitting go more smoothly, I was stumped for a while getting ggplot to use the original units of the target variable rather than the standardized units. I am not sure I did it right, but the results are pretty much the same as what I get from the randomForest pdp plots. So yes, this is doable. I now have a sort of template. But this is not a plug-and-play approach.
Richard
```r
setwd("/Volumes/Data in Recent Use/NewJPL/ForJune21")
load("/Volumes/Data in Recent Use/NewJPL/ForJune21/AIRS.06/JuneJulyForecasting.rdata")
summary(working1)
work1 <- working1[, c(1, 6, 15, 20)]

x_train <- scale(as.matrix(work1[, c(2, 3, 4)]))
y_train <- as.matrix(work1[, 1])

# Bootstrap-style test sample. Note: x_train is already scaled, so the
# scale() below standardizes a second time using the sample's own
# mean/sd; dropping this second scale() may be what was intended.
index <- sample(260, 100, replace = TRUE)
x_test <- scale(as.matrix(x_train[index, ]))
y_test <- as.matrix(y_train[index])
TestData <- cbind(y_test, x_test)
```
```r
library(keras3)

model <- keras_model_sequential() %>%
  layer_dense(units = 128, activation = 'relu',
              input_shape = c(ncol(x_train))) %>%
  layer_dense(units = 64, activation = 'relu') %>%
  layer_dense(units = 1, activation = 'sigmoid')  # Seems to output p(Y=0), not p(Y=1)

model %>% compile(
  loss = 'binary_crossentropy',
  optimizer = optimizer_adam(),
  metrics = c('accuracy')
)

model %>% fit(
  x_train, y_train,
  epochs = 10,
  batch_size = 32,
  validation_split = 0.2
)
```
```r
library(pdp)
library(ggplot2)
library(iml)
library(dplyr)

# Custom prediction wrapper for pdp::partial()
predict_keras <- function(object, newdata) {
  newdata_matrix <- as.matrix(newdata)
  if (ncol(newdata_matrix) != 3) {
    stop("Input data does not have the required number of features (3)")
  }
  predictions <- object %>% predict(newdata_matrix)
  as.vector(1 - predictions)  # Subtracting from 1 to get p for class "1"
}
```
```r
feature_data <- as.data.frame(x_test)  # partial() wants a data frame

pdp_keras <- partial(
  object = model,              # My Keras model
  pred.var = "temp8",          # The feature for which you want the PDP
  pred.fun = predict_keras,    # Custom prediction function for the Keras model
  train = feature_data,        # Training data as a data frame
  grid.resolution = 50,        # Resolution of the grid (optional, can be adjusted)
  plot = FALSE                 # Disable automatic plotting
)

# Average the predictions at each grid value of temp8
pdp_avg <- aggregate(yhat ~ temp8, data = pdp_keras, FUN = mean)
```
```r
# Map the standardized temp8 grid back to original units (Kelvin)
mean_temp8 <- mean(work1$temp8)
sd_temp8 <- sd(work1$temp8)
pdp_avg$original_temp8 <- pdp_avg$temp8 * sd_temp8 + mean_temp8

ggplot(pdp_avg, aes(x = original_temp8, y = yhat)) +
  geom_smooth(color = "blue", size = 1.5, se = FALSE,
              method = "loess", span = 1/4) +
  geom_rug(data = work1, mapping = aes(x = temp8),
           sides = "b", inherit.aes = FALSE) +   # Original data for rug plot
  ggtitle("Partial Dependence Plot for Temperature at Altitude 8") +
  xlab("Temperature (Kelvin)") +
  ylab("Fitted Probability of a Heat Wave") +
  theme(
    plot.title = element_text(hjust = 0.5, size = 16),  # Center and enlarge title
    axis.title.x = element_text(size = 14),
    axis.title.y = element_text(size = 14),
    axis.text = element_text(size = 12)
  ) +
  coord_cartesian(ylim = c(0, 1))  # Constrain y-axis between 0 and 1
```
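One thing I learned afterwards, in case it helps anyone else: `scale()` stores the centering and scaling values it used as attributes of its result, so the back-transform to original units can reuse them instead of recomputing the mean and sd by hand as I did above. A sketch:

```r
# scale() records the statistics it used as attributes of the result,
# so undoing the standardization does not require recomputing mean/sd.
x <- matrix(rnorm(20), ncol = 2)
x_scaled <- scale(x)
ctr <- attr(x_scaled, "scaled:center")
sdv <- attr(x_scaled, "scaled:scale")

# Reverse the standardization column by column: multiply back by the sd,
# then add back the mean.
x_back <- sweep(sweep(x_scaled, 2, sdv, `*`), 2, ctr, `+`)
```

The same `ctr` and `sdv` could be applied to the pdp grid values directly.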
It would be very useful if there were partial dependence plots and variable importance plots like there are for random forests and for gradient boosting (with the help of the pdp package). I had to code up my own pdp, and even with the help of ChatGPT it took a couple of hours and still may have some bugs. There were unanticipated challenges, such as the need to scale my predictors for the deep neural net, but then get the partial dependence plots back in their usual scales (e.g., temperature in K). Or maybe ask the authors of pdp (in R) to add some code so that keras3 output could be used in their very nice partial dependence plots. Thanks.
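For anyone curious, the brute-force version of a partial dependence curve is short. The whole idea is: fix the feature of interest at each grid value for every row of the data, predict, and average. A sketch with a made-up prediction function (the helper name is mine, not pdp's API):

```r
# Minimal hand-rolled partial dependence: hold one feature fixed at each
# grid value, average the model's predictions over the empirical data.
pd_curve <- function(predict_fun, data, var, grid) {
  sapply(grid, function(v) {
    tmp <- data
    tmp[[var]] <- v             # clamp the feature of interest at v
    mean(predict_fun(tmp))      # average prediction over all rows
  })
}
```

For a model that is linear in the clamped feature, this recovers the linear effect plus the averaged contribution of the other features, which is a handy sanity check.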