Open RaymondBalise opened 3 years ago
@RaymondBalise, here is another, slightly simpler approach. It uses the MNIST data set provided by Keras. We were going to use this data set initially but decided against it because it comes as a 3D array rather than the 2D dataframe provided by dslabs::read_mnist().
# Import MNIST data from Keras. This will import the data
# as a 3D array
mnist <- keras::dataset_mnist()
# Get our feature dimensions
mnist_train_dim <- dim(mnist$train$x)
train_nobs <- mnist_train_dim[1]
train_nfeat <- mnist_train_dim[2]*mnist_train_dim[3]
# Identify our sampled index
set.seed(123)
index <- sample(train_nobs, size = 10000)
# Convert features to 2D array, then to a dataframe
mnist_x_2d <- array(mnist$train$x, dim = c(train_nobs, train_nfeat))
mnist_x <- data.frame(mnist_x_2d)[index, ]
# Extract response and convert to factor
mnist_y <- factor(mnist$train$y)[index]
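A quick way to check that the reshape above does what is intended is to try it on a toy array. The sketch below uses a made-up 2 x 2 x 3 array in place of mnist$train$x (`x` and `x_2d` are illustrative names, not part of the code above) and confirms that each row of the 2D result holds the pixels of one image:

```r
# Toy 3D array standing in for mnist$train$x: 2 "images" of 2 x 3 pixels
x <- array(1:12, dim = c(2, 2, 3))

# Same reshape as above: one row per image, one column per pixel
x_2d <- array(x, dim = c(dim(x)[1], dim(x)[2] * dim(x)[3]))

# Row i of the 2D array holds all pixels of image i
stopifnot(identical(x_2d[1, ], as.vector(x[1, , ])))
stopifnot(identical(x_2d[2, ], as.vector(x[2, , ])))
```

Note that array() fills in R's column-major order, so within each row the pixels run down each image column first. That makes no difference for modelling, but it matters if you later reshape a row back into a 28 x 28 image.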
I attempted to use the read_mnist() function from dslabs and it returned this error:
It looks like http://yann.lecun.com/exdb/mnist/ is no longer live, but with a little help from the Brave browser I found an archived copy of the site on the Wayback Machine and downloaded the files.
I modified the function to read the data out of local copies:
… and all is good. I don’t know the proper solution (other than hosting the files), but I figured I should share this in the hope that it helps others.
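For anyone who wants to try the local-copy route, here is a minimal sketch of reading MNIST's IDX files from disk. It is not the modified function mentioned above, which is not shown in this thread. The helper names `read_idx_images()` and `read_idx_labels()` are hypothetical, and the sketch assumes the standard IDX layout: big-endian 32-bit header fields (magic number 2051 for images, 2049 for labels) followed by unsigned bytes. To stay self-contained it writes a tiny synthetic file and reads it back:

```r
# Hypothetical helper: read an IDX image file into a matrix,
# one row per image and one column per pixel.
read_idx_images <- function(path) {
  con <- gzfile(path, "rb")  # gzfile() also reads uncompressed files
  on.exit(close(con))
  # Header: magic number, image count, rows, columns (big-endian int32)
  header <- readBin(con, "integer", n = 4, size = 4, endian = "big")
  stopifnot(header[1] == 2051)
  n <- header[2]
  n_pixels <- header[3] * header[4]
  pixels <- readBin(con, "integer", n = n * n_pixels, size = 1, signed = FALSE)
  matrix(pixels, nrow = n, ncol = n_pixels, byrow = TRUE)
}

# Hypothetical helper: read an IDX label file into an integer vector.
read_idx_labels <- function(path) {
  con <- gzfile(path, "rb")
  on.exit(close(con))
  # Header: magic number, label count (big-endian int32)
  header <- readBin(con, "integer", n = 2, size = 4, endian = "big")
  stopifnot(header[1] == 2049)
  readBin(con, "integer", n = header[2], size = 1, signed = FALSE)
}

# Synthetic image file for demonstration: 2 images of 2 x 2 pixels
img_file <- tempfile(fileext = ".gz")
con <- gzfile(img_file, "wb")
writeBin(c(2051L, 2L, 2L, 2L), con, size = 4, endian = "big")
writeBin(as.raw(c(0, 1, 2, 3, 252, 253, 254, 255)), con)
close(con)

# Synthetic label file for demonstration: 2 labels
lab_file <- tempfile(fileext = ".gz")
con <- gzfile(lab_file, "wb")
writeBin(c(2049L, 2L), con, size = 4, endian = "big")
writeBin(as.raw(c(7, 3)), con)
close(con)

imgs <- read_idx_images(img_file)
labs <- read_idx_labels(lab_file)
```

With the real downloads, the same helpers would be pointed at wherever you saved the files, for example read_idx_images("train-images-idx3-ubyte.gz") and read_idx_labels("train-labels-idx1-ubyte.gz"), assuming the files kept their original names.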