Closed gsgxnet closed 3 years ago
To narrow down the error, I tried a shortened loop:
for (b in enumerate(train_dl)) {
  optimizer$zero_grad()
  output <- model(b[[1]]$to(device = "cuda"))
}
It fails in nearly the same way:
Error in mget(x = c("input", "weight", "bias", "stride", "padding", "dilation", :
attempt to apply non-function
38.
mget(x = c("input", "weight", "bias", "stride", "padding", "dilation",
"groups"))
37.
torch_conv2d(input = input, weight = weight, bias = bias, stride = stride,
padding = padding, dilation = dilation, groups = groups)
36.
nnf_conv2d(input, weight, self$bias, self$stride, self$padding,
self$dilation, self$groups)
35.
self$conv_forward_(input, self$weight)
34.
self$conv1(.)
33.
mget(x = c("self"))
32.
torch_relu(input)
31.
nnf_relu(.)
...
str(b)
gives:
Class 'enum_env' <environment: 0x55b8ff8ce788>
This does not point me to a solution. Any ideas?
Hi, sorry for that.
I will update this and a few other older posts. In the meantime, please use coro::loop
instead of enumerate, like so:
coro::loop(for (b in train_dl) {
  optimizer$zero_grad()
  output <- model(b[[1]]$to(device = "cuda"))
})
This is an instance of nondeterministic behavior that is not inherent to torch, and it
does not happen when coro is used for iteration.
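For anyone unfamiliar with coro: here is a minimal, torch-free sketch of the same iteration pattern, assuming only that the coro package is installed. coro::loop() wraps an ordinary for statement so it can iterate over coro generators, which is how torch dataloaders are driven:

```r
library(coro)

# A toy generator standing in for a dataloader: yields three "batches".
batches <- gen({
  for (i in 1:3) yield(list(x = i, y = i * 10))
})

# coro::loop() drives the generator with ordinary for-loop syntax.
coro::loop(for (b in batches) {
  cat("batch x:", b$x, " y:", b$y, "\n")
})
```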
Thank you, I can confirm the modified code works fine.
My adapted main epoch loop now looks like this (with 2 extra epochs):
for (epoch in 1:7) {
  l <- c()
  coro::loop(for (b in train_dl) {
    # make sure each batch's gradient updates are calculated from a fresh start
    optimizer$zero_grad()
    # get model predictions
    output <- model(b[[1]]$to(device = "cuda"))
    # calculate loss
    loss <- nnf_cross_entropy(output, b[[2]]$to(device = "cuda"))
    # calculate gradient
    loss$backward()
    # apply weight updates
    optimizer$step()
    # track losses
    l <- c(l, loss$item())
  })
  cat(sprintf("Loss at epoch %d: %3f\n", epoch, mean(l)))
}
and when run, I get:
Loss at epoch 1: 0.410523
Loss at epoch 2: 0.205173
Loss at epoch 3: 0.154265
Loss at epoch 4: 0.130127
Loss at epoch 5: 0.106498
Loss at epoch 6: 0.092929
Loss at epoch 7: 0.081178
One extra comment: having modified the following code chunks as well, I get much better accuracy:
test_losses <- c()
total <- 0
correct <- 0

# see above
coro::loop(for (b in test_dl) {
  output <- model(b[[1]]$to(device = "cuda"))
  labels <- b[[2]]$to(device = "cuda")
  loss <- nnf_cross_entropy(output, labels)
  test_losses <- c(test_losses, loss$item())
  # torch_max returns a list, with position 1 containing the values
  # and position 2 containing the respective indices
  predicted <- torch_max(output$data(), dim = 2)[[2]]
  total <- total + labels$size(1)
  # add number of correct classifications in this batch to the aggregate
  correct <- correct + (predicted == labels)$sum()$item()
})
mean(test_losses)
[1] 0.01363155
test_accuracy <- correct/total
test_accuracy
[1] 0.9961
As far as I understand your code, this is the accuracy on the test data set. If so, we now get nearly perfect accuracy, don't we?
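As a side note on the prediction line above, here is a small sketch of what torch_max() returns when given a dim argument (this assumes the torch package is installed; the tensor values are made up for illustration):

```r
library(torch)

# Two "rows" of three class scores each.
logits <- torch_tensor(rbind(c(0.1, 0.70, 0.20),
                             c(0.9, 0.05, 0.05)))

# With a dim argument, torch_max() returns a list:
# [[1]] holds the maximum values, [[2]] their (1-based) indices.
res <- torch_max(logits, dim = 2)
res[[1]]  # the per-row maximum values
res[[2]]  # their column indices, i.e. the predicted classes
```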
Trying to run the code chunks from the blog post:
https://github.com/rstudio/ai-blog/tree/master/_posts/2020-09-29-introducing-torch-for-r
I get reproducible failures at the central code chunk.
I tried this with the packages torch and torchvision installed today from GitHub, as suggested at the beginning of the blog.
R version:
RStudio 1.4.1623 (from the dailies)
That the rsession is using CUDA properly has been verified by:
I do not know enough about all this to debug the issue myself. I think this native torch package for R is a great way to get up-to-date NNs going in R. If my issue with the sample code is a general problem, it might hamper the success of the package, as it is a big hurdle at the start of a torch-in-R journey.