dahtah / imager

R package for image processing
GNU Lesser General Public License v3.0

Deep learning with imager error (deprecation in Dec 2015) #19

Closed (slimthealien closed 8 years ago)

slimthealien commented 8 years ago

https://www.r-bloggers.com/deep-learning-with-mxnetr/

In the "Classify Real-World Images with Pre-trained Model" section of this post, part of the code no longer works. As far as I can tell, this is because imager was deprecated in December 2015, which somehow affected the API. There is supposedly a fix for this code, but I can't find it anywhere online. Can anyone help?

Specifically, I get the following messages:

model <- mx.model.load("Inception/Inception_BN", iteration=39)
[08:32:51] d:\chhong\mxnet\src\operator./softmax_output-inl.h:292: Softmax symbol is renamed to SoftmaxOutput. This API will be deprecated in Dec, 2015

normed <- preproc.image(im, mean.img)
Error: Expecting a four-dimensional array

Thanks!

dahtah commented 8 years ago

The deprecation message is from the mxnet package. However, there is indeed a problem with preproc.image that's related to a change in the imager API. Here's the fix:

preproc.image <- function(im, mean.image) {
  # Crop the largest centred square.
  # In imager, dim(im) is (x, y, z, colour): shape[1] is the width (x)
  # and shape[2] is the height (y).
  shape <- dim(im)
  short.edge <- min(shape[1:2])
  xx <- floor((shape[1] - short.edge) / 2) + 1
  xend <- xx + short.edge - 1
  yy <- floor((shape[2] - short.edge) / 2) + 1
  yend <- yy + short.edge - 1
  # REMOVED (fails with the current imager API):
  # croped <- im[yy:yend, xx:xend,,]
  cropped <- imsub(im, x >= xx, x <= xend, y >= yy, y <= yend)
  # Resize to 224 x 224, as required by the model input.
  resized <- resize(cropped, 224, 224)
  # Convert to a plain array (x, y, channel).
  arr <- as.array(resized)
  dim(arr) <- c(224, 224, 3)
  # Subtract the mean image (the function argument, not a global variable).
  normed <- arr - mean.image
  # Reshape to the format needed by mxnet (width, height, channel, num).
  dim(normed) <- c(224, 224, 3, 1)
  return(normed)
}
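
The centre-crop arithmetic in the function above can be checked in isolation, without imager or mxnet. The sketch below uses a hypothetical helper, center_crop_bounds (not part of imager), and assumes the imager convention that the first dimension is x (width) and the second is y (height).

```r
# Hypothetical helper (not part of imager): bounds of the largest centred
# square inside an image of the given width and height, using 1-based indices.
center_crop_bounds <- function(width, height) {
  short_edge <- min(width, height)
  x_start <- floor((width - short_edge) / 2) + 1
  y_start <- floor((height - short_edge) / 2) + 1
  list(
    x = c(x_start, x_start + short_edge - 1),
    y = c(y_start, y_start + short_edge - 1)
  )
}

# A 400 x 200 (width x height) image: the crop keeps all rows and the
# middle 200 columns.
bounds <- center_crop_bounds(width = 400, height = 200)
print(bounds$x)  # 101 300
print(bounds$y)  # 1 200
```

For a square image both ranges span the full image, so the crop leaves it unchanged.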

You should let the author of the blog post know.

slimthealien commented 8 years ago

Thanks!! It does run now. However, I ran two vastly different images, and got the same results for each (as follows):

[1] "Predicted Top-classes: n04286575 spotlight, spot" "Predicted Top-classes: n03729826 matchstick"
[3] "Predicted Top-classes: n03916031 perfume, essence" "Predicted Top-classes: n04525038 velvet"
[5] "Predicted Top-classes: n04548280 wall clock"

Is there anything else that needs to be done to optimize the code?

dahtah commented 8 years ago

I really can't say, sorry, I know nothing at all about mxnet. You should really take it up with the authors.
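
One thing that may be worth checking, offered as a guess rather than a diagnosis: if the loaded image holds pixel values in [0, 1] while the mean image is on a 0-255 scale, then subtracting the mean leaves almost no trace of the image itself, and every input looks nearly the same to the network. The base-R sketch below illustrates the effect with made-up data. The value ranges, the array names, and the factor of 255 are all assumptions, not something confirmed in this thread.

```r
# Made-up stand-ins: two unrelated "images" with values in [0, 1] and a
# mean image on a 0-255 scale. None of these come from the actual model files.
set.seed(1)
n <- 224 * 224 * 3
img_a <- array(runif(n), dim = c(224, 224, 3))
img_b <- array(runif(n), dim = c(224, 224, 3))
mean_img <- array(runif(n, min = 100, max = 130), dim = c(224, 224, 3))

# Without rescaling, the mean dominates: the two normalised inputs are
# almost perfectly correlated even though the images are unrelated.
cor_unscaled <- cor(as.vector(img_a - mean_img), as.vector(img_b - mean_img))

# Rescaling to 0-255 before subtracting lets the image content show through.
cor_scaled <- cor(as.vector(img_a * 255 - mean_img),
                  as.vector(img_b * 255 - mean_img))

print(c(unscaled = cor_unscaled, scaled = cor_scaled))
```

If range(as.array(im)) turns out to be within [0, 1], multiplying the array by 255 before subtracting the mean would be the corresponding change to try in preproc.image.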

slimthealien commented 8 years ago

Will do, thanks for the code help regardless.