There is a new API for preprocessing applications. Please take a look at ?keras3::application_preprocess_inputs()
Thanks for directing me to the right place! I think I was able to adjust the code to work with the new API, but I don't think it was necessarily the best method. This was for Listing 8.20 in Chapter 8. I did have one question that I couldn't figure out an answer to:
Does application_preprocess_inputs() work with R arrays? I couldn't get it to work, so I kept them as tensors, which I did eventually get to work. From the documentation, I thought it might work with R arrays since the data_format argument mentions tensors/arrays, but I wasn't sure since the example used tensors.
Thanks for all your help!
> Does application_preprocess_inputs() work with R arrays?
I would expect it to take R arrays as input, and return tensors or numpy arrays. I'll take a look at this also when I look at #15.
Thanks for reporting.
> I would expect it to take R arrays as input, and return tensors or numpy arrays
I looked into this issue a bit further because I couldn't understand why it wasn't working with R arrays. I'm not sure I'm doing this right, but I think my issue is caused by how the tensors are converted to R arrays. Here is the example code from the book with imagenet_preprocess_input() replaced by application_preprocess_inputs():
iterator <- as_array_iterator(dataset)
for (i in 1:n_batches) {
# Assign the images and labels from the next batch
c(images, labels) %<-% iter_next(iterator)
# Preprocess images with respect to the model
preprocessed_images <- application_preprocess_inputs(model, images)
This gives an error: ValueError: output array is read-only. I think I was able to recreate an example that generates the same issue and shows where it may be coming from:
library(keras3)
library(tensorflow)
#>
#> Attaching package: 'tensorflow'
#> The following objects are masked from 'package:keras3':
#>
#> set_random_seed, shape
library(tfdatasets)
#>
#> Attaching package: 'tfdatasets'
#> The following object is masked from 'package:keras3':
#>
#> shape
library(reticulate)
conv_base <- application_vgg16(weights = "imagenet",
include_top = FALSE)
freeze_weights(conv_base)
# Cases
r_array <- array(dim = c(100, 100, 3), sample(0:255, replace = T))
r_array_to_tensor <- as_tensor(r_array)
r_array_loop <- as.array(r_array_to_tensor)
tensor <- tf$ones(shape(100, 100, 3))
tensor_to_r_array <- as.array(tensor)
tensor_int <- tf$ones(shape(100, 100, 3), dtype = "int32")
tensor_int_to_r_array <- as.array(tensor_int)
# Processed tensors/arrays
processed_r_array <- application_preprocess_inputs(conv_base, r_array)
processed_r_array_to_tensor <- application_preprocess_inputs(conv_base, r_array_to_tensor)
processed_r_array_loop <- application_preprocess_inputs(conv_base, r_array_loop)
processed_tensor <- application_preprocess_inputs(conv_base, tensor)
processed_tensor_to_r_array <- application_preprocess_inputs(conv_base, tensor_to_r_array)
#> output array is read-only
processed_tensor_int_to_r_array <- application_preprocess_inputs(conv_base, tensor_int_to_r_array)
# Types
typeof(r_array)
#> [1] "integer"
typeof(r_array_loop)
#> [1] "integer"
typeof(tensor_to_r_array)
#> [1] "double"
typeof(tensor_int)
#> [1] "environment"
typeof(tensor_int_to_r_array)
#> [1] "integer"
# Dtypes
r_array_to_tensor$dtype
#> tf.int32
tensor$dtype
#> tf.float32
tensor_int$dtype
#> tf.int32
Created on 2024-04-13 with reprex v2.1.0
It turns out that application_preprocess_inputs() does in fact accept R arrays and tensors, but it doesn't like R arrays that are doubles. I was able to take an R array, convert it to a tensor, and then back to an R array (r_array_loop) and it had no problem. It also worked if I specified dtype = "int32" when creating the tensor, but not with the default value.
I noticed in the example in the book that image_dataset_from_directory() created Prefetch Dataset objects with dtype = float32. Could that be what's causing the problem, since application_preprocess_inputs() doesn't seem to like R arrays that are double, which was the output from as_array_iterator()? If so, is there a way that I can change the type for the images? I saw that there's an option to change labels to int32, but I didn't see one for the images.
I'm not sure if this is helpful, but I was trying to make a basic reprex for this issue and of course the first R array worked. I ended up going down a rabbit hole and thought it may be useful for troubleshooting. Let me know if there's any additional information I can provide.
Thanks!
Sorry, I forgot to include the full error message information:
Error in py_call_impl(callable, call_args$unnamed, call_args$named) : ValueError: output array is read-only
── Python Exception Message ──────────────────────────────
Traceback (most recent call last):
File "/Users/<user_name>/.virtualenvs/r-tensorflow/lib/python3.10/site-packages/keras/src/applications/vgg16.py", line 231, in preprocess_input
return imagenet_utils.preprocess_input(
File "/Users/<user_name>/.virtualenvs/r-tensorflow/lib/python3.10/site-packages/keras/src/applications/imagenet_utils.py", line 104, in preprocess_input
return _preprocess_numpy_input(x, data_format=data_format, mode=mode)
File "/Users/<user_name>/.virtualenvs/r-tensorflow/lib/python3.10/site-packages/keras/src/applications/imagenet_utils.py", line 224, in _preprocess_numpy_input
x[..., 0] -= mean[0]
ValueError: output array is read-only
── R Traceback ───────────────────────────────────────────
▆
1. └─keras3::application_preprocess_inputs(conv_base, tensor_to_r_array)
2. └─reticulate (local) preprocess_input(x, data_format = data_format, ...)
3. └─reticulate:::py_call_impl(callable, call_args$unnamed, call_args$named)
>
ValueError: output array is read-only
When an R array gets converted to Python, it is not automatically copied. Reticulate instead constructs a NumPy array object, which is just a pointer to the data of the R array. Because the NumPy array does not "own" the data, it is marked as not writeable.
The preprocess_input() function wants to modify the provided NumPy array in place (i.e., it needs a writeable array), and they note that if you don't want that, you can call x.copy() directly:
https://github.com/keras-team/keras/blob/b267f939b70970e38eec20e3977a1bf41cec8cdd/keras/applications/imagenet_utils.py#L40
library(reticulate)
x <- r_to_py(array(c(1, 2, 3, 4), c(2, 2)))
x$flags
#> C_CONTIGUOUS : False
#> F_CONTIGUOUS : True
#> OWNDATA : False
#> WRITEABLE : False
#> ALIGNED : True
#> WRITEBACKIFCOPY : False
x$flags$writeable
#> False
x2 <- x$copy()
x2$flags
#> C_CONTIGUOUS : True
#> F_CONTIGUOUS : False
#> OWNDATA : True
#> WRITEABLE : True
#> ALIGNED : True
#> WRITEBACKIFCOPY : False
x2$flags$writeable
#> True
Created on 2024-04-13 with reprex v2.1.0
Note that the copy operation also changed the layout of the data, from F_CONTIGUOUS to C_CONTIGUOUS. Most Python code expects a C array, so that's good.
Instead of calling x$copy(), we can also use reticulate::np_array() to do the same.
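As a rough sketch of the np_array() route mentioned above: the small array below is made-up illustrative data, and the expectation that the result is writeable and C-ordered rests on the comment above rather than on anything verified here. It assumes reticulate can find a Python environment with NumPy installed.

```r
library(reticulate)

# A small double array standing in for a batch of images (illustrative data).
x <- array(c(1, 2, 3, 4), dim = c(2, 2))

# Plain conversion: a NumPy object that points at R's memory.
x_view <- r_to_py(x)
x_view$flags$writeable

# np_array() conversion, requesting C (row-major) ordering explicitly.
# Per the comment above, this is expected to give a writeable copy.
x_np <- np_array(x, order = "C")
x_np$flags$writeable
x_np$flags$c_contiguous
```

On keras3 versions that still show the error, passing such a copy (rather than the plain R array) to application_preprocess_inputs() would presumably avoid the in-place write failure, though that call is not tested here.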
I consider this a bug in keras3: the R function application_preprocess_inputs() should take care of these details and do the copy operation as needed.
I verified that this now works with the new keras3 package (0.2.0). Thanks for all of your help!
I'm working through the fast feature extraction example in Chapter 8, and I seem to be stuck at the function for extracting features from the datasets. The function uses imagenet_preprocess_input(), but it doesn't seem to be included in the keras3 package. It wasn't autocompleting, nor could I find it using ls(package:keras3). I was able to find it in the documentation for keras 2.1.3.
Here's the error:
Is imagenet_preprocess_input() not included in the keras3 package? If so, is there an alternative method I can use to get the same result? Let me know if there is any additional information I can provide.
Thank you!