DaChro / ogh_summer_school_2020

Material for the session "Introduction to Deep Learning in R for the analysis of UAV-based remote sensing data"

Tensor object has no attribute ‘numpy’ #1

Closed by Kalondepatrick 2 years ago

Kalondepatrick commented 2 years ago

I am learning to build convolutional neural networks for analyzing drone imagery, following this tutorial and using the code and functions presented here. However, there is a small problem: when I run tensor_slices_dataset(), I get an error saying that the tensor object has no attribute ‘numpy’.

While searching for a solution, I found an online tutorial saying that right after loading the tensorflow library I have to call enable_eager_execution(). However, when I do that, my R session is terminated immediately with the message that R encountered a fatal error.

From my attempts so far, here is the interesting thing: the problem only occurs on my Windows computer, not on my Mac.
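When a problem shows up on one machine but not on another, a side-by-side report of the operating system, R version, and package versions is often the quickest way to spot a difference. The following is a minimal sketch in base R; the package list is an assumption based on the libraries loaded in the posted script, and the name env_report is made up for illustration.

```r
# Packages to report on (assumed from the script's library() calls).
pkgs <- c("tensorflow", "keras", "tfdatasets", "reticulate")

# Look up each installed version; NA marks a package that is not installed.
pkg_versions <- vapply(pkgs, function(p) {
  if (requireNamespace(p, quietly = TRUE)) {
    as.character(utils::packageVersion(p))
  } else {
    NA_character_
  }
}, character(1))

# Collect everything in one list that can be pasted into an issue.
env_report <- list(
  os       = Sys.info()[["sysname"]],
  r        = R.version.string,
  packages = pkg_versions
)
str(env_report)
```

Running this on both computers and comparing the two outputs shows whether the installations differ.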

On my Mac, that step runs smoothly with no errors, and I only run into problems when I map over my dataset with training_dataset <- dataset_map(training_dataset, function(.x) list_modify(.x, img = tf$image$decode_jpeg(tf$io$read_file(.x$img)))). I get the error: Error in py_call_impl(callable, dots$args, dots$keywords) : RuntimeError: in user code.

DaChro commented 2 years ago

Hi Patrick, eager execution should be enabled by default, so I don't think this will solve your problem. Could you please give a minimal example to reproduce your error when using tensor_slices_dataset()?
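A reproduction along these lines could be as small as the sketch below. It checks whether eager execution is active and then builds a dataset from a two-row toy data frame. The file names are placeholders and the names tf_ready and toy are made up for illustration. The TensorFlow part is skipped when TensorFlow is not available in the session.

```r
# Toy stand-in for the training data: two placeholder paths and labels.
toy <- data.frame(
  img = c("a.jpg", "b.jpg"),
  lbl = c(1L, 0L),
  stringsAsFactors = FALSE  # keep paths as character on R versions before 4.0
)

# Only touch TensorFlow if the R packages and the Python module are present.
tf_ready <- requireNamespace("tensorflow", quietly = TRUE) &&
  requireNamespace("tfdatasets", quietly = TRUE) &&
  reticulate::py_module_available("tensorflow")

if (tf_ready) {
  library(tensorflow)
  library(tfdatasets)
  print(tf_version())              # installed TensorFlow version
  print(tf$executing_eagerly())    # TRUE when eager execution is active
  toy_dataset <- tensor_slices_dataset(toy)
  print(toy_dataset)
} else {
  message("TensorFlow is not available in this R session; skipping the check.")
}
```

If the error already appears with the toy data frame, the image files themselves can be ruled out as the cause.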

Kalondepatrick commented 2 years ago

#############################################
# Setting up the environment
#############################################
# Libraries
library(keras)
library(tensorflow)
library(tfdatasets)
library(purrr)
library(ggplot2)
library(rsample)
library(stars)
library(raster)
library(reticulate)
library(mapview)

##################################################
# Building the Model
##################################################
first_model <- keras_model_sequential()
layer_conv_2d(first_model, filters = 32, kernel_size = 3, activation = "relu", input_shape = c(128,128,3))
layer_max_pooling_2d(first_model, pool_size = c(2,2))
layer_conv_2d(first_model, filters = 64, kernel_size = c(3,3), activation = "relu")
layer_max_pooling_2d(first_model, pool_size = c(2,2))
layer_conv_2d(first_model, filters = 128, kernel_size = c(3,3), activation = "relu")
layer_max_pooling_2d(first_model, pool_size = c(2,2))
layer_conv_2d(first_model, filters = 128, kernel_size = c(3,3), activation = "relu")
layer_max_pooling_2d(first_model, pool_size = c(2,2))
layer_flatten(first_model)
layer_dense(first_model, units = 256, activation = "relu")
layer_dense(first_model, units = 1, activation = "sigmoid")

# The model
summary(first_model)

##################################################
# Preparing the data
##################################################
# Getting all file paths containing our targets
subset_list <- list.files("./training/true", full.names = T)

# Create a dataframe with two columns: file paths and labels (1)
data_true <- data.frame(image=subset_list, lbl=rep(1L, length(subset_list)))

# Getting all file paths containing non-targets
subset_list <- list.files("./training/false", full.names = T)

# Create a dataframe with two columns: file paths and labels (0)
data_false <- data.frame(image=subset_list, lbl=rep(0L, length(subset_list)))

# Merge the two dataframes
data <- rbind(data_true, data_false)

# Randomly split data into 75 percent training and 25 percent testing. The split should be proportional to the two categories
set.seed(2020)
data <- initial_split(data, prop = 0.75, strata = "lbl")

# Looking at the data
data
head(training(data))
c(nrow(training(data)[training(data)$lbl==0,]), nrow(training(data)[training(data)$lbl==1,]))
#Check equal split
#That is 0's and 1's

##########################################################
# WINDOWS FATAL ERROR
# OCCUR WITH THE NEXT FUNCTION
##########################################################
training_dataset <- tensor_slices_dataset(training(data))

# A List of all tensors
dataset_iterator <- as_iterator(training_dataset)
dataset_list <- iterate(dataset_iterator)
head(dataset_list)

# Get shape of the first model
subset_size <- first_model$input_shape[2:3]

##########################################################
# MACBOOK ERROR
# OCCUR WITH THE NEXT FUNCTION
##########################################################
# 1 Convert the images to a float
training_dataset <- dataset_map(training_dataset, function(.x) list_modify(.x, img = tf$image$decode_jpeg(tf$io$read_file(.x$img))))

# 2 Convert data type
training_dataset<- dataset_map(training_dataset, function(.x) list_modify(.x, img = tf$image$convert_image_dtype(.x$img, dtype = tf$float32)))

# Resize to the size expected by the model
training_dataset<- dataset_map(training_dataset, function(.x) list_modify(.x, img = tf$image$resize(.x$img, size = shape(subset_size[1], subset_size[1], subset_size[2]))))

# The data has 0's then targets. Shuffle the data
training_dataset<-dataset_shuffle(training_dataset, buffer_size = 10L*128)

# Create batches for data processing
training_dataset<-dataset_batch(training_dataset, 10L)

training_dataset<-dataset_map(training_dataset, unname)

DaChro commented 2 years ago

I cannot reproduce the error in my host environment or in the container; creating the first dataset works fine in both. However, I found a flaw in your code that probably explains the second error you get on your Mac: when you create the data_true and data_false data frames, you name the column containing the image paths "image", while the later call to tf$io$read_file looks for "img". I suggest you name the column "img" in both data frames, as in the tutorial, so that the rest of the script works as expected.
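The renaming can be sketched in base R as follows. The temporary folder and the empty .jpg files are hypothetical stand-ins for ./training/true and ./training/false, so the snippet runs without the real data. stringsAsFactors = FALSE is an added precaution, not part of the tutorial.

```r
# Hypothetical stand-in for the training folders, filled with empty files.
root <- tempfile("training_")
dir.create(file.path(root, "true"), recursive = TRUE)
dir.create(file.path(root, "false"), recursive = TRUE)
file.create(file.path(root, "true", c("t1.jpg", "t2.jpg")))
file.create(file.path(root, "false", c("f1.jpg", "f2.jpg", "f3.jpg")))

true_files  <- list.files(file.path(root, "true"), full.names = TRUE)
false_files <- list.files(file.path(root, "false"), full.names = TRUE)

# Name the path column "img", matching the later .x$img lookups.
data_true <- data.frame(img = true_files, lbl = rep(1L, length(true_files)),
                        stringsAsFactors = FALSE)
data_false <- data.frame(img = false_files, lbl = rep(0L, length(false_files)),
                         stringsAsFactors = FALSE)
data <- rbind(data_true, data_false)

# Fail early if the column name drifts away from what the pipeline expects.
stopifnot("img" %in% names(data))
```

A check like the final stopifnot() surfaces a naming mismatch before the dataset is built, rather than inside a dataset_map() call.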

Kalondepatrick commented 2 years ago

Thanks so much. Nice catch. It worked, except for the line that resizes the images to the size expected by the model, which fails with ValueError: images must have either 3 or 4 dimensions.
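One detail worth checking in the resize step: the size argument of tf$image$resize takes exactly two values, the new height and the new width, whereas the posted script passes three values to shape(). The sketch below shows a two-value version. The input_shape list is an assumed stand-in for first_model$input_shape, resize_img is a made-up helper name, and the thread does not confirm that this change resolves the reported ValueError.

```r
# Assumed value of first_model$input_shape: (batch, height, width, channels).
input_shape <- list(NULL, 128L, 128L, 3L)

# Keep height and width only.
subset_size <- unlist(input_shape[2:3])
stopifnot(length(subset_size) == 2)

# Hypothetical helper: resize the img element of one dataset record.
# Defining it does not require TensorFlow; calling it does.
resize_img <- function(.x) {
  purrr::list_modify(
    .x,
    img = tensorflow::tf$image$resize(
      .x$img,
      size = tensorflow::shape(subset_size[1], subset_size[2])
    )
  )
}

# Usage inside the pipeline (needs TensorFlow and an existing dataset):
# training_dataset <- tfdatasets::dataset_map(training_dataset, resize_img)
```

The error message itself refers to the rank of the image tensor, so it may also help to inspect the shape of .x$img just before the resize step.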

DaChro commented 2 years ago

Can you please send another minimal example that leads to this error so I can have a look? Thanks!