R 3D arrays incompatible with keras

CharlesWHarrison commented 3 years ago

**System information**.

> R version 3.6.3 (2020-02-29)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

> attached base packages: [1] stats     graphics  grDevices utils     datasets  methods   base     

> other attached packages: [1] tensorflow_2.5.0 keras_2.4.0     

> loaded via a namespace (and not attached): 
[1] Rcpp_1.0.4.6    lattice_0.20-38 zeallot_0.1.0   rappdirs_0.3.3  grid_3.6.3      R6_2.4.1        jsonlite_1.7.2 
 [8] magrittr_2.0.1  tfruns_1.5.0    whisker_0.4     Matrix_1.2-18   reticulate_1.18 generics_0.1.0  tools_3.6.3    
[15] abind_1.4-5     compiler_3.6.3  base64enc_0.1-3

**Describe the problem**

Python's 3D array differs from R's 3D array. The first dimension of a Python 3D array is the number of arrays, the second is the number of rows in each array, and the third is the number of columns in each array. In R, a 3D array's first dimension is the number of rows, the second dimension is the number of columns, and the third is the number of matrices (arrays). Python's 3D array is consistent with keras, but R's 3D array does not seem to be consistent with keras and I think this is causing a problem. Reshaping the R 3D array allows keras to run without error, but the structure of the data is changed and is no longer consistent with the underlying problem.

**Describe the current behavior**

library(keras)
library(tensorflow)
set.seed(1)
X <- array(data = rnorm(5*3*8), dim = c(5, 3, 8)) # ERROR, BUT HAS THE CORRECT STRUCTURE
# X <- array(data = X, dim = c(8, 5, 3)) # KERAS RUNS WITHOUT AN ERROR, BUT NOW THERE ARE ONLY 3 EXAMPLES INSTEAD OF 8
Y <- matrix(rnorm(8 * 2), nrow = 8, ncol = 2) # 16 draws, so the two response columns are not recycled copies
dim(X)
dim(Y)
model <- keras_model_sequential()
model %>% keras::layer_lstm(units = 25, activation = 'relu', return_sequences = TRUE) %>%
  keras::layer_lstm(units = 10, activation = 'relu') %>%
  keras::layer_dense(units = 2, activation = 'linear')
model %>% compile(optimizer = "rmsprop",
                  loss = "mse",
                  metrics = c("mean_absolute_error"))
model %>% fit(X, Y, epochs = 10)

The code above produces the following error:

>  Error in py_call_impl(callable, dots$args, dots$keywords) : 
  ValueError: Data cardinality is ambiguous:
  x sizes: 5
  y sizes: 8
Make sure all arrays contain the same number of samples.

A 3D array in R has 3 dimensions i,j,k where i is the number of rows in each matrix, j is the number of columns in each matrix, and k is the number of matrices. Keras expects the X input to have three dimensions as well, but there seems to be a conflict between R and keras. The keras library expects i to be the number of matrices (R uses number of rows in each matrix), j to be the number of rows (R uses number of columns), and k to be the number of columns (R uses number of matrices).

In the example above, I have 8 samples, 5 time steps, and 3 features, so my X has dimension 5 x 3 x 8 in R. My response Y has dimension 8 (rows) by 2 (columns). However, keras expects X to have dimension 8 x 5 x 3.

**Attempted solution**

To fix the problem, I tried reshaping the data with the code below. This reshapes the array to 8 x 5 x 3, but 8 x 5 x 3 in R has a different meaning from 8 x 5 x 3 in keras: in R it means there are only 3 samples, each with 8 rows and 5 columns.

`X <- array(data = X, dim = c(8, 5, 3)) # KERAS RUNS WITHOUT AN ERROR, BUT NOW THERE ARE ONLY 3 EXAMPLES INSTEAD OF 8`

> , , 1

           [,1]        [,2]        [,3]        [,4]        [,5]
[1,] -0.6264538  0.57578135 -0.01619026  0.61982575  0.38767161
[2,]  0.1836433 -0.30538839  0.94383621 -0.05612874 -0.05380504
[3,] -0.8356286  1.51178117  0.82122120 -0.15579551 -1.37705956
[4,]  1.5952808  0.38984324  0.59390132 -1.47075238 -0.41499456
[5,]  0.3295078 -0.62124058  0.91897737 -0.47815006 -0.39428995
[6,] -0.8204684 -2.21469989  0.78213630  0.41794156 -0.05931340
[7,]  0.4874291  1.12493092  0.07456498  1.35867955  1.10002537
[8,]  0.7383247 -0.04493361 -1.98935170 -0.10278773  0.76317575

, , 2

           [,1]       [,2]        [,3]       [,4]         [,5]
[1,] -0.1645236 -0.1123462 -0.36722148 -0.7432732  0.610726353
[2,] -0.2533617  0.8811077 -1.04413463  0.1887923 -0.934097632
[3,]  0.6969634  0.3981059  0.56971963 -1.8049586 -1.253633400
[4,]  0.5566632 -0.6120264 -0.13505460  1.4655549  0.291446236
[5,] -0.6887557  0.3411197  2.40161776  0.1532533 -0.443291873
[6,] -0.7074952 -1.1293631 -0.03924000  2.1726117  0.001105352
[7,]  0.3645820  1.4330237  0.68973936  0.4755095  0.074341324
[8,]  0.7685329  1.9803999  0.02800216 -0.7099464 -0.589520946

, , 3

           [,1]       [,2]        [,3]       [,4]       [,5]
[1,] -0.5686687  0.3700188 -1.27659221 -0.6545846  1.4322822
[2,] -0.1351786  0.2670988 -0.57326541  1.7672873 -0.6506964
[3,]  1.1780870 -0.5425200 -1.22461261  0.7167075 -0.2073807
[4,] -1.5235668  1.2078678 -0.47340064  0.9101742 -0.3928079
[5,]  0.5939462  1.1604026 -0.62036668  0.3841854 -0.3199929
[6,]  0.3329504  0.7002136  0.04211587  1.6821761 -0.2791133
[7,]  1.0630998  1.5868335 -0.91092165 -0.6357365  0.4941883
[8,] -0.3041839  0.5584864  0.15802877 -0.4616447 -0.1773305
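
For comparison, here is a minimal base R sketch of the difference between reshaping and permuting. It assumes the data really are stored as 5 timesteps x 3 features x 8 samples, as in the example above. `aperm()` is base R; the names `X_perm` and `X_resh` are illustrative only and do not come from keras.

```r
# Same X as in the report: samples are slices along the third dimension.
set.seed(1)
X <- array(rnorm(5 * 3 * 8), dim = c(5, 3, 8))

# aperm() moves the sample axis to the front and keeps every value at its
# original (timestep, feature, sample) position: X_perm[k, i, j] == X[i, j, k].
X_perm <- aperm(X, c(3, 1, 2))
dim(X_perm)  # 8 5 3

# array(..., dim = ) only re-labels the underlying vector, so values from
# different samples get mixed together.
X_resh <- array(X, dim = c(8, 5, 3))

identical(X_perm[2, , ], X[, , 2])  # TRUE: sample 2 is intact
identical(X_resh[2, , ], X[, , 2])  # FALSE: sample 2 has been scrambled
```

Under this assumption, `X_perm` rather than `X_resh` would be the array to pass to `fit()`, since its first dimension indexes the 8 samples and each `X_perm[i, , ]` is still the original 5 x 3 matrix for sample `i`.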

t-kalinowski commented 3 years ago

Hi, thanks for filing. You are correct that R arrays use a different memory ordering from default numpy arrays (Fortran ordering vs C ordering). However, at least for keras/tensorflow, memory ordering is independent of how array dimensions are interpreted. When you call `fit(X)` with `dim(X)` equal to `c(8, 5, 3)`, keras understands that you've provided a batch with 8 cases, 5 timesteps, and 3 features per timestep. This holds regardless of whether X is Fortran ordered or C ordered under the hood.
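
The distinction between memory ordering and dimension meaning can be checked in base R without keras. Below is a small sketch using a hypothetical array filled with `seq_len()`, so that each value equals its own position in memory; the name `sample_1` is illustrative only.

```r
# Hypothetical 8 x 5 x 3 array whose values are their positions in memory.
X <- array(seq_len(8 * 5 * 3), dim = c(8, 5, 3))

# R stores arrays column-major (Fortran order): the first index varies fastest.
X[2, 1, 1]  # 2
X[1, 2, 1]  # 9
X[1, 1, 2]  # 41

# Indexing the first dimension still yields one 5 x 3 slice,
# which is how a batch of 8 cases x 5 timesteps x 3 features is read.
sample_1 <- X[1, , ]
dim(sample_1)  # 5 3

# print(X) shows `, , k` slices only because R prints along the last
# dimension; that layout does not change what each dimension means.
dim(X[, , 1])  # 8 5
```

The `, , 1`, `, , 2`, `, , 3` blocks in the printed output are therefore a display convention: each block holds one feature across all 8 cases and 5 timesteps, not one sample.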

CharlesWHarrison commented 3 years ago

Hello, thank you very much for your reply. I understand now, so thank you. Also, perhaps this should be documented as it differs from keras' behavior in Python.

t-kalinowski commented 3 years ago

Hi, thanks.

What do you mean by it differs from the behavior in Python? As far as I know, if you construct a Fortran-ordered numpy array and pass it to `fit`, it is interpreted the same way as a C-ordered array, both in R and Python.

CharlesWHarrison commented 3 years ago

Sorry for the confusion. I mean that the R array passed to keras looks different from the equivalent Python array, and this may be confusing to some R users.

rstudio / keras3

R 3D arrays incompatible with keras #1262