Just a thought, but without any kind of checks (e.g. a negative check, or requiring left/top to be less than right/bottom) you can wind up with a slice like image[:, 5:2, 6:-1, :], which would wind up with a generic shape of (None, 0, 714, 3).
I don't think that has to do with your error, as the above shape with a 0 dimension is valid. But you should check out some examples of the Merge layer, especially with a custom mode, because at the very least you will also need to supply a function for output_shape since you are using a custom mode.
I'd recommend getting it working without doing the cropping first (e.g. just return the image in your merger), as you'll need to do the output_shape function as well, and then add in the actual cropping.
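For reference, here is a minimal sketch of that "return the image unchanged" starting point with the Keras 1.x Merge API (the model names s1 and s2 are placeholders for the two Sequential models, not code from this thread):

```python
from keras.layers import Merge

def merger(tensors):
    indexes, image = tensors      # same input order as the merger discussed below
    return image                  # step 1: pass the image through unchanged

def merger_output_shape(input_shapes):
    indexes_shape, image_shape = input_shapes
    return image_shape            # the output has the same shape as the image input

# With a callable mode, Keras cannot infer the output shape, so output_shape
# has to be supplied explicitly.
merged = Merge([s1, s2], mode=merger, output_shape=merger_output_shape)
```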
Actually, now that I'm thinking about it, you're going to run into issues whenever you have batch_size > 1. Tensors are essentially multidimensional matrices, so you can't have subtensors of mismatched dimensions. In other words, you can't really combine tensors of shapes (851, 254, 3) and (498, 387, 3) automatically; you would have to pad up to (851, 387, 3) and probably keep a reference list of the valid areas. So depending on what you are trying to do, it will probably just be better to keep the full images plus the diagonal corner info, and not do any cropping. But again, that's dependent on what you are trying to accomplish.
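To illustrate the padding idea, a rough NumPy sketch (the shapes are taken from the example above; pad_to is just a helper made up for this sketch):

```python
import numpy as np

a = np.zeros((851, 254, 3), dtype=np.float32)   # first "crop"
b = np.zeros((498, 387, 3), dtype=np.float32)   # second "crop"

# Common shape that both crops fit into
target = (max(a.shape[0], b.shape[0]), max(a.shape[1], b.shape[1]), 3)  # (851, 387, 3)

def pad_to(img, shape):
    out = np.zeros(shape, dtype=img.dtype)
    out[:img.shape[0], :img.shape[1], :] = img
    return out

batch = np.stack([pad_to(a, target), pad_to(b, target)])  # shape (2, 851, 387, 3)
valid_areas = [a.shape[:2], b.shape[:2]]                  # reference list of the valid regions
```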
Of course I'll have to implement checks to get a valid window (left corner to the left of the right corner, etc.), but that's not the problem here: the error I posted was raised before any attempt at training, which is why I have not supplied an output_shape function in my issue.
But actually I had not foreseen the batch_size problem... Getting around it could be complicated.
Fair enough, but I still recommend starting smaller.
That said, your error is ValueError: ('TensorType could not be cast to have 0 dimensions', TensorType(float32, vector)). This says that a float32 vector could not be cast to a 0-dimensional item. Looking at the code, and simplifying to a minimal slice attempt:
```python
def merger(l):
    image = l[1]
    indexes = l[0]
    index_0 = indexes[:, 0]
    return image[:, index_0:, :, :]
```
...still throws this error. We can now say that slicing by index_0 is the issue; index_0 is a slice of shape (None,), which is a vector. Trying to slice by a vector is nonsensical in Theano, as it expects the slice index (index_0) to be an integer, which is a 0-dimensional item. So, the error in English is saying: "Expected an integer, got a vector of floats; cannot cast a vector of floats to an integer."
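To make that concrete, here is a standalone Theano sketch (not the model from this issue, just the slicing behaviour) showing that a symbolic 0-dimensional integer is accepted as a slice bound while a vector is not:

```python
import theano.tensor as T

image = T.tensor4('image')   # (batch, rows, cols, channels)

i = T.iscalar('i')           # 0-dimensional integer
ok = image[:, i:, :, :]      # fine: the slice bound is a scalar

v = T.fvector('v')           # a vector of floats, like indexes[:, 0] above
bad = image[:, v:, :, :]     # expected to raise the same ValueError as above
```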
In other words, your issues are:
So, your options are:

- Use a batch_size of 1, so you can index using 4 integers. There would be no issues with matrices, dimensions, etc.

I understood the issue and I'm going to follow your advice. Thanks for the patience, I learned something in the process! ;D
Anyway, just for the sake of completeness, I conceived the following workaround:
```python
import theano as T  # implied by the T.tensor.* calls below

def merger(l):
    image = l[1]
    indexes = T.tensor.iround(l[0])
    index_0 = indexes[:, 0]
    index_1 = indexes[:, 1]
    index_2 = indexes[:, 2]
    index_3 = indexes[:, 3]
    nb_of_samples = T.tensor.shape(index_0)[0]
    cropped_image = T.tensor.zeros_like(image)
    # Copy each sample's window into an otherwise-zero image of the same shape
    for i in range(nb_of_samples):
        rows = slice(min(index_0[i], index_1[i]), max(index_0[i], index_1[i]) + 1)
        cols = slice(min(index_2[i], index_3[i]), max(index_2[i], index_3[i]) + 1)
        cropped_image = T.tensor.set_subtensor(cropped_image[i, rows, cols, :],
                                               image[i, rows, cols, :])
    return cropped_image
```
which almost does what I expect it to do. In fact, it still throws 'TensorVariable' object cannot be interpreted as an integer, even though index_j[i] is a TensorVariable with dtype=int64 and scalar dimension.
You've switched from symbolic computing (Theano) to Python computing via the use of a for loop; Python works with integers/floats/etc. and Theano works with tensors. You'll need to either convert everything to integers/NumPy arrays, do your loop on the CPU, and convert back to Theano tensors (which will in all likelihood break the graph and autodifferentiation), or use the Theano looping functionality via scan.
If the number of iterations of the loop is small and fixed, then you can use the for loop to build a bigger Theano graph without using scan.
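For reference, here is a rough sketch of what the scan route could look like for this crop-and-zero-out operation (the variable names follow the workaround above and are assumptions, not code from this thread):

```python
import theano
import theano.tensor as T

image = T.tensor4('image')      # (batch, rows, cols, channels)
indexes = T.matrix('indexes')   # (batch, 4) real-valued coordinates

def crop_one(idx, img):
    # idx: the 4 rounded coordinates of one sample; img: one image of the batch
    r0 = T.minimum(idx[0], idx[1])
    r1 = T.maximum(idx[0], idx[1]) + 1
    c0 = T.minimum(idx[2], idx[3])
    c1 = T.maximum(idx[2], idx[3]) + 1
    out = T.zeros_like(img)
    return T.set_subtensor(out[r0:r1, c0:c1, :], img[r0:r1, c0:c1, :])

# scan iterates over the leading (batch) axis of both sequences
cropped, _ = theano.scan(fn=crop_one,
                         sequences=[T.iround(indexes), image])
```

As noted below, the rounded indices are still discrete, so this only addresses the looping problem, not differentiability.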
@LukeMathWalker the trickier issue you are going to have is when you train. If your network outputs indexes, those are discrete steps, so backprop won't work.
If you want to be able to backprop so your network actually learns how to crop, you need to make everything differentiable. That means the crops are real numbers and you interpolate to get fractional cropping.
Depending on what you're trying to do, you could rescale all of the crops to the same size, and then you could do multiple crops in a batch.
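One way to picture the "fractional cropping" idea (an illustration of the general trick with a soft mask, not code from this thread): instead of hard integer slicing, build a smooth window from the real-valued coordinates so that gradients flow back to them.

```python
import theano.tensor as T

def soft_crop_mask(low, high, length, sharpness=10.0):
    # Positions 0..length-1; the mask is close to 1 between `low` and `high`
    # and falls off smoothly outside, so it is differentiable in low/high.
    x = T.arange(length, dtype='float32')
    return T.nnet.sigmoid(sharpness * (x - low)) * T.nnet.sigmoid(sharpness * (high - x))

# Placeholder per-sample inputs (assumed names, not from the original model)
top, bottom = T.fscalar('top'), T.fscalar('bottom')
left, right = T.fscalar('left'), T.fscalar('right')
image_sample = T.tensor3('image_sample')            # (rows, cols, channels)
n_rows, n_cols = image_sample.shape[0], image_sample.shape[1]

# Outer-product the row and column masks, then multiply into the image so
# everything outside the predicted window fades to zero.
row_mask = soft_crop_mask(top, bottom, n_rows)      # shape (n_rows,)
col_mask = soft_crop_mask(left, right, n_cols)      # shape (n_cols,)
window = row_mask[:, None] * col_mask[None, :]      # shape (n_rows, n_cols)
masked = image_sample * window[:, :, None]          # broadcast over channels
```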
I'm trying to implement the following architecture with Keras (Theano backend).
I have a first Sequential network (say S1) which takes an image as input and has 4 linear outputs, which correspond to the upper-left and bottom-right coordinates of a rectangular "window" in the input image that is supposed to contain the object I want to identify.
Once I have those four outputs... I'd like to actually crop the image!
So I thought to take my image again as input in a new Sequential network (say S2) and merge these two networks using a Merge layer with a function as its mode.
It turned out to be a bit more complicated than I expected, and I'm stuck with some Theano errors I can't get rid of.
Here's the relevant part of the code:
And here is the error:
What's going on?