Open · AlexanderLutsenko opened this issue 8 months ago
Hi @AlexanderLutsenko,
Replicated the reported issue with Keras 3; Keras 2 works fine. Attached a gist for reference.
We may need to check whether this is a bug or an intended design change. Thanks!
cc: @VarunS1997
Hi @AlexanderLutsenko, I agree this is a pain point with Keras: working across different tensor types, we haven't yet standardized an indexing method. Ideally we want users to be able to create an `ops.array()` and have access to indexing in all backends. We are actively exploring solutions to this.
We are aiming for an operator such that, for example, the corresponding Keras translation of the PyTorch code you posted would be `y = ops.at(x)[:, l // 2]`. Note that for the time being this is experimental and a work in progress.
@grasskin Thanks for clarifying!
One thing I still don't understand is why it works fine inside a custom layer:
```python
import keras


class CustomLayer(keras.Layer):
    def call(self, x):
        # Inside call(), x is a concrete backend tensor, so ops.shape()
        # yields usable values and __getitem__ works.
        b, l = keras.ops.shape(x)
        y = x[:, l // 2]
        return y


def TestModel():
    x = keras.Input(batch_shape=(1, None))
    y = CustomLayer()(x)
    return keras.Model(inputs=x, outputs=y)
```
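For example, this runs as expected (a quick sanity check; the input length 9 is arbitrary):

```python
import numpy as np

model = TestModel()
# Inside call(), x is a concrete backend tensor, so l is a real value (9)
# and the slice x[:, 4] succeeds:
print(model(np.ones((1, 9))))  # -> tensor of shape (1,)
```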
Can it be made to work without a custom layer? Is that on the to-do list?
Hello, thanks AlexanderLutsenko for raising this issue. Yes, I am facing the same issue with Keras 3.
Hi! Alexander's tool is an incredible breakthrough for anyone who wants to use models from different frameworks with Keras. It would be amazing to be able to update it to Keras 3 and take full advantage of its potential, so please fix this bug. Thanks!
Hello, nobuco is an amazing tool for migrating PyTorch models to Keras. I hope this issue gets resolved. Thank you!
+1
+1
Hi! Thank you guys for the better, cleaner new Keras! The promise of backend-agnostic models is just fantastic.
Problem is, the new framework seems to have lost some of its predecessor's capabilities. Here's an example that works perfectly in Keras 2:
**Keras 2**
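(The original snippet was not preserved in this copy; judging from the custom-layer example above, it was presumably along these lines:)

```python
# Keras 2 (TensorFlow backend): slicing a symbolic tensor along a dynamic
# axis works out of the box, since TF ops on Keras tensors are wrapped
# into layers automatically.
import tensorflow as tf
from tensorflow import keras

x = keras.Input(batch_shape=(1, None))
l = tf.shape(x)[1]    # dynamic sequence length
y = x[:, l // 2]      # works fine in Keras 2
model = keras.Model(inputs=x, outputs=y)
```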
I want the same in Keras 3 with minimal reliance on custom layers. Alas, the straightforward approach fails.
**Keras 3**
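(Likewise reconstructed; the direct Keras 3 analogue, which fails:)

```python
# Keras 3: the same indexing on a symbolic tensor fails, because the
# dynamic dimension is None at graph-construction time.
import keras

x = keras.Input(batch_shape=(1, None))
b, l = keras.ops.shape(x)   # l is None for the dynamic axis
y = x[:, l // 2]            # raises: None does not support //
model = keras.Model(inputs=x, outputs=y)
```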
Custom layers do work, but I'd like to avoid them because otherwise I would need to supply `custom_objects` on model loading. Another drawback is that Keras layers (in contrast to e.g. PyTorch modules) do not allow arbitrary input signatures; in my example, generalizing a `Slice` layer to the same degree as the `__getitem__` method is not possible.
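To illustrate the first drawback, a minimal sketch (the model filename is a placeholder, and `Slice` stands for the custom layer mentioned above):

```python
import keras

# Every consumer of the saved model must know about, and explicitly pass,
# the custom layer class at load time:
model = keras.models.load_model(
    "converted_model.keras",
    custom_objects={"Slice": Slice},
)
```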
**Why it matters**

There's a tool (nobuco) that converts PyTorch models to TensorFlow using Keras 2, and the approach has proved successful. Now I'd like to take it a step further and bring the awesomeness of JAX to the PyTorch crowd.
The converter establishes a one-to-one correspondence between PyTorch modules/ops and equivalent Keras implementations. Both have the same signatures, except for the tensors, which are framework-specific.
Below, we traced three PyTorch ops (`shape`, `__floordiv__`, and `__getitem__`), which we then convert to Keras independently of each other. That is why I want a generic `__getitem__` in Keras 3.
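A hypothetical sketch (not nobuco's actual API) of that op-by-op correspondence:

```python
from keras import ops

# Each traced PyTorch op maps to a Keras callable with the same signature;
# only the tensor type differs. __getitem__ is the one without a clean
# Keras 3 counterpart outside a layer.
op_converters = {
    "shape":        lambda x: ops.shape(x),
    "__floordiv__": lambda a, b: a // b,
    "__getitem__":  lambda x, idx: x[idx],
}
```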
So, about the new Keras' perceived lack of flexibility: is it a flaw or a deliberate design choice? Why do some ops only work when wrapped in a layer? Is there a workaround? Any help will be greatly appreciated.