@sachinprasadhs Were you able to look into this issue in the meantime?
It's pretty long code to read through, but at a glance, and given that you detected similar datasets being passed to `fit`: note that the `tf.data.Dataset` isn't being shuffled here; if you do shuffle it, you'd need a very large buffer, given that there are 500 items with the same image.
Errors may still show up afterwards; in that case I'd try to follow the tutorial, make sure that it runs, and use `PyDataset` to test any additions, otherwise debugging is very hard.
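For reference, a minimal sketch of the kind of shuffling being suggested, with dummy tensors standing in for the real pairs (all names, shapes, and sizes here are assumptions, not taken from the script in the issue):

```python
import tensorflow as tf

# Dummy stand-in for the pair dataset: each element is ((img_a, img_b), label).
# If ~500 consecutive elements share the same image, a small shuffle buffer
# barely mixes them; the buffer must span many such runs.
num_pairs = 10_000
pairs_ds = tf.data.Dataset.from_tensor_slices((
    (tf.zeros((num_pairs, 28, 28, 1)), tf.zeros((num_pairs, 28, 28, 1))),
    tf.zeros((num_pairs,), dtype=tf.float32),
))

pairs_ds = (
    pairs_ds
    # Buffer much larger than the run length of identical images;
    # ideally the full dataset size, if it fits in memory.
    .shuffle(buffer_size=num_pairs, reshuffle_each_iteration=True)
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)
)
```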
I have tried shuffling the data before passing it into the `tf.data.Dataset` generator, but it showed the same behavior. I have followed the tutorials wherever I could, but every tutorial and piece of documentation I found on Siamese networks only uses in-memory arrays instead of a `tf.data.Dataset` (which is a requirement in my case due to the size of my real dataset).
@Carl0smvs I replaced the pair generator with a generator adapted from the `make_pairs` function in the docs. The results seem fine.
The original code likely needs a double check of how the pairs are generated (for the generator case); there is likely a mistake there.
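The exact adaptation isn't shown above, but the sketch below illustrates how a `make_pairs`-style generator (one positive and one negative pair per anchor image, as in the Keras Siamese tutorial) can feed `tf.data.Dataset.from_generator`. The function name, shapes, class count, and dummy data are all assumptions:

```python
import random
import numpy as np
import tensorflow as tf

def pair_generator(x, y, num_classes):
    # Indices of the samples belonging to each class.
    class_indices = [np.where(y == i)[0] for i in range(num_classes)]
    for idx1 in range(len(x)):
        label1 = y[idx1]
        # Positive pair: a second image from the same class.
        idx2 = random.choice(class_indices[label1])
        yield (x[idx1], x[idx2]), 0.0
        # Negative pair: an image from a randomly chosen other class.
        label2 = random.choice([l for l in range(num_classes) if l != label1])
        idx2 = random.choice(class_indices[label2])
        yield (x[idx1], x[idx2]), 1.0
        # Label convention assumed here: 0.0 = same class, 1.0 = different;
        # flip it if your loss expects the opposite.

# Dummy data with every class represented, in place of the real dataset.
x = np.random.rand(100, 28, 28, 1).astype("float32")
y = np.repeat(np.arange(10), 10)

ds = tf.data.Dataset.from_generator(
    lambda: pair_generator(x, y, num_classes=10),
    output_signature=(
        (tf.TensorSpec(shape=(28, 28, 1), dtype=tf.float32),
         tf.TensorSpec(shape=(28, 28, 1), dtype=tf.float32)),
        tf.TensorSpec(shape=(), dtype=tf.float32),
    ),
).shuffle(1000).batch(32).prefetch(tf.data.AUTOTUNE)
```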
Okay, I will check it again and come back to you. Thank you for your time!
This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.
This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.
I am trying to train a custom-built Siamese network, following the Keras documentation closely and only modifying the architecture and other details as needed.
What I'm doing differently is how I load the data: due to the size of the dataset I will be working with, I can't store it all in memory as arrays, as is done in the example.
I tried loading it iteratively into the model in two ways:

- Using a `tf.data.Dataset` that loads the data as needed
- Building a custom training loop that only fetches the data batch by batch (a sketch of this variant follows below)

To test these, I put together an example script (supplied below) comparing how these approaches perform against supplying the whole dataset to the model from an in-memory array.
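A minimal sketch of what such a custom training loop can look like, with a toy two-input model and random batches standing in for the real batch-by-batch loader (this is not the script from the issue; every name here is hypothetical):

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

def build_siamese():
    # Toy twin encoder shared between both inputs.
    inp_a = keras.Input(shape=(28, 28, 1))
    inp_b = keras.Input(shape=(28, 28, 1))
    encoder = keras.Sequential([
        keras.layers.Flatten(),
        keras.layers.Dense(64, activation="relu"),
    ])
    emb_a, emb_b = encoder(inp_a), encoder(inp_b)
    # Absolute embedding difference, then a sigmoid "distinct?" score.
    diff = keras.layers.Lambda(lambda t: tf.abs(t[0] - t[1]))([emb_a, emb_b])
    out = keras.layers.Dense(1, activation="sigmoid")(diff)
    return keras.Model([inp_a, inp_b], out)

model = build_siamese()
optimizer = keras.optimizers.Adam()
loss_fn = keras.losses.BinaryCrossentropy()

def batch_iter(num_batches=10, batch_size=32):
    # Stand-in for fetching one batch at a time from disk.
    for _ in range(num_batches):
        a = np.random.rand(batch_size, 28, 28, 1).astype("float32")
        b = np.random.rand(batch_size, 28, 28, 1).astype("float32")
        labels = np.random.randint(0, 2, (batch_size, 1)).astype("float32")
        yield (a, b), labels

for (a, b), labels in batch_iter():
    with tf.GradientTape() as tape:
        preds = model([a, b], training=True)
        loss = loss_fn(labels, preds)
    grads = tape.gradient(loss, model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
```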
Surprisingly, both of my methods produce a weird behaviour where the trained model simply outputs the same class for every instance (classifying all image pairs as distinct), while with the in-memory approach the model trains well and converges to a good solution. I don't understand if I'm doing something wrong. I have checked the `tf.data.Dataset` I produce, and it contains exactly the same data as the in-memory arrays (as it's supposed to), and as far as I am aware I am doing things correctly according to the documentation. Can you reproduce the issue, and if so, tell whether I'm doing something wrong or whether this behaviour is related to something internal to TensorFlow/Keras that I am not aware of?
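For illustration, an element-by-element check of that kind can be written as below; the variable names and shapes are assumptions standing in for the issue's own code, not taken from it:

```python
import numpy as np
import tensorflow as tf

# Dummy stand-ins for the in-memory pairs and the dataset built from them.
pairs = np.random.rand(200, 2, 28, 28, 1).astype("float32")
labels = np.random.randint(0, 2, 200).astype("float32")
pairs_ds = tf.data.Dataset.from_tensor_slices(
    ((pairs[:, 0], pairs[:, 1]), labels)
)

# Before any shuffling, the dataset should replay the arrays exactly, in order.
for i, ((a, b), lbl) in enumerate(pairs_ds.as_numpy_iterator()):
    np.testing.assert_allclose(a, pairs[i, 0])
    np.testing.assert_allclose(b, pairs[i, 1])
    assert lbl == labels[i]
```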
I'm using TensorFlow 2.15.1 with Python 3.10, tested both on my OS (Linux Mint 21.2 Cinnamon) and in a container built from the Ubuntu image.
Code:
Logs: