Closed tonylincon1 closed 1 year ago
Any response?
Hi Tony,
Thanks for using TF Sim. I think the issue might be with how you are writing your TFRecord files. The TFRecordDatasetSampler uses the interleave function to randomly sample examples from K different TFRecord files, where K is equal to or greater than the number of classes in your dataset. Additionally, the number of examples in each TFRecord file must be an integer multiple of the number of examples per class per batch.
Let me know if that unblocks you. For more details, see https://github.com/tensorflow/similarity/issues/171 and https://github.com/tensorflow/similarity/issues/213
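To make the two constraints above concrete, here is a minimal sketch in plain Python (no TensorFlow needed) that plans which example indices would go into which file before anything is written. It assumes one file per class, which is only one way to get K equal to or greater than the number of classes; the linked issues may describe a different layout. The helper name plan_class_shards and its parameters are hypothetical and not part of TF Sim.

```python
from collections import defaultdict


def plan_class_shards(labels, examples_per_class):
    """Plan one shard per class (an assumed layout, not TF Sim's API).

    Returns {label: [example indices]} where every list length is a
    multiple of `examples_per_class`.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    shards = {}
    for label, indices in by_class.items():
        # Drop the remainder so the shard length divides evenly.
        usable = len(indices) - len(indices) % examples_per_class
        if usable == 0:
            # Too few examples to fill even one group; this class
            # needs more data or has to be left out of the dataset.
            continue
        shards[label] = indices[:usable]
    return shards


labels = [0, 0, 0, 1, 1, 1, 1, 1, 2]
shards = plan_class_shards(labels, examples_per_class=2)
for label, indices in shards.items():
    print(f"class {label}: {len(indices)} examples -> {indices}")
```

Each list in the plan could then be written to its own file, for example with tf.io.TFRecordWriter, so that every file holds a single class and a whole number of per-class groups.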
An alternative approach is to use the in memory sampler (or the new tf.data.Dataset sampler I'm working on in this branch). You can pass the URI to the images as the X values and then use the load_fn to read them per batch. See here for a working version using the MultiShotMemorySampler for images.
I will try the MultiShotMemorySampler. Thank you for the response xD
Thanks. Closing this for now but let us know if you run into any other issues.
Hello, how are you?
I really like the TensorFlow Similarity solution for making recommendations. However, I am having a hard time using tfsim.samplers.TFRecordDatasetSampler, which I need because I have too much data to keep in memory.
I tried the following way to save ".tfrecords" files:
count = write_images_to_tfr_short(x_train, y_train, filename=f"{data_path}small_images")
From this I was able to save two files with my images, and then I wrote the un-recording (decoding) function
When I try to use tfsim.samplers.TFRecordDatasetSampler the following error occurs