akarshzingade / image-similarity-deep-ranking


About get_train_generator #21

Open ha121ppy opened 6 years ago

ha121ppy commented 6 years ago

Dear akarshzingade, when I use the function get_train_generator, I notice that the output is much larger than the number of files given by triplets.txt. I then noticed that it copies all the filenames in triplets.txt once for each class. This is done by the following code in DirectoryIterator in ImageDataGeneratorCustom.py:

    for dirpath in (os.path.join(directory, subdir) for subdir in classes):
        results.append(pool.apply_async(_list_valid_filenames_in_directory, (dirpath, white_list_formats, self.class_indices, follow_links, triplet_path)))

Since classes is not used later, I adjusted it like this:

    for subdir in classes:
        dirpath = os.path.join(directory, subdir)

and then used only the last 'dirpath' to get the filenames:

    results.append(pool.apply_async(_list_valid_filenames_in_directory, (dirpath, white_list_formats, self.class_indices, follow_links, triplet_path)))
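To illustrate the duplication and the adjustment described above, here is a minimal sketch. It rests on assumptions that are not confirmed by the repo: `list_filenames_from_triplets` is a hypothetical stand-in for `_list_valid_filenames_in_directory`, assumed to return every filename in the triplet file regardless of the directory it is given, and the triplet lines are assumed to be comma-separated. The class names and paths are made up.

```python
import os


def list_filenames_from_triplets(dirpath, triplet_lines):
    """Hypothetical stand-in for _list_valid_filenames_in_directory.

    Assumption: when a triplet file is supplied, the result depends only on
    the triplet file, so dirpath is ignored here.
    """
    filenames = []
    for line in triplet_lines:
        filenames.extend(part.strip() for part in line.split(","))
    return filenames


def collect_per_class(directory, classes, triplet_lines):
    """Original pattern: one call per class sub-directory."""
    filenames = []
    for subdir in classes:
        dirpath = os.path.join(directory, subdir)
        filenames.extend(list_filenames_from_triplets(dirpath, triplet_lines))
    return filenames


def collect_once(directory, classes, triplet_lines):
    """Adjusted pattern: a single call, using the last class directory."""
    dirpath = os.path.join(directory, classes[-1])
    return list_filenames_from_triplets(dirpath, triplet_lines)


# Made-up example data: 3 classes, 2 triplets (6 filenames in total).
classes = ["cat", "dog", "bird"]
triplets = [
    "cat/1.jpg,cat/2.jpg,dog/1.jpg",
    "dog/2.jpg,dog/3.jpg,bird/1.jpg",
]

per_class = collect_per_class("dataset", classes, triplets)
once = collect_once("dataset", classes, triplets)

print(len(per_class), len(once))  # 18 6
```

Under these assumptions the per-class loop returns the triplet list once per class (3 x 6 = 18 entries), while the single call returns it once (6 entries). If the real helper does filter by directory, the adjustment would behave differently.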

It works in my tests. Do you think this is correct? Does anything else need to be adjusted along with it? Thanks a lot.

fafa92 commented 6 years ago

Hi, I just found this GitHub repo and I'm somewhat confused about how to use it. I have the triplet file from their article, which contains links to the query, negative and positive images. It contains 5033 different images. Have you used that to train this model, or do you already have the data in different files? I cannot see anywhere to pass in that triplet.txt of links so that the code produces the required data from it. Can you please clarify whether you have already done that, and briefly explain how you train the model with this code? Thank you so much in advance.

ha121ppy commented 6 years ago

@fafa92, I generated my own triplets by referring to the code in 'tripletSampler.py'. The adjustment I mentioned above is to make sure the triplets in the generator are not duplicated.