gabrielwong159 / siamese

One-shot learning for image classification using Siamese neural networks

How to use on custom model? #2

Open vainaixr opened 5 years ago

vainaixr commented 5 years ago

I am using the moeimouto-faces dataset: https://www.kaggle.com/mylesoneill/tagged-anime-illustrations/home

How can I run the code on this dataset?

gabrielwong159 commented 5 years ago

You can re-use the code for the Omniglot dataset. You would need to change the functions get_files and _parse_function in omniglot/data_loader.py.

The function get_files is expected to return a list of lists of strings, one inner list per class, like this:

[
    ['path/to/class1_image1.png', 'path/to/class1_image2.png', ...],
    ['path/to/class2_image1.png', 'path/to/class2_image2.png', ...],
    ['path/to/class3_image1.png', 'path/to/class3_image2.png', ...],
    ...
]

_parse_function currently expects a path to a PNG image (see tf.image.decode_png). It also expects a black-on-white image, hence the tf.bitwise.invert. Edit these functions to take in your 3-channel JPGs.
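
To see why the inversion step suits Omniglot but is probably not what you want for colour photos, here is a minimal pure-Python sketch (no TensorFlow needed). It assumes 8-bit pixel values, which is the default dtype of tf.image.decode_png; for such values, flipping every bit is the same as computing 255 - value. The helper name invert_uint8 is made up for illustration and is not part of the repository.

```python
def invert_uint8(value):
    """Flip all 8 bits of a pixel value in the range 0..255."""
    if not 0 <= value <= 255:
        raise ValueError("expected an 8-bit pixel value")
    return ~value & 0xFF  # the mask keeps the result within 8 bits


# For 8-bit values, bitwise inversion equals 255 - value.
assert all(invert_uint8(v) == 255 - v for v in range(256))

# Black strokes (0) on a white background (255) become
# bright strokes (255) on a dark background (0).
white_background, black_stroke = 255, 0
print(invert_uint8(white_background), invert_uint8(black_stroke))  # prints: 0 255

# A colour pixel is simply turned into its negative, channel by channel.
print([invert_uint8(c) for c in (200, 30, 30)])  # prints: [55, 225, 225]
```

For Omniglot this makes the strokes the non-zero part of the input. For natural colour images it only produces a photographic negative, so dropping the step is a reasonable first thing to try.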

Finally, change the configurations by editing the corresponding flags in siamese/flags.py for parameters like image size. Once you have made your changes, you can run main.py. Make sure that you have a folder called model or TensorFlow will not be able to save the model.

cd siamese/omniglot
mkdir model
python3 main.py
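
If you would rather not rely on remembering the mkdir step, a small sketch like the following could be run before training. The folder name model comes from the instructions above; the helper name ensure_model_dir is hypothetical and not part of the repository.

```python
import os


def ensure_model_dir(path="model"):
    """Create the checkpoint folder if it does not exist yet."""
    os.makedirs(path, exist_ok=True)  # no error if the folder is already there
    return os.path.abspath(path)


# Example: call ensure_model_dir() from siamese/omniglot before running main.py.
```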

MaleshwarSastri commented 4 years ago

So by making these changes we can train the model on our own custom dataset. I'm training a model for logo detection. Any other suggestions on how to proceed?

gabrielwong159 commented 4 years ago

Assuming each logo is a class, you can place samples of each logo in its own folder. An example folder structure could be as follows:

siamese/
└── omniglot/
    └── data/
        ├── logo1
        │   ├── logo1_1.png
        │   ├── logo1_2.png
        │   └── ...
        ├── logo2
        │   ├── logo2_1.png
        │   ├── logo2_2.png
        │   └── ...
        └── ...

Then, you could modify the get_files function in siamese/omniglot/data_loader.py:

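import os
from os.path import join

def get_files(train, src='data'):
    # `train` is unused here: this folder layout has no separate train/test split.
    logos = [join(src, f) for f in os.listdir(src)]
    files = [[join(dir_name, file_name) for file_name in os.listdir(dir_name)] for dir_name in logos]
    return files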

This is expected to yield the following list:

[
    ['data/logo1/logo1_1.png', 'data/logo1/logo1_2.png', ...],
    ['data/logo2/logo2_1.png', 'data/logo2/logo2_2.png', ...],
    ...
]
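
Before training, it may be worth checking that the list really has this shape. The sketch below is one way to do that in plain Python. check_files is a hypothetical helper, and the requirement of at least two images per class is an assumption (a same-class pair needs two distinct images), not something stated by the repository.

```python
def check_files(files, min_per_class=2):
    """Validate a list of lists of image paths, one inner list per class.

    Returns the number of classes, or raises ValueError describing the
    first problem found.
    """
    if not files:
        raise ValueError("no classes found")
    for i, class_files in enumerate(files):
        if not isinstance(class_files, (list, tuple)):
            raise ValueError("class %d is not a list of paths" % i)
        if len(class_files) < min_per_class:
            raise ValueError(
                "class %d has %d image(s), expected at least %d"
                % (i, len(class_files), min_per_class)
            )
        for path in class_files:
            if not isinstance(path, str):
                raise ValueError("class %d contains a non-string entry" % i)
    return len(files)


example = [
    ["data/logo1/logo1_1.png", "data/logo1/logo1_2.png"],
    ["data/logo2/logo2_1.png", "data/logo2/logo2_2.png"],
]
print(check_files(example))  # prints: 2
```

Calling check_files(get_files(train=True)) right after building the list would surface an empty or stray folder early, rather than somewhere inside the input pipeline.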

alextaymx commented 3 years ago

How do I make it work for color images? This is what I have so far:

  def get_files(train, src='rp2k_dataset'):
      if train:
          src = join(src, 'train')
      else:
          src = join(src, 'test')

      products = [join(src, f) for f in os.listdir(src)]
      files = [[join(p, f) for f in os.listdir(p)] for p in products]
      return files

  def _parse_function(f1, f2, label):
      def file_to_img(f):
          image_string = tf.read_file(f)
          image_decoded = tf.image.decode_image(image_string, 3)
          image_inverted = tf.bitwise.invert(image_decoded, )
          image_resized = tf.reshape(image_inverted, (FLAGS.h, FLAGS.w, FLAGS.c))
          return image_resized / 255

      im1, im2 = map(file_to_img, [f1, f2])
      return im1, im2, label
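
A note on the snippet above: tf.reshape only rearranges the values a tensor already holds, so it cannot turn an image of arbitrary size into (FLAGS.h, FLAGS.w, FLAGS.c). A resize operation such as tf.image.resize_images (TF 1.x) is the usual tool for that. The pure-Python sketch below illustrates the element-count rule that reshaping relies on; can_reshape is a hypothetical helper, and the 105x105 target size is only an example value.

```python
from functools import reduce
from operator import mul


def num_elements(shape):
    """Total number of values in a tensor of the given shape."""
    return reduce(mul, shape, 1)


def can_reshape(src_shape, dst_shape):
    """A reshape is only valid when both shapes hold the same number of values."""
    return num_elements(src_shape) == num_elements(dst_shape)


# Flattening a 105x105 grayscale image keeps all 11025 values: fine.
print(can_reshape((105, 105, 1), (105 * 105,)))  # prints: True

# A 300x200 colour photo holds 180000 values, a 105x105x3 target only 33075,
# so this calls for resizing (interpolation), not reshaping.
print(can_reshape((300, 200, 3), (105, 105, 3)))  # prints: False
```

Whether this is related to the error reported below is not established here; it is simply something to rule out when the dataset contains images of varying sizes.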

alextaymx commented 3 years ago

I'm having trouble getting this to run properly. Training stops with the following error:

  File "main.py", line 39, in train
    assert not np.isnan(loss), 'Model diverged with loss = NaN'
AssertionError: Model diverged with loss = NaN