wzzheng / HDML

Implementation of Hardness-Aware Deep Metric Learning (CVPR 2019 Oral) in TensorFlow.

Some questions about training and evaluating on my own dataset #13

Open Julymycin opened 5 years ago

Julymycin commented 5 years ago

Firstly, thanks for sharing your code. I tried to train on my own data (8 classes, 20,000 images in total). I converted it to an HDF5 file and ran the training step successfully, but the Recall@1 metric seems erratic. My question is: why can't batch_size be larger than num_classes*2?

phambao commented 5 years ago

Hi, how did you convert your own dataset? Could you share it with me?

Julymycin commented 5 years ago

> Hi, how did you convert your own dataset? Could you share it with me?

Here is the converter code. Please pay attention to the lines marked with `xxxxxxxx`: `img_path` is your dataset image folder, and the label.txt file has one `img_name label` pair per line, e.g. `xxxxxxxx.jpg 2` and `xxxxxfdm.jpg 0`. After the HDF5 file has been created, you should write a Python module like `cars196_dataset.py` (see the loader sketch after the converter); then you can load your own dataset in `data_provider.py`.

```python
import os

import cv2
import h5py
import numpy as np
from fuel.datasets.hdf5 import H5PYDataset
from tqdm import tqdm


def preprocess(hwc_bgr_image, size):
    # BGR -> RGB, resize, then HWC -> CHW to match the cars196 layout
    hwc_rgb_image = cv2.cvtColor(hwc_bgr_image, cv2.COLOR_BGR2RGB)
    resized = cv2.resize(hwc_rgb_image, size)
    chw_image = np.transpose(resized, axes=(2, 0, 1))
    return chw_image


if __name__ == '__main__':
    img_path = '/xxxxxxx'  # xxxxxxxx: your dataset image folder
    m = np.loadtxt('fi_trainimg_label.txt', object)  # xxxxxxxx: train "img_name label" list
    n = np.loadtxt('fi_testimg_label.txt', object)   # xxxxxxxx: test "img_name label" list
    total = np.concatenate((m, n), axis=0)
    jpg_filenames = total[:, 0]
    class_labels = list(total[:, 1])
    num_examples = len(total)

    # open hdf5 file
    hdf5_filename = "fi.hdf5"
    hdf5_filepath = os.path.join('.', hdf5_filename)
    hdf5 = h5py.File(hdf5_filepath, mode="w")

    image_size = (256, 256)
    array_shape = (num_examples, 3) + image_size
    ds_images = hdf5.create_dataset("images", array_shape, dtype=np.uint8)
    ds_images.dims[0].label = "batch"
    ds_images.dims[1].label = "channel"
    ds_images.dims[2].label = "height"
    ds_images.dims[3].label = "width"

    for i, filename in tqdm(enumerate(jpg_filenames), total=num_examples,
                            desc=hdf5_filepath):
        raw_image = cv2.imread(os.path.join(img_path, filename),
                               cv2.IMREAD_COLOR)  # BGR image
        if raw_image is None:  # cv2.imread returns None instead of raising
            print(filename)
            continue
        ds_images[i] = preprocess(raw_image, image_size)

    targets = np.array(class_labels, np.int32).reshape(num_examples, 1)
    ds_targets = hdf5.create_dataset("targets", data=targets)
    ds_targets.dims[0].label = "batch"
    ds_targets.dims[1].label = "class_labels"

    # specify the splits: the first len(m) examples are train, the rest are
    # test (stops are exclusive, so no +1 offset is needed here)
    split_train, split_test = (0, len(m)), (len(m), num_examples)
    split_dict = dict(train=dict(images=split_train, targets=split_train),
                      test=dict(images=split_test, targets=split_test))
    hdf5.attrs["split"] = H5PYDataset.create_split_array(split_dict)

    hdf5.flush()
    hdf5.close()
```
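
For completeness, here is a minimal sketch of the loader module mentioned above, modeled on how `cars196_dataset.py` exposes its splits via fuel's `H5PYDataset` (the module name `fi_dataset.py` and the function `load_fi` are illustrative, not the repo's actual API):

```python
# fi_dataset.py (hypothetical): expose the train/test splits written by the
# converter so data_provider.py can consume them.
from fuel.datasets.hdf5 import H5PYDataset


def load_fi(hdf5_filepath="fi.hdf5"):
    # which_sets must match the split names stored in the "split" attribute
    train = H5PYDataset(hdf5_filepath, which_sets=("train",))
    test = H5PYDataset(hdf5_filepath, which_sets=("test",))
    return train, test
```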
phambao commented 5 years ago

Thanks for your reply. I really appreciate it!

wzzheng commented 4 years ago

Hi, very sorry for the late reply. I guess you are using the N-pair loss. The batch size limitation exists because the N-pair loss needs to sample a tuple from batch_size/2 different classes. This is similar to the case of triplet loss. However, you can modify the triplet sampling scheme to allow samples from the same class in different triplets. Also, I'm not sure about the details of your model, but metric learning usually shows more of an advantage when the number of classes is large.
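
To make the constraint concrete, here is an illustrative sketch (not the repo's actual sampler) of how an N-pair batch is drawn: one (anchor, positive) pair from each of batch_size/2 distinct classes, which is why batch_size can be at most 2 * num_classes:

```python
import random
from collections import defaultdict


def sample_npair_batch(labels, batch_size):
    """Draw one (anchor, positive) pair from batch_size // 2 distinct classes."""
    assert batch_size % 2 == 0
    num_pairs = batch_size // 2
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    # only classes with at least two examples can contribute a pair
    usable = [c for c, idxs in by_class.items() if len(idxs) >= 2]
    assert num_pairs <= len(usable), "batch_size exceeds 2 * usable classes"
    batch = []
    for c in random.sample(usable, num_pairs):
        batch.extend(random.sample(by_class[c], 2))  # anchor, positive
    return batch
```

With 8 classes, this caps batch_size at 16, which matches the num_classes*2 limit you observed.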