How to prepare dataset?

kkkris7 commented 2 days ago

Hi, thank you for sharing the code!

I'm wondering how to prepare the dataset if I want to train on my own data, especially how to prepare the labels. I noticed that the txt files in the Chest dataset contain all the points and their locations, but all the coordinates are within the 0~1 range. Could you share more details on how to prepare them?

Many thanks!

heqin-zhu commented 2 days ago

Hi there!

You can prepare your dataset in the same way as the existing datasets, such as Hand and Chest. The images and labels of each dataset are placed in the data folder, while the code for processing them lives in universal_landmark_detection/model/datasets, where you should modify the readLandmark function.

After preparing the images, labels and processing code, you should modify config.yaml, __init__.py and main.py.

If you have any further questions, feel free to contact me.

kkkris7 commented 12 hours ago

Thank you, Heqin, for your response and all the helpful instructions.

I’m specifically interested in the initial step before placing the data into the data folder. What data format does the code require? And should I normalize all coordinates based on each image size?

heqin-zhu commented 3 hours ago

You can prepare it however you like. After placing the data, you just need to modify or rewrite the readLandmark function (together with readImage) to process your own landmarks and images. Here is the reference code for these functions:

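    def readLandmark(self, name, origin_size):
        # one row of the label table per image: x1, y1, x2, y2, ...
        li = list(self.labels.loc[int(name), :])
        # scale factors from the original size to the resized image (width, height)
        r1, r2 = [i/j for i, j in zip(self.size, origin_size)]
        # pair up the flat list into (x, y) points in resized-image pixels
        points = [(round(li[i]*r1), round(li[i+1]*r2))
                  for i in range(0, len(li), 2)]
        return points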

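    def readImage(self, path):
        '''Read image from path and return a numpy.ndarray of shape
        channel x width x height, together with the original (width, height).
        Requires: import numpy as np; from PIL import Image
        '''
        img = Image.open(path)
        origin_size = img.size  # PIL reports (width, height)

        # resize to self.size = (width, height); channel=1 (grayscale)
        img = img.resize(self.size)
        arr = np.array(img)  # shape: height x width
        # channel x width x height: 1 x width x height
        # (np.float was removed in NumPy 1.24; the builtin float is equivalent)
        arr = np.expand_dims(np.transpose(arr, (1, 0)), 0).astype(float)
        # converting to float is important, otherwise a serious bug occurs
        for i in range(arr.shape[0]):
            arr[i] = (arr[i]-arr[i].mean())/(arr[i].std()+1e-20)
        return arr, origin_size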
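
On the normalization question: the readLandmark above multiplies each coordinate by the ratio of the resized size to the original size, which suggests labels stored as pixel positions in the original image. If your labels are instead stored in the 0~1 range, only the target size is needed. The following is a minimal standalone sketch of both cases, not the repository's code; the function names `scale_pixel_landmarks` and `scale_normalized_landmarks` are hypothetical, and the flat `x1, y1, x2, y2, ...` layout is assumed from the reference code.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def _pairs(flat: Sequence[float]) -> List[Tuple[float, float]]:
    """Group a flat x1, y1, x2, y2, ... sequence into (x, y) pairs."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    return [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]


def scale_pixel_landmarks(flat: Sequence[float],
                          origin_size: Tuple[int, int],
                          target_size: Tuple[int, int]) -> List[Point]:
    """Labels are pixel positions in the original image (hypothetical helper).

    Sizes are (width, height), as reported by PIL's Image.size.
    """
    rx = target_size[0] / origin_size[0]
    ry = target_size[1] / origin_size[1]
    return [(round(x * rx), round(y * ry)) for x, y in _pairs(flat)]


def scale_normalized_landmarks(flat: Sequence[float],
                               target_size: Tuple[int, int]) -> List[Point]:
    """Labels are fractions of width/height in the 0~1 range (hypothetical helper)."""
    w, h = target_size
    return [(round(x * w), round(y * h)) for x, y in _pairs(flat)]


if __name__ == "__main__":
    # pixel labels from a 1000 x 2000 image, resized to 512 x 512
    print(scale_pixel_landmarks([100, 200, 300, 400], (1000, 2000), (512, 512)))
    # normalized labels, resized image is 512 x 256
    print(scale_normalized_landmarks([0.25, 0.5, 0.75, 1.0], (512, 256)))
```

Either convention works as long as readLandmark returns points in the pixel grid of the resized image that readImage produces.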

MIRACLE-Center / YOLO_Universal_Anatomical_Landmark_Detection

How to prepare dataset? #24