cxliu0 / PET

[ICCV 2023] Point-Query Quadtree for Crowd Counting, Localization, and More
MIT License

Each epoch mae and mse is the same number #23

Closed mokby closed 3 weeks ago

mokby commented 3 weeks ago

Hey guys, I modified SHA.py to read my own dataset, but during training the mae and mse of every epoch are the same, which means the program never saves a newer model. What may have caused this problem? Here is the log:

epoch:20, mae:43.07692307692308, mse:50.23795939429232, time315.07877230644226, 
best mae:43.07692307692308, best epoch: 5
epoch:25, mae:43.07692307692308, mse:50.23795939429232, time315.6004457473755, 
best mae:43.07692307692308, best epoch: 5
epoch:35, mae:43.07692307692308, mse:50.23795939429232, time312.109171628952, 
best mae:43.07692307692308, best epoch: 5
epoch:50, mae:43.07692307692308, mse:50.23795939429232, time314.3498001098633, 
best mae:43.07692307692308, best epoch: 5
epoch:80, mae:43.07692307692308, mse:50.23795939429232, time315.116548538208, 
best mae:43.07692307692308, best epoch: 5
epoch:95, mae:43.07692307692308, mse:50.23795939429232, time314.9769003391266, 
best mae:43.07692307692308, best epoch: 5

The main change in SHA.py is that I added a dataset format 'custome', which looks like:

import os

import cv2
import numpy as np
from PIL import Image
from torch.utils.data import Dataset


class CUSTOME(Dataset):
    def __init__(self, data_root, transform=None, train=False, flip=False):
        self.root_path = data_root

        prefix = "train" if train else "val"
        self.prefix = prefix
        self.img_list = os.listdir(f"{data_root}/images/{prefix}")

        # get image and ground-truth list
        self.gt_list = {}
        for img_name in self.img_list:
            img_path = f"{data_root}/images/{prefix}/{img_name}"
            gt_path = f"{data_root}/labels/{prefix}/{img_name}"
            self.gt_list[img_path] = gt_path.replace("jpg", "txt")
        self.img_list = sorted(list(self.gt_list.keys()))
        self.nSamples = len(self.img_list)

        self.transform = transform
        self.train = train
        self.flip = flip
        self.patch_size = 256


# module-level helper (no self parameter), as in the original SHA.py
def load_data(img_gt_path, train):
    img_path, gt_path = img_gt_path
    img = cv2.imread(img_path)
    img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    # points = io.loadmat(gt_path)['image_info'][0][0][0][0][0][:, ::-1]
    points = []
    with open(gt_path, 'r') as file:
        for line in file:
            x, y = line.strip().split(' ')
            points.append([float(x), float(y)])
    points = np.array(points)
    return img, points

My labels are txt files that store each position as x y. Is there any problem with that? Thanks for replying!

cxliu0 commented 3 weeks ago

The data format of your dataset is different from ours: we load annotated points in [y, x] order instead of [x, y].

You can change points.append([float(x), float(y)]) to points.append([float(y), float(x)]) and retrain the model.
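For reference, a minimal sketch of that change, pulled out of the loader as a standalone parser (the helper name load_points_yx is hypothetical, not part of the repo):

```python
import numpy as np

def load_points_yx(gt_path):
    """Parse a whitespace-separated 'x y' label file into an (N, 2)
    array stored in [y, x] order, as the PET loader expects."""
    points = []
    with open(gt_path, "r") as f:
        for line in f:
            x, y = line.strip().split(" ")
            points.append([float(y), float(x)])  # swap to [y, x]
    return np.array(points)
```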

mokby commented 3 weeks ago

Thanks a lot for the reply! So that was the cause; I suspected it earlier but never changed it, haha.