streamer-AP / CGNet

Weakly supervised individual counting

Train Problems #5

Closed: zzzxh7834 closed this 3 weeks ago

zzzxh7834 commented 1 month ago

Hello, I have tried to train CGNet on the SenseCrowd dataset several times, but the WRAE of my best result is 14.79%. May I ask for more details about your training options?

Additionally, I wonder how to train a model with only In-Out labels, since it seems that training still needs ID labels, especially for the calculation of the loss.

streamer-AP commented 1 month ago

Hi, the config I used is just the same as config.json, and I used four GPUs to reproduce this repo, getting WRAE=11.76. Can you provide more details about your problem? The ID labels are not necessary; they are only used to compute the IN-OUT labels. In line 129 of tri_cropper.py, you can see that the loss is computed by:

```python
def loss(self, z1, z2, y1, y2):
    loss_dict = self.ot_loss([z1, y1], [z2, y2])
    loss_dict["all"] = loss_dict["scon_cost"] + loss_dict["hinge_cost"] * 0.1
    return loss_dict
```

where [z1, y1] are the shared objects and the outflow objects in the previous frame, and [z2, y2] are the shared objects and the inflow objects in the current frame. If you can split your data like this, the loss can be calculated. For more detail, see the `__main__` function in tri_sim_ot_b.py, where the input is totally random.
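A minimal sketch of that split, assuming you only have per-frame ID sets (the sets below are made up for illustration, not taken from SenseCrowd):

```python
# Split two consecutive frames' objects into shared / outflow / inflow by ID.
# Illustrative only: ids_prev / ids_curr are made-up ID sets, not repo code.
ids_prev = {1, 2, 3, 4}        # IDs visible in the previous frame
ids_curr = {3, 4, 5, 6, 7}     # IDs visible in the current frame

shared  = ids_prev & ids_curr  # present in both frames -> features go to z1 / z2
outflow = ids_prev - ids_curr  # leave after the previous frame -> features go to y1
inflow  = ids_curr - ids_prev  # first appear in the current frame -> features go to y2

print(shared, outflow, inflow)  # -> {3, 4} {1, 2} {5, 6, 7}
```
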
streamer-AP commented 1 month ago

Have you removed the duplicated IDs in the SenseCrowd dataset before training and testing?

zzzxh7834 commented 1 month ago

Sorry, I misstated. It is not the loss but pos_match_acc and all_match_acc that seem to use ID labels during validation, since pairs of correct matches are needed for the calculation. As for removing the duplicated IDs, I guess not: I used the original dataset and didn't change the code much, and I got MAE=9.56, MSE=227.26, WRAE=14.79%, RMSE=15.08 after training.

streamer-AP commented 1 month ago

Yes, during validation we use those metrics to measure how often the model gets the correct match. They can be replaced by a pair-level MAE using IN-OUT labels only. Since your training result is close to this repo's, I think the problem may come from the duplicated IDs; I removed them before training and testing. To verify this, could you evaluate the weights I provided and check whether you get the same results as I reported?
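A minimal sketch of such a pair-level MAE, assuming only predicted and ground-truth IN/OUT counts per adjacent frame pair (the arrays below are made-up numbers):

```python
# Pair-level MAE over (inflow, outflow) counts; needs no identity labels.
# Illustrative formulation, not the repo's evaluation code.
import numpy as np

# one row per adjacent frame pair: (inflow count, outflow count)
pred = np.array([[3, 1], [0, 2], [4, 0]], dtype=float)
gt   = np.array([[2, 1], [1, 2], [4, 1]], dtype=float)

pair_mae = np.abs(pred - gt).mean()
print(pair_mae)  # mean absolute error over all pairs and both directions
```
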

zzzxh7834 commented 1 month ago

OK, I evaluated the pretrained model and got MAE=8.94, MSE=371.35, WRAE=11.87%, RMSE=19.27, a little worse than what you reported.

streamer-AP commented 1 month ago

So part of the difference may be caused by the duplicated IDs. Also, I notice that you get a much better MSE but a worse WRAE; I guess the model you trained may be more suitable for short videos. That could be caused by the thresholds or the TTL, which can make the model too strict about judging a detection as an inflow.
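An illustrative sketch of how those two knobs interact (the names, values, and logic below are hypothetical, not taken from this repo):

```python
# Hypothetical matching rule: a detection joins an existing track only if its
# similarity clears MATCH_THRESHOLD and the track vanished no more than TTL
# frames ago; otherwise it is counted as an inflow.
MATCH_THRESHOLD = 0.5  # minimum similarity to match an existing track
TTL = 3                # frames a vanished track is kept alive for matching

def classify(similarity: float, frames_since_seen: int) -> str:
    if similarity >= MATCH_THRESHOLD and frames_since_seen <= TTL:
        return "matched"  # a lower threshold or a longer TTL matches more
    return "inflow"       # detections, so fewer are judged as inflow
```
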

zzzxh7834 commented 1 month ago

Thanks a lot. I just modified the TTL and the threshold and got MAE=8.75, MSE=228.74, WRAE=12.98%, RMSE=15.12. But how do I remove the duplicated IDs? The dataset was downloaded from the Baidu disk link you provided.

streamer-AP commented 1 month ago

The Baidu disk link is the original SenseCrowd dataset. Some of the IDs are duplicated within a frame, and it is better to remove them. Here is the code I used:

```python
from glob import glob
import os

import numpy as np

src_root = "annotations"
dst_root = "new_annotations"
os.makedirs(dst_root, exist_ok=True)  # make sure the output folder exists

for anno_path in glob(os.path.join(src_root, "*.txt")):
    new_anno_path = os.path.join(dst_root, os.path.basename(anno_path))
    fw = open(new_anno_path, "w")
    with open(anno_path) as f:
        lines = f.readlines()
    for line in lines:
        # each line: file_name width height, then groups of 7 values per person
        # (x1 y1 x2 y2 px py id)
        line = line.split()
        file_name = line[0]
        width, height = int(line[1]), int(line[2])
        data = [float(x) for x in line[3:] if x != ""]
        fw.write(f"{file_name} {width} {height} ")
        if len(data) > 0:
            data = np.reshape(np.array(data), (-1, 7))
            cnt = len(data)
            ids = data[:cnt, 6].reshape(-1, 1)
            pts = data[:cnt, 4:6]
            bboxes = data[:cnt, 0:4]

            # keep only the first occurrence of each id within this frame
            id_sets = set()
            for idx, id in enumerate(ids):
                if id[0] not in id_sets:
                    id_sets.add(id[0])
                    fw.write(f"{bboxes[idx,0]} {bboxes[idx,1]} {bboxes[idx,2]} {bboxes[idx,3]} "
                             f"{pts[idx,0]} {pts[idx,1]} {int(id[0])} ")
                else:
                    print(f"Duplicate id {id[0]} in {anno_path} {file_name}")
        fw.write("\n")
    fw.close()
```
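
If you want to double-check the result, a quick sanity check like the following (a sketch of my own, not part of the repo) confirms that every rewritten annotation line now carries unique IDs:

```python
# Verify that no rewritten line has a repeated id: after "file_name width
# height", every 7th value is an id. Paths mirror the script above.
from glob import glob
import os

for path in glob(os.path.join("new_annotations", "*.txt")):
    with open(path) as f:
        for line in f:
            fields = line.split()
            ids = fields[3 + 6::7]  # the id field of each 7-value group
            assert len(ids) == len(set(ids)), f"duplicate id remains in {path}"
```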