dk-liang / FIDTM

[IEEE TMM] Focal Inverse Distance Transform Maps for Crowd Localization
MIT License
169 stars, 41 forks

Training 10 epochs (about 246,080 pictures) takes 24 hours. Is something wrong, or is this expected? #17

Closed: YuYue26 closed this issue 3 years ago

YuYue26 commented 3 years ago

Hardware: 1 × 1080Ti GPU, 12GB

Training parameters:
crop_size: 128×128, batch_size: 16

dk-liang commented 3 years ago

Hi, which dataset are you training on? I trained on a single GPU (2080Ti), and each epoch on the Part A dataset takes about 20s. So I don't think 10 epochs should take 24 hours, even on a 1080Ti.

YuYue26 commented 3 years ago

The dataset is DroneCrowd, which contains 24,600 pictures in the training set. I think it's difficult to train on this dataset with HRNet; it takes me a long time!


dk-liang commented 3 years ago

This is a huge dataset. You could try using VGG16 as the backbone.
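
As a rough sanity check on the numbers in this thread, the sketch below converts the reported figures (24,600 training pictures, batch_size 16, 10 epochs in 24 hours) into per-batch and per-image rates. The helper name `epoch_stats` is hypothetical and not part of the FIDTM code, and the calculation assumes the 24 hours were spent entirely on training iterations, with no validation or checkpointing time included.

```python
import math


def epoch_stats(num_images, batch_size, seconds_per_epoch):
    """Derive simple throughput figures for one training epoch.

    Hypothetical helper for illustration; not part of the FIDTM repository.
    """
    # The last batch may be partial, so round the batch count up.
    batches = math.ceil(num_images / batch_size)
    return {
        "batches_per_epoch": batches,
        "seconds_per_batch": seconds_per_epoch / batches,
        "images_per_second": num_images / seconds_per_epoch,
    }


# Figures reported in the thread: 10 epochs in 24 hours on DroneCrowd.
seconds_per_epoch = 24 * 3600 / 10  # 8640 s, i.e. 2.4 hours per epoch
drone = epoch_stats(num_images=24_600, batch_size=16,
                    seconds_per_epoch=seconds_per_epoch)

for name, value in drone.items():
    print(f"{name}: {value:.2f}")

# Samples processed over 10 epochs if every batch is counted as full.
print("samples in 10 epochs:", drone["batches_per_epoch"] * 16 * 10)
```

Under these assumptions, each epoch has 1,538 batches at roughly 5.6 seconds per batch, or about 2.8 pictures per second. Counting every batch as full gives 1,538 × 16 × 10 = 246,080 samples, which would account for the figure in the issue title. Per-epoch times on datasets of different sizes are not directly comparable, so per-image throughput is the fairer number to set against the 20s-per-epoch figure for Part A.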