namedBen / Convolutional-Pose-Machines-Pytorch

PyTorch version of Convolutional Pose Machines

Loss values are very poor #7

Open freedomlei opened 5 years ago

freedomlei commented 5 years ago

Test Iteration: 100
Time 123.292s / 100iters, (1.233)
Data load 121.240s / 100iters, (2.424793)
Loss = 2535.75683594 (ave = 2515.50237061)

Loss1 = 472.91094971 (ave = 469.05812988)
Loss2 = 419.15380859 (ave = 412.87503113)
Loss3 = 409.17819214 (ave = 407.73377777)
Loss4 = 411.38052368 (ave = 409.59360992)
Loss5 = 412.31079102 (ave = 408.39463074)
Loss6 = 410.82272339 (ave = 407.84722137)

Hello, my loss will not go down no matter how long I train. Do you have any suggestions? I am using the LSP dataset.

namedBen commented 5 years ago

Your loss is absurdly large. Check whether the labels are being read incorrectly.

freedomlei commented 5 years ago

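Hello, where is the label parameter set? I could not find it. I only added the dataset in the train.py script and trained directly, without changing any other parameters.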

freedomlei commented 5 years ago

Hello, I hope to hear back from you.

freedomlei commented 5 years ago

I am using the LSP dataset, which has 14 joints, so the label values should be fine.

freedomlei commented 5 years ago

loss1 = criterion(heat1, heatmap_var) * heat_weight
loss2 = criterion(heat2, heatmap_var) * heat_weight
loss3 = criterion(heat3, heatmap_var) * heat_weight
loss4 = criterion(heat4, heatmap_var) * heat_weight
loss5 = criterion(heat5, heatmap_var) * heat_weight
loss6 = criterion(heat6, heatmap_var) * heat_weight
heat_weight = 46 * 46 * 15 / 1.0

Could my loss be this large because it is multiplied by heat_weight? I do not quite understand why you multiply by the heatmap value when computing the loss. After removing this parameter, the loss values look much more reasonable. What is your view on this?
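To illustrate the effect being asked about: assuming `criterion` is a mean-reduced squared-error loss (for example PyTorch's `nn.MSELoss` with default settings, which this thread does not confirm), multiplying by `heat_weight = 46 * 46 * 15` turns the per-element mean into a per-sample sum of squared errors. Below is a minimal pure-Python sketch with random made-up values; `mse_mean` and `sse_per_sample` are hypothetical helpers, not functions from the repository.

```python
import random

H, W, C = 46, 46, 15            # heatmap height, width, channels quoted in the thread
heat_weight = H * W * C / 1.0   # 31740.0

def mse_mean(pred, target):
    """Mean squared error over every element of every sample (assumed criterion)."""
    n = sum(len(p) for p in pred)
    total = sum((a - b) ** 2 for p, t in zip(pred, target) for a, b in zip(p, t))
    return total / n

def sse_per_sample(pred, target):
    """Sum of squared errors per sample, averaged over the batch."""
    total = sum((a - b) ** 2 for p, t in zip(pred, target) for a, b in zip(p, t))
    return total / len(pred)

random.seed(0)
batch = 2  # made-up batch size, for illustration only
pred = [[random.random() for _ in range(H * W * C)] for _ in range(batch)]
target = [[random.random() for _ in range(H * W * C)] for _ in range(batch)]

scaled = mse_mean(pred, target) * heat_weight   # what the scaled loss would report
summed = sse_per_sample(pred, target)           # per-sample sum of squared errors

print(scaled, summed)  # the two values agree up to floating-point rounding
```

Under that assumption, removing `heat_weight` shrinks the reported number by a factor of 31740 without changing what the network is fitting, although it also rescales the gradients by the same factor.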

namedBen commented 5 years ago

The multiplication by heatmap follows CMU's original Caffe source code (it was a long time ago, but that is how I remember it).

freedomlei commented 5 years ago

Hello, how many iterations did you train for, how long did training take, and what was the final loss value?

freedomlei commented 5 years ago

I would like to use your results as a reference. I trained for 40000 iterations, and after one day the loss is still above 400.

namedBen commented 5 years ago

First, my apologies: this loss is actually normal. I found my earlier backup (I had been waiting on a motherboard and a new graphics card). It was a long time ago, but here is roughly what the loss log looked like:

Train Iteration: 50
Time 50.870s / 50iters, (1.017)
Data load 0.700s / 50iters, (0.014002)
Learning rate = 4e-06
Loss = 2869.84130859 (ave = 3553.02456055)

Train Iteration: 137950
Time 45.532s / 50iters, (0.911)
Data load 0.008s / 50iters, (0.000156)
Learning rate = 6.70659879226e-11
Loss = 1059.18762207 (ave = 1263.58395752)
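To put these logged totals on a more familiar scale, the sketch below divides each one by the number of stages and by `heat_weight`. It assumes the total is the sum of six stage losses, each a mean-reduced squared error multiplied by `heat_weight = 46 * 46 * 15`; neither assumption is confirmed in this thread, and `per_element_mse` is a hypothetical helper.

```python
heat_weight = 46 * 46 * 15 / 1.0  # scaling factor quoted earlier in the thread
num_stages = 6                    # six stage losses are reported in the logs

# Total training losses copied from the log above, keyed by iteration.
logged = {50: 2869.84130859, 137950: 1059.18762207}

def per_element_mse(total_loss):
    """Average per-element squared error per stage, under the stated assumptions."""
    return total_loss / num_stages / heat_weight

per_element = {iteration: per_element_mse(loss) for iteration, loss in logged.items()}

for iteration, value in per_element.items():
    print(f"iteration {iteration}: per-element MSE per stage = {value:.6f}")
```

Read this way, a total in the thousands corresponds to a per-element error on the order of 0.01, which is why the raw magnitude alone does not indicate a broken run.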

freedomlei commented 5 years ago

OK, thanks.

freedomlei commented 5 years ago

loss1 = criterion(heat1, heatmap_var) * heat_weight
loss2 = criterion(heat2, heatmap_var) * heat_weight
loss3 = criterion(heat3, heatmap_var) * heat_weight
loss4 = criterion(heat4, heatmap_var) * heat_weight
loss5 = criterion(heat5, heatmap_var) * heat_weight
loss6 = criterion(heat6, heatmap_var) * heat_weight

Do these six losses correspond to the losses of the 15-channel heatmaps produced by the six stages?

namedBen commented 5 years ago

Yes, they do.
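As a quick consistency check on how the six stage losses relate to the reported total, the sketch below sums the `Loss1` to `Loss6` values from the "Test Iteration: 100" log at the top of this thread and compares the result with the logged `Loss`. That the total is a plain sum of the stage losses is an inference from these numbers, not something stated by the maintainer.

```python
# Per-stage losses copied from the "Test Iteration: 100" log in this thread.
stage_losses = [
    472.91094971,  # Loss1
    419.15380859,  # Loss2
    409.17819214,  # Loss3
    411.38052368,  # Loss4
    412.31079102,  # Loss5
    410.82272339,  # Loss6
]
logged_total = 2535.75683594  # "Loss" value from the same log entry

total = sum(stage_losses)
diff = abs(total - logged_total)

print(f"sum of stages = {total:.8f}, logged total = {logged_total:.8f}, diff = {diff:.8f}")
```

The small remaining difference is consistent with single-precision rounding in the original run, so each stage appears to contribute equally weighted supervision to the total.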

dyf102 commented 5 years ago

Iteration: 60800
Time 141.374s / 50iters, (1.554)
Data load 90.138s / 50iters, (1.802758)
Learning rate = 4.9185481284000014e-08
Loss = 1116.08874512 (ave = 1295.38367081)

Loss1 = 276.20257568 (ave = 316.23426162)
Loss2 = 206.13369751 (ave = 237.10911347)
Loss3 = 176.35498047 (ave = 202.00120814)
Loss4 = 162.03103638 (ave = 187.44318502)
Loss5 = 148.97004700 (ave = 177.27670280)
Loss6 = 146.39643860 (ave = 175.31919931)
2019-06-01 15:28:15
----------------------------------------
Test Iteration: 0
Time 4.000s / 50iters, (4.000)
Data load 0.000s / 50iters, (0.000000)
Loss = 1539.38256836 (ave = 1539.38256836)

Loss1 = 362.45312500 (ave = 362.45312500)
Loss2 = 277.19314575 (ave = 277.19314575)
Loss3 = 241.63360596 (ave = 241.63360596)
Loss4 = 226.15592957 (ave = 226.15592957)
Loss5 = 216.58227539 (ave = 216.58227539)
Loss6 = 215.36450195 (ave = 215.36450195)
2019-06-01 15:28:20
----------------------------------------

Does training still need to continue? Thanks.