Learning rate decrease code problem

zwithz commented 3 years ago

Hello, I have been reviewing your paper and code (RootNet & PoseNet) for several days. I'd like to mention that the learning rate decrease code is implemented in the wrong way.

For instance, line 77 uses the local variable e, so I guess lines 78-84 need to be indented by 4 spaces? https://github.com/mks0601/3DMPPE_ROOTNET_RELEASE/blob/8bef0cd332c3423050a6f3b382d2a574623e1ffa/common/base.py#L77

Your Code

    def set_lr(self, epoch):
        for e in cfg.lr_dec_epoch:
            if epoch < e:
                break
        if epoch < cfg.lr_dec_epoch[-1]:
            idx = cfg.lr_dec_epoch.index(e)
            for g in self.optimizer.param_groups:
                g['lr'] = cfg.lr / (cfg.lr_dec_factor ** idx)
        else:
            for g in self.optimizer.param_groups:
                g['lr'] = cfg.lr / (cfg.lr_dec_factor ** len(cfg.lr_dec_epoch))

I guess this is the right code? 🤔

    def set_lr(self, epoch):
        for e in cfg.lr_dec_epoch:
            if epoch < e:
                break
            if epoch < cfg.lr_dec_epoch[-1]:
                idx = cfg.lr_dec_epoch.index(e)
                for g in self.optimizer.param_groups:
                    g['lr'] = cfg.lr / (cfg.lr_dec_factor ** idx)
            else:
                for g in self.optimizer.param_groups:
                    g['lr'] = cfg.lr / (cfg.lr_dec_factor ** len(cfg.lr_dec_epoch))

BTW, I trained and tested several times with several protocols and datasets (Human3.6M Protocol2 / MuCo / 3DPW), but I cannot reproduce the precision reported in the README or the paper. Maybe the problem above caused it?

mks0601 commented 3 years ago

Hi,

You can check that the code below works:

    for e in range(10):
        print(e)
    print(e)

This means we can refer to e outside the for loop: in Python, the loop variable keeps its last value after the loop ends.
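
A minimal sketch of the point above, written as pure functions so it runs without an optimizer. `lr_original` follows the posted `set_lr` logic and reads `e` after the loop, while `lr_indented` follows the indented variant suggested earlier in this thread. The function names, the `current_lr` argument, and the config values (`BASE_LR`, `DEC_EPOCHS`, `FACTOR`) are assumptions made for illustration, not the repository's actual settings.

```python
def lr_original(epoch, base_lr, dec_epochs, factor):
    """Schedule as posted in the issue: `e` is read after the loop."""
    for e in dec_epochs:
        if epoch < e:
            break
    # After the loop, `e` is the first milestone greater than `epoch`,
    # or the last milestone if no milestone is greater.
    if epoch < dec_epochs[-1]:
        idx = dec_epochs.index(e)  # number of milestones already passed
        return base_lr / (factor ** idx)
    return base_lr / (factor ** len(dec_epochs))


def lr_indented(epoch, base_lr, dec_epochs, factor, current_lr):
    """Variant with the if/else moved inside the loop."""
    lr = current_lr  # stays unchanged if the loop breaks on the first milestone
    for e in dec_epochs:
        if epoch < e:
            break
        if epoch < dec_epochs[-1]:
            idx = dec_epochs.index(e)
            lr = base_lr / (factor ** idx)
        else:
            lr = base_lr / (factor ** len(dec_epochs))
    return lr


# Hypothetical config values, chosen only for illustration.
BASE_LR, DEC_EPOCHS, FACTOR = 1e-3, [17, 21], 10

for epoch in (0, 16, 17, 20, 21):
    original = lr_original(epoch, BASE_LR, DEC_EPOCHS, FACTOR)
    indented = lr_indented(epoch, BASE_LR, DEC_EPOCHS, FACTOR, current_lr=BASE_LR)
    print(epoch, original, indented)
```

Under these assumed values, the original version steps the rate down at each milestone, whereas the indented variant leaves it at the base rate between the first and last milestone, so the first decay is skipped.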

What is your current precision? Which datasets did you use for the training?

zwithz commented 3 years ago

> This means we can refer to e outside the for loop.

Okay, Python... 😳

Here you go. Did you train your model with a ResNet 101 backbone or something else? The following results are based on ResNet 50.

Dataset	Epoch	Time	Speed (it/s)	$AP^{BOX}$ (%)
Baseline				43.80
MuPoTS	0	4:07	1.52	19.28
	1	2:31	1.07	23.41
	2	2:36	1.07	24.97
	3	2:32	1.07	29.92
	4	2:33	1.06	31.72
	5	2:35	1.05	33.29
	6	2:31	1.08	31.80
	7	2:32	1.07	29.69
	8	2:27	1.11	34.75
	9	2:20	1.16	31.70
	10	2:18	1.18	31.87
	11	2:21	1.15	33.80
	12	2:23	1.14	35.45
	13	2:21	1.15	25.34
	14	2:22	1.15	34.55
	15	2:20	1.16	39.24
	16	2:23	1.14	36.85
	17	2:22	1.14	35.46
	18	2:18	1.17	33.22
	19	2:20	1.16	34.09

Dataset	Epoch	Time	Speed (it/s)	$AP^{root}_{25}$ (%)
Baseline				28.50
MuPoTS	0	6:54	1.30	14.61
	1	6:39	1.35	24.18
	2	6:45	1.34	24.86
	3	6:44	1.34	29.28
	4	6:45	1.33	26.47
	5	6:42	1.34	29.96
	6	6:37	1.36	30.34
	7	6:36	1.37	32.85
	8	6:47	1.33	33.15
	9	6:46	1.33	31.04
	10	6:43	1.34	29.09
	11	6:33	1.38	32.77
	12	6:40	1.34	33.70
	13	6:45	1.34	31.85
	14	6:44	1.34	34.97
	15	6:49	1.32	32.55
	16	6:57	1.29	33.42
	17	6:50	1.32	34.47
	18	6:25	1.31	31.85
	19	7:38	1.18	32.79

Dataset	Epoch	Time	Speed (it/s)	MRPE	MRPE_x	MRPE_y	MRPE_z
Baseline				0.386	0.045	0.094	0.353
3DPW	0	2:53	1.60	0.563	0.070	0.143	0.504
	1	2:48	1.65	0.541	0.069	0.138	0.483
	2	2:48	1.65	0.491	0.061	0.116	0.448
	3	2:49	1.64	0.542	0.065	0.136	0.489
	4	2:48	1.65	0.519	0.059	0.122	0.475
	5	2:49	1.64	0.444	0.055	0.109	0.401
	6	2:48	1.65	0.448	0.057	0.106	0.407
	7	2:48	1.65	0.418	0.055	0.109	0.377
	8	2:50	1.63	0.478	0.055	0.112	0.438
	9	2:50	1.63	0.543	0.061	0.113	0.506
	10	2:46	1.67	0.491	0.060	0.114	0.451
	11	2:49	1.64	0.481	0.053	0.112	0.442
	12	2:49	1.64	0.495	0.056	0.107	0.459
	13	2:49	1.64	0.432	0.050	0.099	0.397
	14	2:48	1.65	0.503	0.055	0.098	0.470
	15	2:49	1.64	0.448	0.054	0.097	0.415
	16	2:48	1.65	0.440	0.055	0.096	0.407
	17	2:48	1.65	0.460	0.054	0.095	0.428
	18	2:50	1.63	0.462	0.054	0.097	0.429
	19	3:00	1.54	0.437	0.053	0.095	0.405

Dataset	Epoch	Time	Speed (it/s)	MRPE	MRPE_x	MRPE_y	MRPE_z	Directions	Discussion	Eating	Greeting	Phoning	Posing	Purchases	Sitting	Sitting Down	Smoking	Photo	Waiting	Walking	Walk Dog	Walk Together
Baseline				120.00	23.3	23.0	108.1
Human3.6M	0	0:41	1.66	128.30	33.83	47.78	98.24	75.68	103.26	118.83	106.74	129.65	80.36	100.74	204.66	283.18	132.22	121.89	104.13	90.10	152.61	96.89
	1	0:36	1.86	148.07	28.39	27.99	133.46	124.77	126.71	153.52	132.18	139.22	115.90	145.83	184.37	244.23	160.95	133.51	123.70	134.60	166.31	134.66
	2	0:36	1.86	141.87	26.95	29.71	127.02	130.51	119.52	154.68	128.15	132.46	118.38	138.51	180.12	251.67	145.65	130.00	115.84	108.41	162.61	114.06
	3	0:36	1.88	104.33	25.68	30.44	84.89	68.53	85.59	115.64	103.10	101.17	70.79	102.51	143.14	193.27	100.63	106.72	95.44	75.45	121.34	80.66
	4	0:36	1.88	177.32	27.48	30.24	166.32	166.79	160.97	186.02	157.22	171.35	155.79	176.81	208.59	270.88	189.83	172.30	142.99	146.76	199.91	148.10
	5	0:35	1.91	153.15	26.46	29.38	141.45	128.27	124.61	153.30	135.66	138.08	120.57	162.19	178.03	340.77	147.35	146.93	126.21	116.44	184.68	121.23
	6	0:35	1.90	124.33	24.84	26.26	110.89	125.75	114.95	111.07	128.06	103.07	117.14	135.63	127.50	197.39	116.56	128.53	119.17	115.38	141.77	117.53
	7	0:34	1.95	124.65	24.55	25.97	111.31	92.01	104.58	134.57	107.96	126.24	89.02	116.49	164.28	239.92	131.30	119.69	101.37	93.26	142.97	91.37
	8	0:35	1.91	127.66	24.77	26.11	114.54	131.45	114.40	122.83	132.40	117.81	122.17	127.14	133.71	189.32	133.62	122.78	120.70	110.27	133.39	112.70
	9	0:36	1.87	112.76	23.84	23.23	99.61	106.35	98.66	111.65	110.51	102.71	98.49	108.84	144.35	181.63	116.35	104.50	101.67	92.18	120.19	95.03
	10	0:35	1.92	97.15	23.15	22.76	82.93	81.34	88.57	89.45	99.72	83.58	78.10	100.22	105.95	195.36	92.19	94.14	96.28	75.48	108.98	80.65
	11	0:35	1.93	134.05	23.27	24.56	122.79	137.86	114.01	125.26	131.74	117.92	120.86	134.79	141.23	240.51	133.14	126.71	122.42	115.48	153.39	122.03
	12	0:35	1.90	136.22	23.07	23.29	126.31	130.01	116.07	134.56	128.38	129.03	116.16	138.07	155.80	235.10	142.73	126.19	116.39	113.95	151.77	117.99
	13	0:35	1.92	123.45	22.97	23.04	112.38	126.41	109.05	116.91	123.70	107.77	111.75	124.86	122.90	228.95	120.28	114.80	112.39	105.36	138.63	111.86
	14	0:36	1.87	136.61	24.06	23.36	126.42	143.47	116.17	130.49	139.71	123.66	131.57	133.79	140.94	223.92	134.35	126.87	124.56	126.09	152.46	128.56
	15	0:36	1.89	123.44	23.08	23.69	112.04	118.46	105.14	121.42	120.36	111.06	105.90	120.15	139.26	239.23	119.25	114.12	113.56	102.26	132.18	104.22
	16	0:36	1.85	135.81	23.50	23.18	125.48	124.58	113.57	136.85	123.99	127.35	114.28	141.05	159.83	248.83	139.19	125.14	117.31	108.58	155.80	112.53
	17	0:36	1.88	125.45	22.57	22.59	114.76	127.65	111.01	117.56	124.69	112.39	116.45	125.01	128.80	214.73	123.00	119.02	116.86	108.65	143.69	113.41
	18	0:34	1.95	122.62	22.18	23.05	111.71	125.34	111.14	114.35	124.14	109.78	114.81	122.42	123.42	203.16	120.58	117.28	115.30	107.16	137.71	111.64
	19	0:35	1.93	128.57	22.40	23.04	118.24	130.76	114.70	120.69	127.32	115.82	119.28	129.28	132.06	219.14	127.80	132.36	119.35	110.91	143.87	115.12

mks0601 commented 3 years ago

Please do not copy and paste all the raw data. Just let me know your precision and training data. By the way, why is there AP^box? I don't train a human detection model.

zwithz commented 3 years ago

Sorry for not being clear: AP^{box} here means use_gt_bbox=True.

Train datasets	Test dataset	`use_gt_bbox`	Your precision	My best precision
MuCo + MSCOCO	MuPoTS	True	43.80(AP)	39.24
MuCo + MSCOCO	MuPoTS	False	28.5(AP)	33.70
MuCo + MSCOCO	3DPW	False	0.386(MRPE)	0.418
Human36M Protocol2 + MPII	Human36M	False	120.0(MRPE)	97.15

All numbers above are from training and testing RootNet models; the PoseNet results are more inconsistent.

mks0601 commented 3 years ago

I see. You should evaluate all the snapshots saved during training. The accuracy of RootNet is not very stable because of the high depth ambiguity. However, I confirmed that the performance of PoseNet is stable. Could you let me know your PoseNet results?
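
Checking all snapshots amounts to evaluating each saved epoch and keeping the best one. A minimal sketch using the 3DPW MRPE values posted earlier in this thread (epochs 0 to 19); the variable names are illustrative, and it assumes lower MRPE is better since it is an error metric.

```python
# MRPE on 3DPW for each saved snapshot (epochs 0-19),
# copied from the per-epoch table posted earlier in this thread.
mrpe_3dpw = [
    0.563, 0.541, 0.491, 0.542, 0.519,
    0.444, 0.448, 0.418, 0.478, 0.543,
    0.491, 0.481, 0.495, 0.432, 0.503,
    0.448, 0.440, 0.460, 0.462, 0.437,
]

# MRPE is an error, so the best snapshot is the one with the smallest value.
best_epoch = min(range(len(mrpe_3dpw)), key=mrpe_3dpw.__getitem__)
best_mrpe = mrpe_3dpw[best_epoch]

# Gap between the worst and best snapshot, as a rough measure of instability.
spread = max(mrpe_3dpw) - min(mrpe_3dpw)

print(f"best epoch: {best_epoch}, MRPE: {best_mrpe}")
print(f"spread across snapshots: {spread:.3f}")
```

For these numbers the best snapshot is epoch 7 with an MRPE of 0.418, which matches the best 3DPW figure in the summary table above.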

zwithz commented 3 years ago

RootNet

The test results for all the snapshots are in my earlier comment. I tested several times and the results stayed almost the same. https://github.com/mks0601/3DMPPE_ROOTNET_RELEASE/issues/31#issuecomment-795126548

PoseNet

Train datasets	Test dataset	Eval Metric	Your precision	My best precision
MuCo + MSCOCO	MuPoTS	Sequence-wise 3DPCK_{rel} & Accuracy for all groundtruths	81.8(Avg)	79.6
MuCo + MSCOCO	MuPoTS	Sequence-wise 3DPCK_{rel} & Accuracy only for matched groundtruths	82.5(Avg)	80.91
Human36M Protocol2 + MPII	Human36M	MPJPE	53.3	54.34

Maybe the differences between your precision and mine are all negligible? I should modify the eval program to compute the average precision for me; otherwise there will be too many CSV files to handle. Thanks for your prompt reply! Hope you have a great day~

mks0601 commented 3 years ago

I think the differences in the PoseNet results are not that major, but I'm not sure. Please let me know if you can't reproduce my results. Thanks!
