Closed tangyudi closed 8 years ago
Hi @shihenw, can you see these?
Hi tangyudi,
(1) You are right that we did not use a test phase in Caffe, because it increases memory usage and slows down training. If you want to monitor training on the fly, you can run the testing code (or any other profiling code of your own) on each intermediate dumped caffemodel.
(2) We used an L2 loss. The number you see in the training log is the sum of squared differences over all heatmap pixels, averaged over the batched samples. You can check EuclideanLossLayer in Caffe to verify; we did not modify it.
(3) My final loss reaches 8~10 after 400K iterations on the LSP dataset.
(4) I guess the 150G lmdb comes from the MPI+LSP dataset? My MPI-only lmdb is 113G. The reason it is bigger than the original dataset is that (i) many images are duplicated because there are multiple people in the same image, and (ii) there is some wasted space in the 4th channel, which I use to store raw labels (though with lmdb's serialization this should be fine).
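For reference, the per-stage loss described in (2) can be sketched in NumPy. Note that Caffe's EuclideanLossLayer computes sum((a - b)^2) / (2 * N), i.e. it also divides by 2; the shapes below are illustrative only, not the actual CPM heatmap dimensions.

```python
import numpy as np

def euclidean_loss(pred, gt):
    """Sketch of Caffe's EuclideanLossLayer on heatmaps.

    pred, gt: arrays of shape (batch, joints, H, W).
    Returns the sum of squared per-pixel differences, divided by
    2 * batch size (Caffe's convention).
    """
    batch = pred.shape[0]
    return np.sum((pred - gt) ** 2) / (2.0 * batch)

# Toy example: 2 samples, 1 joint, 4x4 heatmaps, all pixels off by 1.0.
pred = np.zeros((2, 1, 4, 4))
gt = np.ones((2, 1, 4, 4))
print(euclidean_loss(pred, gt))  # 32 squared diffs / (2 * 2) = 8.0
```

So a logged loss of ~25 per stage simply reflects the summed squared heatmap error at that point in training, not a per-joint coordinate error.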
Hi @shihenw,
the loss you mentioned on LSP (8-10): is this the combined loss or the per-stage loss?
@ds2268 it's per stage (specifically, that of the last stage).
Hi, I want to train the model, but there are some things I do not understand.
(1) I only see a train lmdb; does this mean the network does not need a test phase?
(2) Does the loss of each stage mean square(predicted coordinate - ground truth)? If the loss is 25, can you tell me how it is calculated?
(3) After some iterations the loss is about 25; is that right, and what was your final loss?
(4) The train lmdb is 150G; why is it so big?
Thank you for the help! Really looking forward to the Python version.