princeton-vl / pose-hg-train

Training and experimentation code used for "Stacked Hourglass Networks for Human Pose Estimation"

inaccuracy caused by transform #33

Closed xizero00 closed 7 years ago

xizero00 commented 7 years ago

Hi newell,

I noticed that the transform function introduces inaccuracy:

function transform(pt, center, scale, rot, res, invert)
    local pt_ = torch.ones(3)
    pt_[1],pt_[2] = pt[1]-1,pt[2]-1

    local t = getTransform(center, scale, rot, res)
    if invert then
        t = torch.inverse(t)
    end
    local new_point = (t*pt_):sub(1,2)

    return new_point:int():add(1)
end

The return value is cast to int, which truncates the coordinates and introduces the inaccuracy.

In the following test code, the norm of diff turns out to be very large. That's a big problem.

-- Lua booleans are lowercase; False/True would be undefined globals (nil)
t_pt = transform(pt, center, scale, rot, res, false)
recovered_pt = transform(t_pt, center, scale, rot, res, true)
diff = pt-recovered_pt
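The round trip above can be reproduced without Torch. The following is a minimal sketch in plain Lua, assuming a pure scaling transform in place of getTransform (which is not shown in this thread). The names to_output and to_input and the resolutions 256 and 64 are hypothetical and chosen only for illustration.

```lua
-- Hypothetical stand-in for transform(): pure scaling, no rotation or crop.
-- in_res and out_res are assumed values, not taken from the repository.
local in_res, out_res = 256, 64

local function to_output(x, truncate)
    local y = (x - 1) * out_res / in_res     -- 1-based to 0-based, then scale down
    if truncate then
        y = math.floor(y)                    -- mimics new_point:int() for non-negative values
    end
    return y + 1                             -- back to 1-based
end

local function to_input(y)
    return (y - 1) * in_res / out_res + 1    -- inverse mapping, left unrounded
end

local x = 104
local lossy = to_input(to_output(x, true))   -- round trip with truncation
local exact = to_input(to_output(x, false))  -- round trip without truncation
print(x - lossy, x - exact)
```

With these assumed resolutions, the truncated round trip maps 104 back to 101, so the point is off by 3 input pixels, while the untruncated round trip recovers it exactly. In general the error can approach in_res / out_res pixels per axis.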

In train.lua, you use the following code to compensate for this problem during validation or testing. Am I right?

 -- Validation: Get flipped output
output = applyFn(function (x) return x:clone() end, output)
local flippedOut = model:forward(flip(input))
flippedOut = applyFn(function (x) return flip(shuffleLR(x)) end, flippedOut)
output = applyFn(function (x,y) return x:add(y):div(2) end, output, flippedOut)

anewell commented 7 years ago

Yes, some detail is lost by rounding to an integer when predicting at a lower output resolution. We address this with a somewhat "hacky" postprocessing step (see https://github.com/anewell/pose-hg-train/blob/master/src/util/pose.lua#L85), which provides sufficient localization precision for our purposes.
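One common way to recover some sub-pixel precision from an integer heatmap peak is to nudge it a quarter pixel toward the stronger neighbour. The sketch below illustrates that idea in plain Lua. It is an assumption about what such a postprocessing step might look like and may not match the linked code; refine_peak and the sample heatmap are hypothetical.

```lua
-- Hypothetical helper: refine an integer peak (px, py) on a 2-D heatmap
-- stored as a plain Lua table hm[row][col]. Not the repository's code.
local function refine_peak(hm, px, py)
    local rows, cols = #hm, #hm[1]
    local x, y = px, py
    -- Only refine interior peaks, where both neighbours exist.
    if px > 1 and px < cols and py > 1 and py < rows then
        local dx = hm[py][px + 1] - hm[py][px - 1]
        local dy = hm[py + 1][px] - hm[py - 1][px]
        -- Move a quarter pixel toward the larger neighbour on each axis.
        if dx > 0 then x = x + 0.25 elseif dx < 0 then x = x - 0.25 end
        if dy > 0 then y = y + 0.25 elseif dy < 0 then y = y - 0.25 end
    end
    return x, y
end

-- Sample 3x3 heatmap with its peak at column 2, row 2.
local hm = {
    {0.0, 0.1, 0.0},
    {0.2, 0.9, 0.6},
    {0.0, 0.3, 0.0},
}
local x, y = refine_peak(hm, 2, 2)
print(x, y)
```

Here the right and lower neighbours are larger than their opposites, so the refined peak moves from (2, 2) to (2.25, 2.25). The offset is applied before mapping back to the input resolution, so it reduces the error that integer rounding introduces at the lower output resolution without removing it.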

xizero00 commented 7 years ago

@anewell Thank you very much.