zhangyux15 / 4d_association

code for cvpr2020 "4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras"
185 stars 31 forks source link

detect中的txt获取方法 How to get txts in detect fold. #17

Open XGenietony opened 2 years ago

XGenietony commented 2 years ago

关于paf我大概跑出了一个可用结果,但标签读取还是问题非常大。在detect读取时第一帧可以从debug看到读取结果包括正确的joint和paf(以下为视角1的第一帧joint和pafs值展示) 0 1

但是从第二帧开始就出现明显错误的读取(以下为视角1的第二帧joint和pafs值展示) 2 3

后面几帧的数据情况也几乎都是如此,我怀疑是否有一些tab或者必要的占位符我未能加入导致标签读取异常。 现在的标签格式已经被组织成如下形式: joint 4

pafs: 5

两帧分割情况: 6

其他两个view也是相同的情况,我很难理解,如果我的paf组织有一定问题,那也可能会导致第二帧开始无法读取joint坐标,而不是出现一个-8.28.......e+11的错误值,我也无法理解这个值到底是怎么算出来的。希望能给出一个标准的detect标签模板,方便我们在其他数据集上复现结果.

以上为询问作者的邮件内容和我个人到现阶段的工作。但还无法在自己的视频中复现出结果,希望能有同样在努力复现此工作的同学讨论。

About the coordinates of joints in txt, i get the coordinates by openpose demohttps://github.com/CMU-Perceptual-Computing-Lab/openpose.git. The data structure looks like this: 1 0.210808 (x) 0.936698 (y) 0.760037 (score)

we can get these by set parameters '--keypoint_scale' as 3 and write them in json. The tip in Openpose: DEFINE_int32(keypoint_scale, 0, "Scaling of the (x,y) coordinates of the final pose data array, i.e., the scale of the (x,y) coordinates that will be saved with the write_json & write_keypoint flags. Select 0 to scale it to the original source resolution; 1to scale it to the net output size (set with net_resolution); 2 to scale it to the final output size (set with resolution); 3 to scale it in the range [0,1], where (0,0) would be the top-left corner of the image, and (1,1) the bottom-right one; and 4 for range [-1,1], where (-1,-1) would be the top-left corner of the image, and (1,1) the bottom-right one. Non related with scale_number and scale_gap.");https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/doc/advanced/demo_advanced.md#all-flags

the pafs can be got by this #5 or https://arvrjourney.com/human-pose-estimation-using-openpose-with-tensorflow-part-2-e78ab9104fc8

i use these codes calculate values look like pafs, but not sure.

def get_score(x1, y1, x2, y2, pafMatX, pafMatY):
    num_inter = 10
    dx, dy = x2 - x1, y2 - y1
    normVec = math.sqrt(dx ** 2 + dy ** 2)

    if normVec < 1e-4:
        return 0.0, 0

    vx, vy = dx / normVec, dy / normVec

    xs = np.arange(x1, x2, dx / num_inter) if x1 != x2 else np.full((num_inter, ), x1)
    ys = np.arange(y1, y2, dy / num_inter) if y1 != y2 else np.full((num_inter, ), y1)
    xs = (xs + 0.5).astype(np.int8)
    ys = (ys + 0.5).astype(np.int8)

    # without vectorization
    pafXs = np.zeros(num_inter)
    pafYs = np.zeros(num_inter)
    for idx, (mx, my) in enumerate(zip(xs, ys)):
        pafXs[idx] = pafMatX[my][mx]
        pafYs[idx] = pafMatY[my][mx]

    # vectorization slow?
    # pafXs = pafMatX[ys, xs]
    # pafYs = pafMatY[ys, xs]

    local_scores = pafXs * vx + pafYs * vy
    thidxs = local_scores > Inter_Threashold

    return sum(local_scores * thidxs), sum(thidxs)
jjkislele commented 2 years ago

Actually, there is no need to arrange the data format with \t or \n. Cause in the source code, they only use simple command fs >> detection.joints[jIdx](i, j); and fs >> detection.pafs[pafIdx](i, j); to read the data, the float number respectively.

The way of your PAF code is right. The point is, SADLY, the way they used and announced in the paper is not as same as their code goes, at least not as same as I thought.

879917820 commented 2 years ago

Actually, there is no need to arrange the data format with or . Cause in the source code, they only use simple command and to read the data, the float number respectively.\t``\n``fs >> detection.joints[jIdx](i, j);``fs >> detection.pafs[pafIdx](i, j);

The way of your PAF code is right. The point is, SADLY, the way they used and announced in the paper is not as same as their code goes, at least not as same as I thought.

I use this code to calculate the pafs score, but the output is [[0.93091998, 0.69100239],[0.67352021, 0.78826275]]. It is very different from [[0,0.865738],[0.781756,0]]. Are pafXs and pafYs obtained through --heatmaps_add_PAFs in openpose?

879917820 commented 2 years ago

The shape of the PAFS heatmap I obtained is (52, H, W), and it is a one-channel grayscale image. The score I calculated using this heatmap is like this [[237.38459491, -57.92792818],[171.74765283, 201.00700028]]. Do you know that my PAFS heatmap is correct?

879917820 commented 2 years ago

关于paf我大概跑出了一个可用结果,但标签读取还是问题非常大。在detect读取时第一帧可以从debug看到读取结果包括正确的joint和paf(以下为视角1的第一帧joint和pafs值展示) 0 1

但是从第二帧开始就出现明显错误的读取(以下为视角1的第二帧joint和pafs值展示) 2 3

后面几帧的数据情况也几乎都是如此,我怀疑是否有一些tab或者必要的占位符我未能加入导致标签读取异常。 现在的标签格式已经被组织成如下形式: joint 4

pafs: 5

两帧分割情况: 6

其他两个view也是相同的情况,我很难理解,如果我的paf组织有一定问题,那也可能会导致第二帧开始无法读取joint坐标,而不是出现一个-8.28.......e+11的错误值,我也无法理解这个值到底是怎么算出来的。希望能给出一个标准的detect标签模板,方便我们在其他数据集上复现结果.

以上为询问作者的邮件内容和我个人到现阶段的工作。但还无法在自己的视频中复现出结果,希望能有同样在努力复现此工作的同学讨论。

About the coordinates of joints in txt, i get the coordinates by openpose demohttps://github.com/CMU-Perceptual-Computing-Lab/openpose.git. The data structure looks like this: 1 0.210808 (x) 0.936698 (y) 0.760037 (score)

we can get these by set parameters '--keypoint_scale' as 3 and write them in json. The tip in Openpose: DEFINE_int32(keypoint_scale, 0, "Scaling of the (x,y) coordinates of the final pose data array, i.e., the scale of the (x,y) coordinates that will be saved with the write_json & write_keypoint flags. Select 0 to scale it to the original source resolution; 1to scale it to the net output size (set with net_resolution); 2 to scale it to the final output size (set with resolution); 3 to scale it in the range [0,1], where (0,0) would be the top-left corner of the image, and (1,1) the bottom-right one; and 4 for range [-1,1], where (-1,-1) would be the top-left corner of the image, and (1,1) the bottom-right one. Non related with scale_number and scale_gap.");https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/doc/advanced/demo_advanced.md#all-flags

the pafs can be got by this #5 or https://arvrjourney.com/human-pose-estimation-using-openpose-with-tensorflow-part-2-e78ab9104fc8

i use these codes calculate values look like pafs, but not sure.

def get_score(x1, y1, x2, y2, pafMatX, pafMatY):
    num_inter = 10
    dx, dy = x2 - x1, y2 - y1
    normVec = math.sqrt(dx ** 2 + dy ** 2)

    if normVec < 1e-4:
        return 0.0, 0

    vx, vy = dx / normVec, dy / normVec

    xs = np.arange(x1, x2, dx / num_inter) if x1 != x2 else np.full((num_inter, ), x1)
    ys = np.arange(y1, y2, dy / num_inter) if y1 != y2 else np.full((num_inter, ), y1)
    xs = (xs + 0.5).astype(np.int8)
    ys = (ys + 0.5).astype(np.int8)

    # without vectorization
    pafXs = np.zeros(num_inter)
    pafYs = np.zeros(num_inter)
    for idx, (mx, my) in enumerate(zip(xs, ys)):
        pafXs[idx] = pafMatX[my][mx]
        pafYs[idx] = pafMatY[my][mx]

    # vectorization slow?
    # pafXs = pafMatX[ys, xs]
    # pafYs = pafMatY[ys, xs]

    local_scores = pafXs * vx + pafYs * vy
    thidxs = local_scores > Inter_Threashold

    return sum(local_scores * thidxs), sum(thidxs)

你好,请问你现在复现出作者的结果了吗?

XGenietony commented 2 years ago

关于paf我大概跑出了一个可用结果,但标签读取还是问题非常大。在detect读取时第一帧可以从debug看到读取结果包括正确的joint和paf(以下为视角1的第一帧joint和pafs值展示) 0 1 但是从第二帧开始就出现明显错误的读取(以下为视角1的第二帧joint和pafs值展示) 2 3 后面几帧的数据情况也几乎都是如此,我怀疑是否有一些tab或者必要的占位符我未能加入导致标签读取异常。 现在的标签格式已经被组织成如下形式: joint 4 pafs: 5 两帧分割情况: 6 其他两个view也是相同的情况,我很难理解,如果我的paf组织有一定问题,那也可能会导致第二帧开始无法读取joint坐标,而不是出现一个-8.28.......e+11的错误值,我也无法理解这个值到底是怎么算出来的。希望能给出一个标准的detect标签模板,方便我们在其他数据集上复现结果. 以上为询问作者的邮件内容和我个人到现阶段的工作。但还无法在自己的视频中复现出结果,希望能有同样在努力复现此工作的同学讨论。 About the coordinates of joints in txt, i get the coordinates by openpose demohttps://github.com/CMU-Perceptual-Computing-Lab/openpose.git. The data structure looks like this: 1 0.210808 (x) 0.936698 (y) 0.760037 (score) we can get these by set parameters '--keypoint_scale' as 3 and write them in json. The tip in Openpose: DEFINE_int32(keypoint_scale, 0, "Scaling of the (x,y) coordinates of the final pose data array, i.e., the scale of the (x,y) coordinates that will be saved with the write_json & write_keypoint flags. Select 0 to scale it to the original source resolution; 1to scale it to the net output size (set with net_resolution); 2 to scale it to the final output size (set with resolution); 3 to scale it in the range [0,1], where (0,0) would be the top-left corner of the image, and (1,1) the bottom-right one; and 4 for range [-1,1], where (-1,-1) would be the top-left corner of the image, and (1,1) the bottom-right one. Non related with scale_number and scale_gap.");https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/doc/advanced/demo_advanced.md#all-flags the pafs can be got by this #5 or https://arvrjourney.com/human-pose-estimation-using-openpose-with-tensorflow-part-2-e78ab9104fc8 i use these codes calculate values look like pafs, but not sure.

def get_score(x1, y1, x2, y2, pafMatX, pafMatY):
    num_inter = 10
    dx, dy = x2 - x1, y2 - y1
    normVec = math.sqrt(dx ** 2 + dy ** 2)

    if normVec < 1e-4:
        return 0.0, 0

    vx, vy = dx / normVec, dy / normVec

    xs = np.arange(x1, x2, dx / num_inter) if x1 != x2 else np.full((num_inter, ), x1)
    ys = np.arange(y1, y2, dy / num_inter) if y1 != y2 else np.full((num_inter, ), y1)
    xs = (xs + 0.5).astype(np.int8)
    ys = (ys + 0.5).astype(np.int8)

    # without vectorization
    pafXs = np.zeros(num_inter)
    pafYs = np.zeros(num_inter)
    for idx, (mx, my) in enumerate(zip(xs, ys)):
        pafXs[idx] = pafMatX[my][mx]
        pafYs[idx] = pafMatY[my][mx]

    # vectorization slow?
    # pafXs = pafMatX[ys, xs]
    # pafYs = pafMatY[ys, xs]

    local_scores = pafXs * vx + pafYs * vy
    thidxs = local_scores > Inter_Threashold

    return sum(local_scores * thidxs), sum(thidxs)

你好,请问你现在复现出作者的结果了吗?

我用上文说的PAF算法算了一下分数,然后进行计算得到了一份标签。输入模型以后发现有非常多的两个目标关节互相错连的情况,这导致目标追踪无法完成,更无法作出理想的结果了。

jjkislele commented 2 years ago

@XGenietony @879917820 Almost, You almost touch the truth. Some of the PAF scores should be set to 0, as the same as the original "provided" detection result goes.

Feyily commented 2 years ago

@XGenietony I have reproduced the author's results , if you still need some help?

JonathanLehner commented 2 years ago

@Feyily you reproduced the results? how did you write the .txt files? Would you mind to talk on Wechat jonathanlehner2

ZCRNK commented 2 years ago

@Feyily
Hi :) I am currently trying to reproduce the author's results but fail to obtain the same paf association scores.

As input for my transformation code I use the keypoints obtained by enabling the write_json flag and the PAF heatmaps obtained by enabling the add_heatmaps_pafs when running openpose.

The keypoints my transformation outputs are more or less the same as on the original detection .txt files. However the probability score of the keypoints I obtain differ quite a bit from those on the detection .txt files.

Above all, however, the association scores I obtain are very different from those found on the detection .txt files. I compute the scores according to the code posted in this thread and I actually do not understand where it defers semantically from the code used in Openpose for calculating the association scores (https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/src/openpose/net/bodyPartConnectorBase.cpp) apart from some very minor differences like the amount of intermediate steps used for summing up the local scores. Even after adjusting these minor differences I still get the same wrong results.

This is my code (which should produce the keypoints and association scores for one frame of the video): `import array import json import numpy as np import math

from https://github.com/zhangyux15/4d_association/issues/17

adapted according to https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/src/openpose/net/bodyPartConnectorBase.cpp

def get_score(x1, y1, x2, y2, pafMatX, pafMatY):

dx, dy = x2 - x1, y2 - y1
normVec = math.sqrt(dx ** 2 + dy ** 2)

max_d = max(abs(dx), abs(dy))
num_inter = max(5, min(25, round(math.sqrt(5*max_d))))

if normVec < 1e-6:
    return 0.0, 0

vx, vy = dx / normVec, dy / normVec

xs = np.arange(x1, x2, dx / num_inter) if x1 != x2 else np.full((num_inter, ), x1)
ys = np.arange(y1, y2, dy / num_inter) if y1 != y2 else np.full((num_inter, ), y1)
xs = (xs + 0.5).astype(np.int8)
ys = (ys + 0.5).astype(np.int8)

pafXs = np.zeros(num_inter)
pafYs = np.zeros(num_inter)
for idx, (mx, my) in enumerate(zip(xs, ys)):
    pafXs[idx] = pafMatX[my][mx]
    pafYs[idx] = pafMatY[my][mx]

local_scores = pafXs * vx + pafYs * vy
thidxs = local_scores > 0.3

if (sum(thidxs)/num_inter) > 0.3:
    return round((sum(np.multiply(local_scores, thidxs))/ sum(thidxs)), 6)

else:
    return 0

READ IN KEYPOINTS DATA

open JSON file

f = open('keypoints.json')

return JSON object as a dictionary

data = json.load(f)

initialize array of size 25 X 3 X 1 (#body_parts X #dims X #detected_persons)

arr= [[[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]]]

arr2= [[[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]], [[],[],[]]]

fill array

l = len(data.get("people"))

for i in range(0, l): kps = data.get("people")[i].get("pose_keypoints_2d") l2 = int(len(kps)/3) assert(l2==25)

for j in range (0, l2):
    arr[j][0].append(kps[3*j])
    arr[j][1].append(kps[3*j+1])
    arr[j][2].append(kps[3*j+2])

remove all keypoints with probability score < threshold

for i in range(0, 25): l2 = len(arr[i][0]) for j in range(0, l2): if arr[i][2][j] > 0.20: arr2[i][0].append(arr[i][0][j]) arr2[i][1].append(arr[i][1][j]) arr2[i][2].append(arr[i][2][j])

WRITE KEYPOINTS DATA

t = open('keypoints.txt', 'w')

def condition(x): return x > 0.5

for i in range (0, 25): l = sum(condition(x) for x in arr[i][2]) t.write(str(l)+"\n")

l2 = len(arr2[i][0])

for j in range(0, l2):
    t.write(str(arr2[i][0][j])+" ")
t.write("\n")

for j in range(0, l2):
    t.write(str(arr2[i][1][j])+" ")
t.write("\n")

for j in range(0, l2):
    t.write(str(arr2[i][2][j])+" ")
t.write("\n")

COMPUTE AND WRITE ASSOCIATION SCORES

x = np.fromfile('heatmaps.float', dtype=np.float32)

assert x[0] == 3 # First parameter saves the number of dimensions (18x300x500 = 3 dimensions)

shape_x = x[1:1+int(x[0])] d1 = int(shape_x[0]) # Size of the first dimension = number PAFS d2 = int(shape_x[1]) # Size of the second dimension = y-direction d3 = int(shape_x[2]) # Size of the third dimension = x-direction pafs = x[1+int(round(x[0])):]

pafs = pafs.reshape(d1, d2, d3)

define the body pairs according to skeleton BODY_25

pairs = [[1,8], [1,2], [1,5], [2,3], [3,4], [5,6], [6,7], [8,9], [9,10], [10,11], [8,12], [12,13], [13,14], [1,0], [0,15], [15,17], [0,16], [16,18], [2,17], [5,18], [14,19], [19,20], [14,21], [11,22], [22,23], [11,24]]

l_pairs = len(pairs)

for i in range(0, l_pairs): p1= pairs[i][0] p2= pairs[i][1]

lp1= len(arr2[p1][0])
lp2= len(arr2[p2][0])

#iterate through p1
for j in range(0, lp1):
    #iterate through p2
    for k in range (0, lp2):
        #get x and y values and scale them (original x and y values are between 0 and 1)
        x1= arr2[p1][0][j] * d3
        x2= arr2[p2][0][k] * d3
        y1= arr2[p1][1][j] * d2
        y2= arr2[p2][1][k] * d2

        score = get_score(x1, y1, x2, y2, pafs[2*i], pafs[2*i+1])
        t.write(str(score)+" ")
    t.write("\n")`

Those are the results I obtain:

keypoints.txt

This would be the correct result:

target.txt

0Xiao0 commented 1 year ago

@XGenietony I have reproduced the author's results , if you still need some help? @Feyily hello, can you tell me the way to get txts in detect folder?