garyzhao / SemGCN

The PyTorch implementation of "Semantic Graph Convolutional Networks for 3D Human Pose Regression" (CVPR 2019).
https://arxiv.org/abs/1904.03345
Apache License 2.0

It seems that the code for "SemGraphConv" is different from the paper #7

Closed · luzzou closed this issue 4 years ago

luzzou commented 4 years ago

Hi, thank you for releasing the code, it's great. I have a question about the graph convolution operation.

It seems that Eq. 2 in the paper performs F.softmax(M * A); however, the released code performs F.softmax(A) * M. Is there any difference, and which one is better?
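
For concreteness, here is a toy sketch (my own, not the repo's code) of the two orderings, writing * for element-wise multiplication:

    import torch
    import torch.nn.functional as F

    # Made-up 3-joint adjacency A and weighting matrix M, for illustration only.
    A = torch.tensor([[1., 1., 0.],
                      [1., 1., 1.],
                      [0., 1., 1.]])
    M = torch.rand(3, 3)

    paper_order = F.softmax(M * A, dim=1)  # weight first, then normalise
    code_order = F.softmax(A, dim=1) * M   # normalise first, then weight

    print(torch.allclose(paper_order, code_order))  # generally False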

fabro66 commented 4 years ago

Hi, I also have a question about Eq. 3, which seems different from the code. The paper says that M (M_d in the paper) is a learnable parameter, but in the published source code, M is not learnable.

Eq. 3:

    h0 = torch.matmul(input, self.W[0])  # features transformed with the weights for self-connections
    h1 = torch.matmul(input, self.W[1])  # features transformed with the weights for neighbouring joints

    adj = -9e15 * torch.ones_like(self.adj).to(input.device)  # non-edges get a huge negative score
    adj[self.m] = self.e                                      # scatter self.e into the edge positions self.m
    adj = F.softmax(adj, dim=1)                               # row-wise normalisation; non-edges become ~0

    M = torch.eye(adj.size(0), dtype=torch.float).to(input.device)  # fixed identity matrix
    output = torch.matmul(adj * M, h0) + torch.matmul(adj * (1 - M), h1)  # self term + neighbour term
garyzhao commented 4 years ago

> Hi, thank you for releasing the code, it's great. I have a question about the graph convolution operation.
>
> It seems that Eq. 2 in the paper performs F.softmax(M * A); however, the released code performs F.softmax(A) * M. Is there any difference, and which one is better?

Hi @trumDog

The released code is implemented following Eq. 2 in the paper; however, the notation is different.

The 'M * A' in Eq. 2 of the paper is implemented by:

    adj = -9e15 * torch.ones_like(self.adj).to(input.device)
    adj[self.m] = self.e

in models/sem_graph_conv.py

Meanwhile, the 'M' in the code is an identity matrix, which implements Eq. 9. Please check Sect. A.2 (page 12) in https://arxiv.org/pdf/1904.03345.pdf
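
To make the correspondence concrete, here is a small toy sketch (mine, not the code from this repo) of how masking non-edges with a large negative value before the softmax applies the softmax only over the weighted edges of M * A:

    import torch
    import torch.nn.functional as F

    # Toy 3-joint adjacency and edge weights; the values are illustrative only.
    adj_binary = torch.tensor([[1., 1., 0.],
                               [1., 1., 1.],
                               [0., 1., 1.]])
    edge_mask = adj_binary > 0                       # positions of existing edges (like self.m)
    edge_weights = torch.rand(int(edge_mask.sum()))  # plays the role of self.e

    scores = torch.full_like(adj_binary, -9e15)  # non-edges get a huge negative score
    scores[edge_mask] = edge_weights             # edges get their (learned) weights, i.e. M * A on the graph
    attn = F.softmax(scores, dim=1)

    print(attn)  # rows sum to 1, and entries are ~0 wherever there is no edge

A literal softmax(M * A) would also let the zeros of A contribute exp(0) = 1 to each row, which is presumably another reason the code masks non-edges with -9e15 instead.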

Best, Long

garyzhao commented 4 years ago

> Hi, I also have a question about Eq. 3, which seems different from the code. The paper says that M (M_d in the paper) is a learnable parameter, but in the published source code, M is not learnable.
>
> (code as in the snippet quoted above)

Hi @fabro66

The notation in the code is not the same as in the paper.

The 'M' in the code is an identity matrix, which implements Eq. 9 in Sect. A.2 (page 12) of https://arxiv.org/pdf/1904.03345.pdf

The 'M * A' in the paper, meanwhile, is implemented by:

    adj = -9e15 * torch.ones_like(self.adj).to(input.device)
    adj[self.m] = self.e

in models/sem_graph_conv.py, where self.e (which holds all the non-zero elements of M in the paper) is learnable.
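
As a rough sketch of that pattern (not the repo's exact code), the learnable edge weights can be declared as an nn.Parameter over the non-zero positions of a fixed adjacency and scattered back into the score matrix before the softmax:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class EdgeWeightedSoftmax(nn.Module):
        """Toy module: learnable weights on the edges of a fixed graph."""
        def __init__(self, adj):
            super().__init__()
            self.register_buffer('adj', adj)     # fixed (float) adjacency
            self.register_buffer('m', adj > 0)   # fixed edge mask
            # One learnable scalar per existing edge (the role of self.e).
            self.e = nn.Parameter(torch.ones(int(self.m.sum())))

        def forward(self):
            scores = torch.full_like(self.adj, -9e15)  # suppress non-edges
            scores[self.m] = self.e                    # learned edge weights
            return F.softmax(scores, dim=1)            # row-normalised attention

Gradients flow into self.e through the masked assignment, so the edge weights are updated during training while the mask itself stays fixed.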

BTW, Eq. 2 is implemented in models/sem_graph_conv.py, and Eq. 3 is implemented in models/sem_ch_graph_conv.py

Best, Long

luzzou commented 4 years ago

@garyzhao Thank you for your reply, I got it. I still have a question about Figure 6 in Appendix A. In h36m_dataset.py, the joint groups used in the non-local blocks are defined as: h36m_skeleton_joints_group = [[2, 3], [5, 6], [1, 4], **[0, 7]**, [8, 9], [14, 15], [11, 12], [10, 13]].
I just cannot find which joint stands for the 'Spine' joint (index 7), and I see there is a circle between 'Pelvis' (index 0) and 'Thorax' (index 8); do 'Spine' and 'Thorax' represent the same joint?

garyzhao commented 4 years ago

@trumDog Please check the attached picture. It shows the original indexes (17 joints, in [*]) defined in this repo. Then, in order to align with the output of Hourglass, we remove the neck joint, and the new joint indexes become the ones in blue circles (16 joints). These indexes are used to define h36m_skeleton_joints_group, so 'Spine' and 'Thorax' are not the same here.
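
For readers trying to reproduce the mapping, here is a small sketch of the reindexing step; the 17-joint ordering and joint names below are illustrative assumptions, not necessarily the exact lists used in this repo:

    # Hypothetical 17-joint ordering, only to illustrate how removing one joint
    # shifts the indexes of everything after it.
    joints_17 = ['Pelvis', 'RHip', 'RKnee', 'RFoot', 'LHip', 'LKnee', 'LFoot',
                 'Spine', 'Thorax', 'Neck', 'Head', 'LShoulder', 'LElbow',
                 'LWrist', 'RShoulder', 'RElbow', 'RWrist']

    removed = {'Neck'}  # dropped to align with the 16-joint Hourglass output
    joints_16 = [j for j in joints_17 if j not in removed]

    # Old index -> new index for the joints that remain.
    remap = {joints_17.index(j): i for i, j in enumerate(joints_16)}

    print(joints_16.index('Spine'))   # 7
    print(joints_16.index('Thorax'))  # 8, i.e. a different joint from 'Spine'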

Please note that the joint names in this repo are a little bit different from the ones in the paper. The names in the paper are defined according to "Compositional Human Pose Regression, ICCV 2017".

Best, Long

[Attached image: joint index diagram showing the original 17-joint indexes in [*] and the remapped 16-joint indexes in blue circles]

luzzou commented 4 years ago

@garyzhao That's great, thank you very much!

luzzou commented 4 years ago

@garyzhao Maybe 'Neck' and 'Thorax' in Figure 6 (left) in the Appendix should be replaced by 'Thorax' and 'Spine', respectively.

fabro66 commented 4 years ago

Thank you for your detailed reply. Great work~