AutodeskAILab / UV-Net

Code for UV-Net: Learning from Boundary Representations, CVPR 2021.
MIT License

Question on Backpropagation chain from classification back to UV-grid values #9

Closed Danelrf closed 2 years ago

Danelrf commented 2 years ago

Hi all,

I have a question regarding the learning process. During training, UV-Net not only fits the graph weights (topological level) through message passing of the 64-D embeddings on nodes and edges, but also trains these embeddings through regular 2D and 1D convolutions over the original UV-grids (geometric level).

My question is: how does a UV-Net model know how to calculate the gradients at the geometric level during backpropagation, when classification happens at the topological (graph) level?

Or put differently, when the weights of graph nodes and edges have been optimized, how does propagation continue all the way back throughout the face/edge embedding process?

I believe there is a disconnect between these two concepts, but in the source code they seem to work as one.

I am currently trying to apply some explainability methods to the model, and this missing link has me stumped.

From the original code:

```python
class UVNetClassifier(nn.Module):
    # UV-Net solid classification model

    def __init__(
        self,
        num_classes,
        crv_emb_dim=64,
        srf_emb_dim=64,
        graph_emb_dim=128,
        dropout=0.3,
    ):
        """
        Initialize the UV-Net solid classification model

        Args:
            num_classes (int): Number of classes to output
            crv_emb_dim (int, optional): Embedding dimension for the 1D edge UV-grids. Defaults to 64.
            srf_emb_dim (int, optional): Embedding dimension for the 2D face UV-grids. Defaults to 64.
            graph_emb_dim (int, optional): Embedding dimension for the graph. Defaults to 128.
            dropout (float, optional): Dropout for the final non-linear classifier. Defaults to 0.3.
        """
        super().__init__()
        self.curv_encoder = uvnet.encoders.UVNetCurveEncoder(
            in_channels=6, output_dims=crv_emb_dim
        )
        self.surf_encoder = uvnet.encoders.UVNetSurfaceEncoder(
            in_channels=7, output_dims=srf_emb_dim
        )
        self.graph_encoder = uvnet.encoders.UVNetGraphEncoder(
            srf_emb_dim, crv_emb_dim, graph_emb_dim,
        )
        self.clf = _NonLinearClassifier(graph_emb_dim, num_classes, dropout)

    def forward(self, batched_graph):
        """
        Forward pass

        Args:
            batched_graph (dgl.Graph): A batched DGL graph containing the face 2D UV-grids in node features
                                       (ndata['x']) and 1D edge UV-grids in the edge features (edata['x']).

        Returns:
            torch.tensor: Logits (batch_size x num_classes)
        """
        # Input features
        input_crv_feat = batched_graph.edata["x"]
        input_srf_feat = batched_graph.ndata["x"]
        # Compute hidden edge and face features
        hidden_crv_feat = self.curv_encoder(input_crv_feat)
        hidden_srf_feat = self.surf_encoder(input_srf_feat)

        # >>>>> How is backpropagation possible at this point? <<<<<

        # Message pass and compute per-face(node) and global embeddings
        # Per-face embeddings are ignored during solid classification
        _, graph_emb = self.graph_encoder(
            batched_graph, hidden_srf_feat, hidden_crv_feat
        )
        # Map to logits
        out = self.clf(graph_emb)
        return out
```

Thank you very much in advance, and apologies if this is not the right place to post this.

pradeep-pyro commented 2 years ago

Hi Danel,

The entire network is trained end-to-end, so the geometric and topological encoders are all definitely supposed to work as one. In the piece of code you pointed out, the `hidden_crv_feat` and `hidden_srf_feat` tensors are computed as outputs of the convolutional layers `self.curv_encoder` and `self.surf_encoder` and passed as input node and edge embeddings (2nd and 3rd arguments) to the graph encoder `self.graph_encoder`. So they are linked together in the computation graph and enable end-to-end backpropagation.

The idea is to learn both geometric and topological features in a way that benefits the downstream task the most (classification in this case). Let me know if you need any further clarifications.
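To make the link in the computation graph concrete, here is a minimal sketch, assuming plain PyTorch. `ToyClassifier`, its layer sizes, the 10x10 grid resolution, and the mean pooling over faces are all hypothetical stand-ins chosen for illustration. It is not the UV-Net architecture, and it uses no DGL message passing. It only shows that when the output of a convolutional encoder is fed into a later module, a single `backward()` call on the classification loss fills in gradients for the encoder weights, the intermediate embeddings, and the input grids.

```python
import torch
from torch import nn

torch.manual_seed(0)


class ToyClassifier(nn.Module):
    """Hypothetical stand-in for illustration; not the UV-Net model."""

    def __init__(self, num_classes=3, emb_dim=8):
        super().__init__()
        # Stand-in for the surface encoder: 2D convolution over face grids
        self.surf_encoder = nn.Sequential(
            nn.Conv2d(7, emb_dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # Stand-in for the graph encoder (a linear layer, no message passing)
        self.graph_encoder = nn.Linear(emb_dim, emb_dim)
        self.clf = nn.Linear(emb_dim, num_classes)

    def forward(self, face_grids):
        hidden = self.surf_encoder(face_grids)  # (num_faces, emb_dim)
        # Pool per-face features into a single embedding for the solid
        graph_emb = self.graph_encoder(hidden).mean(dim=0, keepdim=True)
        return self.clf(graph_emb), hidden


model = ToyClassifier()
# 5 faces, 7 channels, 10x10 samples per face (sizes are assumptions)
face_grids = torch.randn(5, 7, 10, 10, requires_grad=True)

logits, hidden = model(face_grids)
hidden.retain_grad()  # keep the gradient on this non-leaf tensor

loss = nn.functional.cross_entropy(logits, torch.tensor([1]))
loss.backward()  # one call walks the whole graph back to the inputs

conv_grad = model.surf_encoder[0].weight.grad  # gradient on conv weights
saliency = face_grids.grad.abs().sum(dim=1)    # per-sample input sensitivity
print(conv_grad.shape, hidden.grad.shape, saliency.shape)
```

For explainability work, the same pattern would presumably carry over to the real model: mark the input grids as requiring gradients, or call `retain_grad()` on the hidden features, before calling `backward()` on the logit of interest. Treat that as an untested suggestion rather than something confirmed against the UV-Net code.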

pradeep-pyro commented 2 years ago

Closing this issue. Feel free to reopen if you have further questions.