graphnet-team / graphnet

A Deep learning library for neutrino telescopes
https://graphnet-team.github.io/graphnet/
Apache License 2.0

NormalizingFlow #649

Open RasmusOrsoe opened 7 months ago

RasmusOrsoe commented 7 months ago

This PR adds compatibility for normalizing flows to graphnet. Specifically, this PR does the following:

  1. Adds NormalizingFlow in graphnet.models.flows as a new generic model class from which all normalizing flows are intended to inherit. These models are assumed to output a single tensor containing the transformed coordinates and the jacobian associated with the transformation. For book-keeping, these models have properties coordinate_columns and jacobian_columns, which identify the column indices of the transformed coordinates and the jacobian, respectively. These properties are used to split the tensor for loss computation (see diagram).
  2. Adds a normalizing flow model called INGA, which is similar to what is presented in https://arxiv.org/abs/1906.04032 . It relies on splines; the spline functions currently live in graphnet.models.flows.spline_blocks and can be used both inside and outside the context of NormalizingFlows (although they are likely most relevant there).
  3. Adds an example of training INGA under examples/04_training/05_train_normalizing_flow.py, which shows the syntax for training a normalizing flow. The example trains INGA to learn a mapping from the rather complicated distribution of the pulse attributes (xyz, time, charge, etc.) into a multivariate Gaussian distribution.
  4. Introduces slight changes to the inheritance structure of our loss functions: the generic base class LossFunction no longer has an abstract _forward method; this is left to problem-specific subclasses. This change is needed because not all machine learning paradigms follow our usual (prediction, target) argument structure, which is indeed the case for normalizing flows.
  5. Created a new LossFunction subclass called StandardLossFunction, from which all of our current loss functions now inherit. These loss functions all follow the (prediction, target) argument structure.
  6. Created a new LossFunction subclass FlowLossFunction, which follows the (prediction, jacobian) argument structure.
  7. Created MultivariateGaussianFlowLoss, a subclass of FlowLossFunction, which calculates the negative log-likelihood of a multivariate Gaussian.
  8. Made slight changes to StandardModel to tie everything together. While working on this PR, I noticed a few list comprehensions were hard to read, so I took the liberty of writing them out explicitly and adding a few comments here and there.
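To make the output convention in points 1 and 2 concrete, here is a minimal sketch of what such a base class could look like. This is an illustration, not the actual graphnet implementation: the constructor signature, the nn.Module base, and the _transform hook are assumptions; only NormalizingFlow, coordinate_columns, jacobian_columns, and the concatenated output are taken from the description above.

```python
import torch
from torch import nn


class NormalizingFlow(nn.Module):
    """Sketch of a generic flow base class (hypothetical details).

    Subclasses are assumed to output a single tensor that concatenates
    the transformed coordinates y and the associated jacobian terms.
    """

    def __init__(self, nb_inputs: int) -> None:
        super().__init__()
        self._nb_inputs = nb_inputs

    @property
    def coordinate_columns(self) -> list:
        # Columns [0, nb_inputs) hold the transformed coordinates y.
        return list(range(self._nb_inputs))

    @property
    def jacobian_columns(self) -> list:
        # Columns [nb_inputs, 2 * nb_inputs) hold the jacobian terms.
        return list(range(self._nb_inputs, 2 * self._nb_inputs))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # _transform is left to concrete subclasses (e.g. spline blocks).
        y, jacobian = self._transform(x)
        return torch.concat([y, jacobian], dim=1)
```

A downstream Task can then use the two column properties to split the single output tensor back into its y and jacobian parts before computing the loss.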

A diagram showing the new inheritance structure for loss functions is shown here:

[diagram: subclasses]

A diagram showing the overall data flow and usage of the column indices for normalizing flows can be seen here:

[diagram: inga.drawio]

After Christmas, I'll make a new PR that adds the ability to use NormalizingFlows in a generative sense, together with an example of how this is done. I decided not to include it here to avoid making this PR larger than it already is.

Aske-Rosted commented 6 months ago

I have now taken a look at the paper and the code. I cannot, with my somewhat limited understanding of the method, find any faults with the code.

I have a question regarding coordinate_columns and jacobian_columns. The way I read the code, these have to be determined by the user when defining the task. How is a user to determine these values? Are they partly or fully determined by the dimensions of the input data, or is this a hyper-parameter of the network?

RasmusOrsoe commented 6 months ago

Hey @Aske-Rosted thanks for taking a look. Let me clarify the role of these two sets of indices. A model architecture, such as DynEdge or INGA, has to output a single tensor in GraphNeT. This tensor is passed directly to the Task. Because NormalizingFlows output both a transformed sample y and the jacobian associated with the transformation, the output of NormalizingFlow is defined as output = torch.concat([y, jacobian], dim=1).

When output is passed to the Task, it needs to be split back into its y and jacobian components in order to evaluate the loss function. These two sets of indices allow the Task to split the tensor correctly.

The two sets of indices are defined by the NormalizingFlow itself: by construction, a NormalizingFlow in GraphNeT exposes these two sets of column indices as class attributes that you can expect to exist. In the training example for INGA, you can see them being used directly when defining the task:

    # Building model
    flow = INGA(
        nb_inputs=graph_definition.nb_outputs,
        n_knots=120,
        num_blocks=4,
        b=100,
        c=100,
    )
    task = StandardFlowTask(
        target_labels=graph_definition.output_feature_names,
        loss_function=MultivariateGaussianFlowLoss(),
        coordinate_columns=flow.coordinate_columns,
        jacobian_columns=flow.jacobian_columns,
    )
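To make the split and the loss evaluation concrete, here is a standalone sketch. This is not the actual StandardFlowTask/MultivariateGaussianFlowLoss code; it assumes the jacobian columns hold per-dimension log-jacobian terms, and the function name is hypothetical. It follows the usual change-of-variables formula, log p_x(x) = log N(y; 0, I) + log|det J|.

```python
import math

import torch


def multivariate_gaussian_flow_nllh(
    output: torch.Tensor,
    coordinate_columns: list,
    jacobian_columns: list,
) -> torch.Tensor:
    """Split the flow output and evaluate a multivariate-Gaussian NLLH.

    The loss is -(log N(y; 0, I) + log|det J|), averaged over the batch.
    """
    # Split the single output tensor using the two sets of column indices.
    y = output[:, coordinate_columns]
    # Assumes each jacobian column holds a per-dimension log-jacobian term.
    log_det_jacobian = output[:, jacobian_columns].sum(dim=1)
    d = y.shape[1]
    # Log-density of a standard multivariate Gaussian evaluated at y.
    log_pz = -0.5 * (y**2).sum(dim=1) - 0.5 * d * math.log(2 * math.pi)
    return -(log_pz + log_det_jacobian).mean()
```

With this convention the Task never needs to know how the flow is built internally; it only needs the two index sets to undo the concatenation.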

Let me know if this clears things up.

RasmusOrsoe commented 5 months ago

@AMHermansen, @Aske-Rosted mentioned at our last dev meeting that he would prefer a third pair of eyes on this PR. If at all possible, could you set aside some time to give it a read?

Things to look for include:

I'm happy to explain any part of this PR in greater detail, either here or in a dedicated zoom call.

RasmusOrsoe commented 4 months ago

@ArturoLlorente @Aske-Rosted thank you for the comments! I have now updated the PR to reflect your feedback.

ArturoLlorente commented 4 months ago

There are some CodeClimate issues that I think are easy to solve (mainly syntax ones). Other than that, the PR looks good in my eyes. Nice implementation!