RasmusOrsoe opened 7 months ago
I have now taken a look at the paper and the code. I cannot, with my somewhat limited understanding of the method, find any faults with the code.

I have a question regarding `coordinate_columns` and `jacobian_columns`. The way I read the code, these have to be determined by the user when defining the task. How is a user supposed to determine these values? Are they partly or fully determined by the dimensions of the input data, or are they hyper-parameters of the network?
Hey @Aske-Rosted, thanks for taking a look. Let me clarify the role of these two sets of indices. A model architecture, such as `DynEdge` or `INGA`, has to output a single tensor in GraphNeT. This tensor is passed directly to the `Task`. Because `NormalizingFlow`s output both a transformed sample `y` and the jacobian associated with the transformation, `jacobian`, the output of a `NormalizingFlow` is defined as `output = torch.concat([y, jacobian], dim=1)`.

When `output` is passed to the `Task`, it needs to be split back into its `y` and `jacobian` components in order to evaluate the loss function. These two sets of indices allow you to split the tensor correctly in the `Task`.
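As a minimal sketch of that split (the shapes and index values here are hypothetical, chosen for illustration rather than taken from GraphNeT):

```python
import torch

# Hypothetical example: 5 events, 3 transformed coordinates + 3 jacobian terms.
y = torch.randn(5, 3)
jacobian = torch.randn(5, 3)

# What the flow emits as a single tensor:
output = torch.cat([y, jacobian], dim=1)  # shape (5, 6)

# Index sets the flow would expose as attributes (values assumed for illustration):
coordinate_columns = [0, 1, 2]
jacobian_columns = [3, 4, 5]

# How the Task recovers the two components before evaluating the loss:
y_recovered = output[:, coordinate_columns]
jacobian_recovered = output[:, jacobian_columns]

assert torch.equal(y_recovered, y)
assert torch.equal(jacobian_recovered, jacobian)
```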
The two sets of indices are defined by the `NormalizingFlow` itself: by construction, a `NormalizingFlow` in GraphNeT has these two sets of indices as class attributes that you can expect to exist. In the training example for `INGA`, you can see them being used directly when defining the task:
```python
# Building model
flow = INGA(
    nb_inputs=graph_definition.nb_outputs,
    n_knots=120,
    num_blocks=4,
    b=100,
    c=100,
)
task = StandardFlowTask(
    target_labels=graph_definition.output_feature_names,
    loss_function=MultivariateGaussianFlowLoss(),
    coordinate_columns=flow.coordinate_columns,
    jacobian_columns=flow.jacobian_columns,
)
```
Let me know if this clears things up.
@AMHermansen, @Aske-Rosted mentioned at our last dev meeting that he would prefer a third pair of eyes on this PR. If at all possible, could you set aside some time to give it a read?
Things to look for include:
I'm happy to explain any part of this PR in greater detail, either here or in a dedicated zoom call.
@ArturoLlorente @Aske-Rosted thank you for the comments! I have now updated the PR to reflect your feedback.
There are some CodeClimate issues that I think are easy to solve (mainly syntax ones). Other than that, the PR looks good in my eyes. Nice implementation!
This PR adds compatibility for normalizing flows to GraphNeT. Specifically, this PR does the following:

- Adds `NormalizingFlow` in `graphnet.models.flows` as a new generic model class which all normalizing flows are intended to inherit from. These models are assumed to output a single tensor containing both the transformed coordinates and the jacobian associated with the transformation. For book-keeping, these models have the properties `coordinate_columns` and `jacobian_columns`, which identify which columns of the output are transformed coordinates and which are jacobians. These properties are used to split the tensor for loss computation (see diagram).
- Adds `INGA`, which is similar to what is presented in https://arxiv.org/abs/1906.04032. It relies on splines; the spline functions currently live in `graphnet.models.flows.spline_blocks` and can be used both inside and outside the context of `NormalizingFlow`s (although they are likely most relevant there).
- Adds a training example for `INGA` under `examples/04_training/05_train_normalizing_flow.py`, which shows the syntax for training a normalizing flow. The example trains `INGA` to learn a mapping from the rather complicated distribution of the pulse attributes (xyz, time, charge, etc.) into a multivariate Gaussian distribution.
- `LossFunction` no longer has an abstract `_forward` method; implementing it is left to problem-specific subclasses. This is introduced here because not all machine learning paradigms follow our usual `prediction`, `target` argument structure, which is indeed the case for normalizing flows.
- Adds a `LossFunction` subclass called `StandardLossFunction`, which all of our current loss functions now inherit from. These loss functions all follow the `prediction`, `target` argument structure.
- Adds a `LossFunction` subclass `FlowLossFunction`, which follows the `prediction`, `jacobian` argument structure.
- Adds `MultivariateGaussianFlowLoss`, a subclass of `FlowLossFunction`, which calculates the negative log-likelihood of a multivariate Gaussian.
- Changes `StandardModel` to tie everything together. While working on this PR I noticed a few list comprehensions were hard to read, so I took the liberty of writing them out explicitly and adding a few comments here and there.

A diagram showing the new inheritance structure for loss functions is shown here:

A diagram showing the overall data flow and usage of the column indices for normalizing flows can be seen here:
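For intuition, a flow loss of this kind typically follows the change-of-variables formula, `log p(x) = log N(y; 0, I) + log|det J|`. Below is a rough sketch of that negative log-likelihood; this is an assumed form for illustration, not the actual `MultivariateGaussianFlowLoss` implementation, and it assumes the jacobian column carries `log|det J|` per event:

```python
import math

import torch


def gaussian_flow_nllh(y: torch.Tensor, log_det_jacobian: torch.Tensor) -> torch.Tensor:
    """Sketch of a standard-Gaussian flow loss (assumed form, for illustration).

    Per event: nllh = 0.5 * ||y||^2 + 0.5 * d * log(2*pi) - log|det J|.
    """
    d = y.shape[1]
    log_normal = -0.5 * (y**2).sum(dim=1) - 0.5 * d * math.log(2 * math.pi)
    return -(log_normal + log_det_jacobian).mean()


# With y = 0 and log|det J| = 0, the loss reduces to the Gaussian normalization
# constant, 0.5 * d * log(2*pi):
loss = gaussian_flow_nllh(torch.zeros(2, 3), torch.zeros(2))
```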
After Christmas, I'll make a new PR that adds the ability to use `NormalizingFlow`s in a generative sense, along with an example of how this is done. I decided not to include it here to avoid making this PR larger than it already is.