pyt-team / TopoModelX

Unify duplicate implementation of CAN #158

Closed · ninamiolane closed this issue 1 year ago

ninamiolane commented 1 year ago

What?

There are currently two implementations of CAN, contributed by challenge participants.

If they are indeed the same implementation, we need to merge them into 3 files:

If they are different implementations, we need to check if one of them can be deleted. Possible reasons to delete:

If we decide to keep both of them, the file names and docstrings should be updated to highlight their difference.

Why?

Because it is confusing to have two implementations of CAN without knowing their difference.

Where?

The files to modify are:

How?

See, for example, how the duplication between SCN and SCN2 was resolved, and the use of the "See Also" sections in their docstrings.
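For reference, here is a minimal sketch of that numpydoc "See Also" pattern; the class name and description are hypothetical placeholders, not the actual TopoModelX docstrings:

```python
class CANLayerVariant:
    """One variant of the CAN layer (hypothetical name).

    See Also
    --------
    CANLayer :
        The other implementation. The description should state exactly
        how the two variants differ, e.g. in attention normalization.
    """
```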

Additionally:

  1. Rename the Python class ending in `_v2` in `can_layer.py`, because:

    • `v2` is not explanatory: it hinders readability.
    • Python class names should be in CamelCase, without underscores.
  2. Address the comments left by the challenge participants in `can_layer_bis.py`, copied again below:

```python
# Some notes:
# - The attention function provided for us does not normalize the attention
#   coefficients. Should this be done?
# - Where should we be able to customize the non-linearities? Seems important
#   for the output. What about the attention non-linearities: do we just use
#   what is given?
# - I wanted to make this so that without attention it ends up being the
#   Hodge Laplacian network. Maybe ask the contest organizers about this?
```
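For context on the last comment: "the Hodge Laplacian network" presumably (my reading, not something stated in the thread) means plain message passing with the k-th Hodge Laplacian, which the layer would reduce to when attention is disabled. A minimal sketch of that operator, with illustrative names:

```python
import torch

# Hedged sketch: the k-th Hodge Laplacian L_k = B_k^T B_k + B_{k+1} B_{k+1}^T,
# where B_k is the boundary matrix mapping k-cells to (k-1)-cells.
def hodge_laplacian(B_k: torch.Tensor, B_kp1: torch.Tensor) -> torch.Tensor:
    return B_k.T @ B_k + B_kp1 @ B_kp1.T


def non_attentional_layer(x, B_k, B_kp1, weight):
    # With attention disabled, the layer is plain message passing with L_k.
    return torch.relu(hodge_laplacian(B_k, B_kp1) @ x @ weight)
```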
mhajij commented 1 year ago

@ninamiolane

The following note was left in the notebook:

> Notes:
>
> The attention function provided for us does not normalize the attention coefficients. Should this be done?
>
> Where should we be able to customize the non-linearities? Seems important for the output. What about the attention non-linearities: do we just use what is given?
>
> I wanted to make this so that without attention it ends up being the Hodge Laplacian network. Maybe ask the contest organizers about this?

I agree with the above. "Attention" in the base layer is used in a different sense than is typical in the literature:

  • In the literature: a learnable coefficient that multiplies the original vector during training, with the coefficients summing to 1 over a given neighborhood of x.
  • In the TopoModelX base layer: a coefficient of the neighborhood matrix, which can be +1 or -1 depending on the orientation.

In general, the layer is not faithful to the CAN implementation, so I recommend deleting it from the package.
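To make the distinction concrete, here is a minimal sketch, in PyTorch, of the two senses of "attention" described above; the function and argument names are illustrative, not TopoModelX's actual API:

```python
import torch

def literature_attention(scores, neighborhood):
    """Literature sense: learned coefficients that sum to 1 per neighborhood."""
    # Mask out non-neighbors, then softmax each row so it sums to 1.
    masked = scores.masked_fill(neighborhood == 0, float("-inf"))
    return torch.softmax(masked, dim=-1)

def base_layer_coefficients(neighborhood):
    """Base-layer sense: raw +1/-1 entries of the oriented neighborhood matrix."""
    return torch.sign(neighborhood)

# Toy oriented neighborhood matrix over three cells.
N = torch.tensor([[0.0, 1.0, -1.0],
                  [1.0, 0.0, 1.0],
                  [-1.0, 1.0, 0.0]])
scores = torch.randn(3, 3)
print(literature_attention(scores, N).sum(dim=-1))  # each row sums to 1
print(base_layer_coefficients(N))                   # signed, unnormalized
```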

ninamiolane commented 1 year ago

Sounds good. The layer that is not faithful to the CAN implementation can be deleted.

Can you open a PR that does this? Once that PR is merged and the corresponding CAN code is removed, we can mark this task as done.

ninamiolane commented 1 year ago

Completed through PR #192 and PR #182.