zhiyil1230 closed this issue 1 year ago.
Some assorted thoughts on GOLI after a few weeks of developing.
One thing that would, in my opinion, be useful is for users to understand the design space of all possible models that can be built with GOLI, and specifically how the config file defines that design space. I think a good way to achieve this is with a diagram that explains it visually: showing how a list of input tensors is processed by the pre_nn, pre_nn_edges, pe_encoders, gnn, post_nn, task heads, and metrics. This would make the connections between these modules clear.
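To make the idea concrete, here is a hypothetical sketch of how a config file might map onto that module pipeline. The section names follow the modules listed above, but the exact keys, nesting, and defaults are assumptions for illustration, not the real schema:

```yaml
architecture:
  pre_nn:            # MLP applied to node features before the GNN
    hidden_dims: [64, 64]
  pre_nn_edges:      # MLP applied to edge features before the GNN
    hidden_dims: [32]
  pe_encoders:       # positional-encoding encoders merged into node features
    encoders: [laplacian_pe]
  gnn:               # the message-passing trunk
    layer_type: gcn
    depth: 4
  post_nn:           # MLP applied to the embedding after the GNN
    hidden_dims: [64]
task_heads:          # one small MLP per task
  homo: {out_dim: 1}
metrics:             # metrics computed per task head
  homo: [mae, pearsonr]
```

A diagram could then be drawn section by section from such a file, top to bottom.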
Random thoughts:
- `FCLayer` over plain `nn.Linear`?
- `pip install -e .` and living inside it

I'm removing the ticket from this sprint, but we can leave it open to collect more feedback in the future.
Related issues:

- General documentation #181
- `FullGraphMultiTaskNetwork`, as a drawing #247 (Usability)
For the other comments:
> Having configs only in YAML means you don't have IDE tooling when you edit them, so it would be nice if it were possible to have a Python object which could be used for configuration
You can pass a dictionary, you just have to create your own `main` file.
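A minimal sketch of what that could look like: build the config as a nested dict in Python and pass it to your own entry point. `main` here is a hypothetical stand-in for whatever function normally consumes the parsed YAML; the YAML loader would produce an identical nested dict, so everything downstream is unchanged.

```python
# Build the same nested structure the YAML file would parse into.
config = {
    "architecture": {"gnn": {"layer_type": "gcn", "depth": 4}},
    "task_heads": {"homo": {"out_dim": 1}},
}

def main(cfg: dict):
    """Hypothetical entry point: consumes the config dict exactly as the
    YAML-driven main would, since yaml.safe_load also returns a dict."""
    gnn_depth = cfg["architecture"]["gnn"]["depth"]
    return gnn_depth
```

Calling `main(config)` then follows the same code path as the YAML route.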
> Baking general good practice like norms and dropout into each layer is a great idea but makes it harder to customise some things (for example, you can't choose between "pre-" and "post-" layer norm)
I don't think it's necessary to have normalization before a layer. In the case of the `MLP` and `FeedForwardNN`, there's the `first_normalization` option.
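To illustrate how a `first_normalization` flag covers the pre-layer placement, here is a toy MLP, not the library's actual class: one optional normalization before the first layer, then per-layer post-norms. The structure is an assumption for illustration only; the flag name comes from the thread.

```python
import torch
import torch.nn as nn

class TinyMLP(nn.Module):
    """Illustrative stand-in for an MLP with a `first_normalization`
    option: normalize once before the first layer (pre-norm placement),
    then use per-layer post-norms for the rest of the stack."""
    def __init__(self, in_dim, hidden_dim, depth, first_normalization=True):
        super().__init__()
        self.first_norm = nn.LayerNorm(in_dim) if first_normalization else nn.Identity()
        layers, dim = [], in_dim
        for _ in range(depth):
            layers += [nn.Linear(dim, hidden_dim), nn.LayerNorm(hidden_dim), nn.ReLU()]
            dim = hidden_dim
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(self.first_norm(x))

x = torch.randn(8, 16)
y = TinyMLP(16, 32, depth=2).forward(x)
```

With `first_normalization=True`, every layer in the stack ends up with a normalization directly in front of it.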
> I'm not convinced there's an advantage to using `FCLayer` over plain `nn.Linear`?
`FCLayer` has many more options: norm, dropout, activation. In some cases, it is used without any of these and simply replicates `Linear`. The main reason it is used is that it defaults to the `mup` initialization instead of the PyTorch one.
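As a rough sketch of that idea (not the real `FCLayer`, whose exact sub-module ordering and mup initialization are omitted here), the value is bundling the common post-linear operations behind one interface:

```python
import torch
import torch.nn as nn

class FCLayerSketch(nn.Module):
    """Rough sketch of what an FCLayer-style block adds over nn.Linear:
    optional normalization, activation, and dropout bundled together.
    (The real class also defaults to mup-style initialization rather
    than PyTorch's; that part is not reproduced here.)"""
    def __init__(self, in_dim, out_dim, activation=nn.ReLU(),
                 normalization=None, dropout=0.0):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.norm = normalization if normalization is not None else nn.Identity()
        self.activation = activation if activation is not None else nn.Identity()
        self.dropout = nn.Dropout(dropout) if dropout > 0 else nn.Identity()

    def forward(self, x):
        return self.dropout(self.activation(self.norm(self.linear(x))))

# With everything switched off, the block reduces to a plain Linear.
plain = FCLayerSketch(16, 8, activation=None)
full = FCLayerSketch(16, 8, normalization=nn.LayerNorm(8), dropout=0.1)
```

The "replicates `Linear`" case is just all options left at `Identity`.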
> some work needs to be done to convert it to a "library" in the sense of it being a set of components you can use on its own
This is already done internally. We have a pipeline in place to package it, and we will do it once we reach a stable version.
> You can pass a dictionary, you just have to create your own `main` file
The problem for me has more to do with discoverability of the features of the library. It's easier to work out what you need to provide for, e.g., a `dataclass` than for a dictionary that needs to have some particular keys.
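A small sketch of the discoverability argument, using a hypothetical typed config (these field names are made up for illustration): the IDE can autocomplete the fields, and a misspelled key fails loudly at construction time rather than silently downstream.

```python
from dataclasses import dataclass, field

@dataclass
class GNNConfig:
    """Hypothetical typed config: fields are discoverable by the IDE,
    and unknown keyword arguments raise a TypeError immediately."""
    layer_type: str = "gcn"
    depth: int = 4
    hidden_dims: list = field(default_factory=lambda: [64, 64])

cfg = GNNConfig(depth=6)
# A raw dict offers no such check: {"dept": 6} would only fail
# much later, wherever the code first looks up the correct key.
```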
> Some work needs to be done to convert it to a "library" in the sense of it being a set of components you can use on its own
Where is this pipeline? Can we try it out and try to find out what might be missing?
> I don't think it's necessary to have normalization before a layer. In the case of the `MLP` and `FeedForwardNN`, there's the `first_normalization` option.
This is the specific paper I was thinking of, but I'll admit I haven't read it in enough detail to know if its ideas are useful in this context: https://arxiv.org/abs/2002.04745
Answers to @callumm-graphcore
> The problem for me has more to do with discoverability of the features of the library. It's easier to work out what you need to provide for e.g. a dataclass than a dictionary that needs to have some particular keys.
I understand your point, but there are many advantages to a YAML / dictionary. For example, with `lightning.Trainer`, the YAML lets you use any feature available within the Trainer; if you used a `dataclass`, you would have to do a signature check against such classes.

Of course, it would be great if we could transition to dataclasses at some point to get IDE suggestions, but it would require lots of work.
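A sketch of why the dict route is flexible here: the trainer section of the config is splatted directly into the constructor, so any new argument the Trainer gains works without touching library code. `FakeTrainer` is a stand-in for `lightning.Trainer`, whose real signature is much larger:

```python
class FakeTrainer:
    """Stand-in for lightning.Trainer: accepts a few known arguments
    and, like the real class, many others we don't enumerate here."""
    def __init__(self, max_epochs=10, precision="32", **kwargs):
        self.max_epochs = max_epochs
        self.precision = precision
        self.extra = kwargs  # anything else the user put in the YAML

# The "trainer" section parsed from YAML is passed through unchanged.
config = {"trainer": {"max_epochs": 50, "precision": "16-mixed"}}
trainer = FakeTrainer(**config["trainer"])
```

With a dataclass-based config, every one of those arguments would have to be mirrored (or introspected) in the config class.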
> Where is this pipeline? Can we try it out and try to find out what might be missing?
@hadim perhaps you can answer this one? He's asking for the pipeline to package Goli into a library on pip / conda.
For the normalization before the layer:

> This is the specific paper I was thinking of, but I'll admit I haven't read it in enough detail to know if its ideas are useful in this context: https://arxiv.org/abs/2002.04745
Since this paper is specific to the Transformer layer, this change only needs to happen inside GPS. But again, we're using a hybrid architecture, so our model is far less sensitive to Transformer stability issues.
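For reference, the distinction the paper draws can be sketched with a toy residual block (a simplification of a real Transformer sublayer, for illustration only): post-LN normalizes after the residual add, as in the original Transformer, while pre-LN (arXiv:2002.04745) normalizes the input of the sublayer inside the residual branch.

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """Toy residual block contrasting the two LayerNorm placements."""
    def __init__(self, dim, pre_ln=True):
        super().__init__()
        self.pre_ln = pre_ln
        self.norm = nn.LayerNorm(dim)
        self.ff = nn.Linear(dim, dim)  # stand-in for the sublayer

    def forward(self, x):
        if self.pre_ln:
            return x + self.ff(self.norm(x))   # pre-LN: norm inside the branch
        return self.norm(x + self.ff(x))       # post-LN: norm after the add

x = torch.randn(2, 8)
y_pre = Block(8, pre_ln=True).forward(x)
y_post = Block(8, pre_ln=False).forward(x)
```

Only the pre-LN variant keeps an identity path from input to output, which is the paper's argument for its better training stability without warmup.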
As long as the repo is pip-compatible, it should be fine for conda packaging as well. I'll try to package the newest version with conda and let you know.
When are you planning to remove dgl? It would be best to not have it for packaging (especially for the open source release since dgl is not on conda forge yet).
@hadim

> When are you planning to remove dgl?

Soon. Definitely before release.

Note that the IPU stuff requires pip; `poptorch` is still not available on conda.
A quick update here: I will close this issue, as it simply regroups many other issues.