Closed jiyoonlim123 closed 2 years ago
Hi @jiyoonlim123. `torch.utils.checkpoint` is not used in ogb_eff/ogbn_proteins. Sorry that I forgot to clean up the unused code snippets. I have just cleaned them up to make the code clearer.
Thanks for the quick reply.
Then `InvertibleModuleWrapper` and `InvertibleCheckpointFunction` are used in the current code. These two modules store the output of each layer, so they would store O(L) outputs in total.
However, if we use reversible connections, we can recompute the inputs of each layer from the outputs of the last layer, so only O(1) outputs need to be kept.
Am I misunderstanding the code's behavior, or is there a reason to store the output of every layer?
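For reference, the reversible-connection idea can be sketched with a generic additive coupling block. This is my own toy illustration, not the repo's RevConv implementation; the names `AdditiveReversibleBlock`, `f`, and `g` are made up for this example:

```python
import torch
import torch.nn as nn

class AdditiveReversibleBlock(nn.Module):
    """Generic additive coupling (RevNet-style): the inputs can be
    reconstructed exactly from the outputs, so intermediate layer
    outputs need not be stored during the forward pass."""

    def __init__(self, dim):
        super().__init__()
        # f and g stand in for arbitrary sub-networks (e.g. GNN layers)
        self.f = nn.Linear(dim, dim)
        self.g = nn.Linear(dim, dim)

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):
        # Recover the inputs from the outputs alone -- no stored activations.
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2

block = AdditiveReversibleBlock(4)
x1, x2 = torch.randn(2, 4), torch.randn(2, 4)
with torch.no_grad():
    y1, y2 = block(x1, x2)
    r1, r2 = block.inverse(y1, y2)
print(torch.allclose(r1, x1, atol=1e-5), torch.allclose(r2, x2, atol=1e-5))
```

Because `inverse` reconstructs `(x1, x2)` from `(y1, y2)`, a stack of such blocks only needs the final output in memory, which is where the O(1) claim comes from.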
The memory for the node features in every layer is freed: https://github.com/lightaime/deep_gcns_torch/blob/1b840faed83363098587eacac111a6317a927195/eff_gcn_modules/rev/gcn_revop.py#L58. The other inputs (the adjacency and edge features) are saved, but they do not change across layers.
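To illustrate the storage-freeing idea, here is a minimal toy example (my own, not the code in `gcn_revop.py`; it assumes PyTorch 2.x for `untyped_storage()`): once a layer's input can be reconstructed from its output, the buffer backing the input tensor can be released right after the forward pass.

```python
import torch

# Toy sketch of freeing a tensor's underlying storage, as an invertible
# checkpoint can do for node features after the forward pass.
x = torch.randn(1000, 256)           # float32 node features: 1000*256*4 bytes
x.untyped_storage().resize_(0)       # release the underlying buffer
print(x.untyped_storage().nbytes())  # 0 -- the features no longer occupy memory
# During backward, the module would resize the storage back and refill it
# with values recomputed (via the inverse) from the next layer's output.
```

The tensor object itself survives with zero-byte storage, which is why the backward pass can later restore it in place.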
I understand, but what I'm asking about is the output of the layers. Don't we only need the output of the last layer, not the output of every layer? Since every layer stores its output, the model's memory consumption would be O(L).
Hi @jiyoonlim123. Do you see the memory consumption increase linearly as you increase the number of layers?
According to 'Training Graph Neural Networks with 1000 Layers', checkpointing consumes more memory than RevConv. However, there is an invertible checkpoint in your code.
Is it correct that the invertible checkpoint does the work of `torch.utils.checkpoint` so that it is compatible with RevConv? If so, is there any reason for adding this feature, given that checkpointing consumes more memory than reversible connections?
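For contrast, plain activation checkpointing with `torch.utils.checkpoint` can be sketched as below (a generic example, not the repo's invertible variant): the input to every checkpointed layer is still saved, so memory grows with depth, whereas a reversible connection keeps only the last layer's output.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

layers = nn.ModuleList([nn.Linear(64, 64) for _ in range(8)])
x = torch.randn(32, 64, requires_grad=True)

out = x
for layer in layers:
    # Activations inside each layer are recomputed during backward,
    # but `out` (each layer's input) is retained for all 8 layers,
    # so memory is still O(L) in the number of layers.
    out = checkpoint(layer, out, use_reentrant=False)

out.sum().backward()
print(x.grad.shape)  # torch.Size([32, 64])
```

An invertible checkpoint combines both ideas: it recomputes like `torch.utils.checkpoint`, but reconstructs each layer's input via the inverse instead of saving it, which is how it reaches O(1) activation memory.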