[Feature request]: Improve interfacing between encoders and output heads

Feature/behavior summary

Currently, two things make configuring output heads unpredictable:

1. It is difficult to match the embedding dimensionality to the input layer of the output head.
2. Lazy layers are used inconsistently in the MLP blocks. This was partly meant to resolve (1), but it has added engineering complexity (lazy output heads are created just in time), which makes the code hard to maintain and sometimes makes distributed training difficult to start.

Request attributes

[X] Would this be a refactor of existing code?
[ ] Does this proposal require new package dependencies?
[X] Would this change break backwards compatibility?
[ ] Does this proposal include a new model?
[ ] Does this proposal include a new dataset?
[ ] Does this proposal include a new task/workflow?

Related issues

No response

Solution description

I don't really have a perfect solution, but my suggestions are:

1. Remove the option to use lazy modules. This will break examples and probably some tests, but it means there is only one way to initialize models, which makes maintenance easier and setup less ambiguous.
2. In the abstract models (e.g. AbstractPyGModel), add an abstract property such as encoder_output_dim to make output heads easier to create. The property would be used to calculate the input dimension of the output head, so the only configuration an output head needs is hidden_dim and possibly output_dim.

For the second item, it could be something as simple as:

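from abc import ABC, abstractmethod

class AbstractEncoder(ABC):

    @property
    @abstractmethod
    def encoder_output_dim(self) -> int:
        # Concrete encoders return the size of the embedding they produce
        raise NotImplementedError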

In a concrete class, it might return the dimension of the final layer or, for more complicated cases (e.g. concatenated tensors), provide the arithmetic to calculate the expected output dimension. We can then refactor OutputHead to rely on this property for the input dimension.
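To make the idea concrete, here is a rough sketch of how the property could drive the output head's layer sizes. Everything except the name encoder_output_dim is hypothetical: SimpleEncoder, ConcatEncoder, and head_layer_shapes are illustrative stand-ins, not existing matsciml classes, and the sketch uses plain Python so it runs without PyTorch. In a real refactor, each (in, out) pair would presumably be passed to a regular (non-lazy) linear layer inside OutputHead.

```python
from abc import ABC, abstractmethod


class AbstractEncoder(ABC):
    """Hypothetical base class; stands in for e.g. AbstractPyGModel."""

    @property
    @abstractmethod
    def encoder_output_dim(self) -> int:
        """Size of the embedding that the encoder hands to output heads."""


class SimpleEncoder(AbstractEncoder):
    """Hypothetical encoder whose embedding is the output of its final layer."""

    def __init__(self, final_layer_dim: int) -> None:
        self.final_layer_dim = final_layer_dim

    @property
    def encoder_output_dim(self) -> int:
        return self.final_layer_dim


class ConcatEncoder(AbstractEncoder):
    """Hypothetical encoder that concatenates two feature tensors."""

    def __init__(self, node_dim: int, graph_dim: int) -> None:
        self.node_dim = node_dim
        self.graph_dim = graph_dim

    @property
    def encoder_output_dim(self) -> int:
        # Concatenation along the feature axis adds the widths together.
        return self.node_dim + self.graph_dim


def head_layer_shapes(encoder, hidden_dim, output_dim, num_hidden=2):
    """Return the (in_features, out_features) pair for each head layer.

    The input width comes from the encoder, so the caller only
    configures hidden_dim, output_dim, and the number of hidden layers.
    """
    dims = [encoder.encoder_output_dim] + [hidden_dim] * num_hidden + [output_dim]
    return list(zip(dims[:-1], dims[1:]))


if __name__ == "__main__":
    print(head_layer_shapes(SimpleEncoder(128), hidden_dim=64, output_dim=1))
    print(head_layer_shapes(ConcatEncoder(128, 32), hidden_dim=64, output_dim=3))
```

Because the property is abstract, an encoder that forgets to define it fails at instantiation rather than at the first forward pass, which is the main behavioral difference from the lazy approach.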

Additional notes

No response
