ardigen / MAT

The official implementation of the Molecule Attention Transformer.

Provide a more complete example for training and inference? #15

Open nmra-rzotti opened 4 years ago

nmra-rzotti commented 4 years ago

Perhaps this is because I come from a TensorFlow/Keras background and am not familiar with PyTorch, but the main entry point for training and inference is not clear to me. I've looked at EXAMPLE.ipynb, and the 9th cell has this code:

for batch in data_loader:
    adjacency_matrix, node_features, distance_matrix, y = batch
    batch_mask = torch.sum(torch.abs(node_features), dim=-1) != 0
    output = model(node_features, batch_mask, adjacency_matrix, distance_matrix, None)
    ...

What's supposed to go in the dots? For training: I wasn't able to find any obvious (to me) optimizer in the other scripts in the repo. For inference: should I pass the output to the to_predict method of the GraphTransformer class? Does to_predict return an Nx1 array of point estimates of (for example) logS solubility, where N is the number of molecules in the batch?

Is there a lot of missing code in the ... part or is it trivial enough (fewer than 10 or so lines) that you could paste it here? Doesn't have to be anything fancy, maybe just to retrain a model on one of your .csv datasets and perform inference on the same dataset.
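For concreteness, here is my rough guess at what the dots expand to, assuming the forward call above already returns the property prediction (e.g. an Nx1 tensor of logS values) and the task is plain regression; the loss, optimizer, and hyperparameters below are my own placeholders, not something taken from this repo:

import torch
import torch.nn.functional as F

# Assumption: `model` and `data_loader` are built as in EXAMPLE.ipynb and the
# forward pass returns the final property prediction, so a standard regression
# loop is all that's missing. Adam + MSE are arbitrary choices on my part.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

model.train()
for epoch in range(100):  # number of epochs is a placeholder
    for batch in data_loader:
        adjacency_matrix, node_features, distance_matrix, y = batch
        batch_mask = torch.sum(torch.abs(node_features), dim=-1) != 0

        optimizer.zero_grad()
        output = model(node_features, batch_mask, adjacency_matrix, distance_matrix, None)
        loss = F.mse_loss(output, y.view_as(output))  # reshape target to match output
        loss.backward()
        optimizer.step()

# Inference: same forward pass, without gradients.
model.eval()
predictions = []
with torch.no_grad():
    for batch in data_loader:
        adjacency_matrix, node_features, distance_matrix, y = batch
        batch_mask = torch.sum(torch.abs(node_features), dim=-1) != 0
        output = model(node_features, batch_mask, adjacency_matrix, distance_matrix, None)
        predictions.append(output)
predictions = torch.cat(predictions, dim=0)  # one prediction per molecule

Is that roughly what you had in mind, or is there more to it (learning-rate schedule, the to_predict method, etc.)?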

Other repos I've come across make the main entry point really clear. Here is an example from another repo:

main.Main(data=sol_data,        # provided data (SMILES, property)
          data_name=data_name,  # dataset's name
          data_units='',        # property's SI units
          bayopt_bounds=bounds, # bounds constraining the Bayesian search of neural architectures
          k_fold_number = 10,   # number of k-folds used for cross-validation
          augmentation = True,  # SMILES augmentation
          outdir = "./data/",  # directory for outputs (plots + .txt files)
          bayopt_n_epochs = 10, # number of epochs for training during Bayesian search
          bayopt_n_rounds = 25, # number of architectures to sample during Bayesian search 
          bayopt_on = True,     # use Bayesian search
          n_gpus = 1,           # number of GPUs to be used
          patience = 25,        # number of epochs with no improvement after which training will be stopped
          n_epochs = 100)       # maximum number of epochs for training

Maybe that's a bit too formal, but when I pasted it into a separate notebook and ran it on an AWS GPU, it worked without issues. Is there a complete example like that for this repo?

SK124 commented 3 years ago

Hi @btrx-rzotti, if you figured out the replacement for the three dots, kindly share the code :) Regards.