Perhaps this is because I come from a TensorFlow/Keras background and I'm not familiar with PyTorch, but the main entry points for training and inference are not clear to me. I've looked at EXAMPLE.ipynb, and the 9th cell has this code:
for batch in data_loader:
    adjacency_matrix, node_features, distance_matrix, y = batch
    batch_mask = torch.sum(torch.abs(node_features), dim=-1) != 0
    output = model(node_features, batch_mask, adjacency_matrix, distance_matrix, None)
    ...
What's supposed to go in the dots? For training: I wasn't able to find any obvious (to me) optimizer in the other scripts in the repo. For inference: should I pass the output to the to_predict method of the GraphTransformer class? Does to_predict return an Nx1 array of point estimates of (for example) logS solubility, where N is the number of molecules in the batch?
Is there a lot of missing code in the ... part, or is it trivial enough (fewer than 10 or so lines) that you could paste it here? It doesn't have to be anything fancy; maybe just enough to retrain a model on one of your .csv datasets and run inference on the same dataset.
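To make the question concrete, here is my best guess at what might go in the dots. This is only a sketch: the optimizer, learning rate, loss function, and num_epochs are my own choices, since I couldn't find them anywhere in the repo, and I'm assuming y is a scalar regression target (e.g. logS).

import torch
import torch.nn.functional as F

num_epochs = 100  # arbitrary; I don't know what you used
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # my guess, not from the repo

model.train()
for epoch in range(num_epochs):
    for batch in data_loader:
        adjacency_matrix, node_features, distance_matrix, y = batch
        batch_mask = torch.sum(torch.abs(node_features), dim=-1) != 0
        output = model(node_features, batch_mask, adjacency_matrix, distance_matrix, None)
        loss = F.mse_loss(output.view(-1), y.view(-1))  # assuming MSE on a scalar target
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

And for inference I'd expect something like the following, calling the model's forward pass directly since I don't know the signature of to_predict:

model.eval()
predictions = []
with torch.no_grad():
    for batch in data_loader:
        adjacency_matrix, node_features, distance_matrix, y = batch
        batch_mask = torch.sum(torch.abs(node_features), dim=-1) != 0
        output = model(node_features, batch_mask, adjacency_matrix, distance_matrix, None)
        predictions.append(output)
predictions = torch.cat(predictions, dim=0)  # expecting shape (N, 1)?

Is that roughly right, or am I missing something important?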
Other repos I've come across make the main entry point really clear. Here is an example from another repo:
main.Main(data=sol_data,            # provided data (SMILES, property)
          data_name=data_name,      # dataset's name
          data_units='',            # property's SI units
          bayopt_bounds=bounds,     # bounds constraining the Bayesian search of neural architectures
          k_fold_number=10,         # number of k-folds used for cross-validation
          augmentation=True,        # SMILES augmentation
          outdir="./data/",         # directory for outputs (plots + .txt files)
          bayopt_n_epochs=10,       # number of epochs for training during Bayesian search
          bayopt_n_rounds=25,       # number of architectures to sample during Bayesian search
          bayopt_on=True,           # use Bayesian search
          n_gpus=1,                 # number of GPUs to be used
          patience=25,              # number of epochs with no improvement after which training stops
          n_epochs=100)             # maximum number of epochs for training
Maybe that's a bit too formal, but when I pasted it into a separate notebook and ran it on an AWS GPU, it ran without issues. Is there a complete example like that for this repo?