Closed DengFeng-Zuo closed 1 month ago
I kept modifying the code, adjusting some parameters in train_model() to fit the multi-component task, but I still encounter errors.
from lightning import pytorch as pl
from ray.train.lightning import (
    RayDDPStrategy, RayLightningEnvironment, RayTrainReportCallback, prepare_trainer,
)
from chemprop import data, nn
from chemprop.models import multi

def train_model(config, train_dset, val_dset, num_workers, scaler):
    # config is a dictionary containing hyperparameters used for the trial
    depth = int(config["depth"])
    ffn_hidden_dim = int(config["ffn_hidden_dim"])
    ffn_num_layers = int(config["ffn_num_layers"])
    message_hidden_dim = int(config["message_hidden_dim"])

    train_loader = data.build_dataloader(train_dset, num_workers=num_workers, shuffle=True)
    val_loader = data.build_dataloader(val_dset, num_workers=num_workers, shuffle=False)

    # modified: one message-passing block per SMILES column
    # (smiles_columns is defined in the data-loading code, outside this function)
    # mp = nn.BondMessagePassing(d_h=message_hidden_dim, depth=depth)
    mcmp = nn.MulticomponentMessagePassing(
        blocks=[
            # pass the sampled hyperparameters so each trial actually differs
            nn.BondMessagePassing(d_h=message_hidden_dim, depth=depth)
            for _ in range(len(smiles_columns))
        ],
        n_components=len(smiles_columns),
    )
    agg = nn.MeanAggregation()
    output_transform = nn.UnscaleTransform.from_standard_scaler(scaler)

    # modified: the FFN input is the concatenated output of all blocks
    # ffn = nn.RegressionFFN(output_transform=output_transform, input_dim=message_hidden_dim, hidden_dim=ffn_hidden_dim, n_layers=ffn_num_layers)
    ffn = nn.RegressionFFN(
        input_dim=mcmp.output_dim,
        hidden_dim=ffn_hidden_dim,
        n_layers=ffn_num_layers,
        output_transform=output_transform,
    )
    batch_norm = True
    metric_list = [nn.metrics.RMSEMetric(), nn.metrics.MAEMetric()]

    # modified
    # model = models.MPNN(mp, agg, ffn, batch_norm, metric_list)
    mcmpnn = multi.MulticomponentMPNN(mcmp, agg, ffn, batch_norm=batch_norm, metrics=metric_list)

    trainer = pl.Trainer(
        accelerator="auto",
        devices=1,
        max_epochs=20,  # number of epochs to train for
        # below are needed for Ray and Lightning integration
        strategy=RayDDPStrategy(),
        callbacks=[RayTrainReportCallback()],
        plugins=[RayLightningEnvironment()],
    )
    trainer = prepare_trainer(trainer)
    trainer.fit(mcmpnn, train_loader, val_loader)
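The int(...) casts at the top of train_model() are worth keeping: depending on the search space and search algorithm, sampled values may arrive as floats (3.0 rather than 3), while the layer constructors expect integers. A minimal sketch of that step, using a hypothetical helper parse_config and made-up sample values (neither is part of chemprop or Ray):

```python
# Hypothetical helper: mirrors the int(...) casts in train_model().
INT_KEYS = ("depth", "ffn_hidden_dim", "ffn_num_layers", "message_hidden_dim")


def parse_config(config):
    """Return the integer hyperparameters from a sampled trial config."""
    return {key: int(config[key]) for key in INT_KEYS}


# Made-up sample, shaped like what a search algorithm might hand over.
sampled = {
    "depth": 3.0,
    "ffn_hidden_dim": 300.0,
    "ffn_num_layers": 2.0,
    "message_hidden_dim": 400.0,
}
params = parse_config(sampled)
print(params)
```

A missing key raises KeyError here, which surfaces a mismatch between the search space and train_model() early in the trial.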
Error
The error message suggests that the results from each trial don't include val_loss. This issue is likely related to how you set up the Ray trainer.
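To illustrate the kind of mismatch described here: the metric name given to the tuner has to appear among the keys each trial reports. The sketch below is not Ray's implementation; trials_missing_metric and the reported values are hypothetical stand-ins that only show the check conceptually.

```python
def trials_missing_metric(trial_results, metric="val_loss"):
    """Return the indices of trials whose reported metrics lack `metric`."""
    return [i for i, reported in enumerate(trial_results) if metric not in reported]


# Made-up reported metrics from two trials.
reported = [
    {"train_loss": 0.8, "val_loss": 0.9},
    {"train_loss": 0.7, "val/rmse": 1.1},  # no "val_loss" key
]
missing = trials_missing_metric(reported)
print(missing)
```

If any trial shows up in such a list, the place to look is what the training loop logs and what the report callback forwards, rather than the search algorithm itself.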
Thank you for your suggestions. After studying the documentation for the Ray Tune library, I modified the code and successfully ran the multi-component hyperparameter optimization task. Below is the final code that runs train_model() through Ray Tune, which I hope will be helpful to others.
ray.init()

scheduler = FIFOScheduler()

# Scaling config controls the resources used by Ray
scaling_config = ScalingConfig(
    num_workers=1,
    use_gpu=False,  # change to True if you want to use GPU
)

# Checkpoint config controls the checkpointing behavior of Ray
checkpoint_config = CheckpointConfig(
    num_to_keep=1,  # number of checkpoints to keep
    checkpoint_score_attribute="val_loss",  # Save the checkpoint based on this metric
    checkpoint_score_order="min",  # Save the checkpoint with the lowest metric value
)

run_config = RunConfig(
    checkpoint_config=checkpoint_config,
    storage_path=hpopt_save_dir / "ray_results",  # directory to save the results
)

ray_trainer = TorchTrainer(
    lambda config: train_model(
        config, train_mcdset, val_mcdset, num_workers, scaler
    ),
    scaling_config=scaling_config,
    run_config=run_config,
)

search_alg = HyperOptSearch(
    n_initial_points=1,  # number of random evaluations before tree parzen estimators
    random_state_seed=42,
)

# OptunaSearch is another search algorithm that can be used
# search_alg = OptunaSearch()

tune_config = tune.TuneConfig(
    metric="val_loss",
    mode="min",
    num_samples=2,  # number of trials to run
    scheduler=scheduler,
    search_alg=search_alg,
    trial_dirname_creator=lambda trial: str(trial.trial_id),  # shorten filepaths
)

tuner = tune.Tuner(
    ray_trainer,
    param_space={
        "train_loop_config": search_space,
    },
    tune_config=tune_config,
)

# Start the hyperparameter search
results = tuner.fit()
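The lambda passed to TorchTrainer is there because the trainer calls its training function with at most one argument, the trial config, so the datasets, num_workers, and scaler have to be bound beforehand. A minimal sketch of that binding with a stand-in train_model and placeholder datasets (both hypothetical, no Ray required), showing that functools.partial gives the same result as the lambda:

```python
from functools import partial


def train_model(config, train_dset, val_dset, num_workers, scaler):
    # Stand-in for the real training loop: echo what it received.
    return {
        "depth": config["depth"],
        "n_train": len(train_dset),
        "n_val": len(val_dset),
        "num_workers": num_workers,
    }


# Placeholder datasets standing in for train_mcdset and val_mcdset.
train_mcdset = [0, 1, 2, 3]
val_mcdset = [4, 5]

# Two equivalent ways to build a single-argument training function.
via_lambda = lambda config: train_model(config, train_mcdset, val_mcdset, 0, None)
via_partial = partial(
    train_model, train_dset=train_mcdset, val_dset=val_mcdset, num_workers=0, scaler=None
)

out_lambda = via_lambda({"depth": 3})
out_partial = via_partial({"depth": 3})
print(out_lambda == out_partial)
```

Either form works; partial with keyword arguments makes it explicit which parameters are fixed and which one the tuner supplies.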
Thanks for sharing!
What are you trying to do?
During hyperparameter optimization for a multi-component task, an AttributeError occurs. All the code was adapted from https://github.com/chemprop/chemprop/blob/main/examples/hpopting.ipynb
I adjusted the input-data section to accommodate the multi-component task, based on the demo in https://github.com/chemprop/chemprop/blob/main/examples/training_regression_multicomponent.ipynb
But when I run the following code, I get an AttributeError.
Previous attempts
I've changed the input-data section of hpopting.ipynb.
My data
The modified code
Screenshots
When I ran results = tuner.fit(), the AttributeError occurred.