chemprop / chemprop

Message Passing Neural Networks for Molecule Property Prediction
https://chemprop.csail.mit.edu

[v2 QUESTION]: AttributeError in hyperparameter optimization in multi-component tasks #1052

Closed DengFeng-Zuo closed 1 month ago

DengFeng-Zuo commented 1 month ago

What are you trying to do? During hyperparameter optimization in a multi-component task, an AttributeError seems to occur. All of the code was adapted from https://github.com/chemprop/chemprop/blob/main/examples/hpopting.ipynb

I adjusted the input data handling to accommodate the multi-component task, based on the demo in https://github.com/chemprop/chemprop/blob/main/examples/training_regression_multicomponent.ipynb.
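For reference, my input data section now looks roughly like the condensed sketch below (the file name and column names are placeholders for my data, and the train/validation splitting from the notebook is omitted for brevity):

import pandas as pd
from chemprop import data, featurizers

df = pd.read_csv("my_data.csv")  # placeholder file name
smiles_columns = ["smiles_1", "smiles_2"]  # one SMILES column per component (placeholder names)
target_columns = ["target"]  # placeholder name

smiss = df.loc[:, smiles_columns].values
ys = df.loc[:, target_columns].values

# one list of datapoints per component; all components share the same targets
all_data = [
    [data.MoleculeDatapoint.from_smi(smis[i], y) for smis, y in zip(smiss, ys)]
    for i in range(len(smiles_columns))
]

featurizer = featurizers.SimpleMoleculeMolGraphFeaturizer()
train_dsets = [data.MoleculeDataset(component, featurizer) for component in all_data]
train_mcdset = data.MulticomponentDataset(train_dsets)
scaler = train_mcdset.normalize_targets()  # fit a scaler on the training targets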

But when I ran the following code, I got an AttributeError.

Previous attempts: I changed the input data section of hpopting.ipynb.

My data (attached)

The modified code (see attached screenshots)

Screenshots: I ran results = tuner.fit() and the AttributeError occurred (see attached screenshots).

DengFeng-Zuo commented 1 month ago

I continued trying to modify the code, adjusting some parameters in train_model() to fit the multi-component task, but I still encountered errors.

from lightning import pytorch as pl
from ray.train.lightning import (
    RayDDPStrategy,
    RayLightningEnvironment,
    RayTrainReportCallback,
    prepare_trainer,
)

from chemprop import data, nn
from chemprop.models import multi

def train_model(config, train_dset, val_dset, num_workers, scaler):

    # config is a dictionary containing hyperparameters used for the trial
    depth = int(config["depth"])
    ffn_hidden_dim = int(config["ffn_hidden_dim"])
    ffn_num_layers = int(config["ffn_num_layers"])
    message_hidden_dim = int(config["message_hidden_dim"])

    train_loader = data.build_dataloader(train_dset, num_workers=num_workers, shuffle=True)
    val_loader = data.build_dataloader(val_dset, num_workers=num_workers, shuffle=False)

    # modified: one message-passing block per component, each built from the
    # hyperparameters sampled for this trial
    # mp = nn.BondMessagePassing(d_h=message_hidden_dim, depth=depth)
    mcmp = nn.MulticomponentMessagePassing(
        blocks=[
            nn.BondMessagePassing(d_h=message_hidden_dim, depth=depth)
            for _ in range(len(smiles_columns))  # smiles_columns comes from the data setup
        ],
        n_components=len(smiles_columns),
    )

    agg = nn.MeanAggregation()
    output_transform = nn.UnscaleTransform.from_standard_scaler(scaler)

    # modified: the FFN input is the concatenated output of all component blocks
    # ffn = nn.RegressionFFN(output_transform=output_transform, input_dim=message_hidden_dim, hidden_dim=ffn_hidden_dim, n_layers=ffn_num_layers)
    ffn = nn.RegressionFFN(
        input_dim=mcmp.output_dim,
        hidden_dim=ffn_hidden_dim,
        n_layers=ffn_num_layers,
        output_transform=output_transform,
    )

    batch_norm = True
    metric_list = [nn.metrics.RMSEMetric(), nn.metrics.MAEMetric()]

    # modified
    # model = models.MPNN(mp, agg, ffn, batch_norm, metric_list)
    mcmpnn = multi.MulticomponentMPNN(mcmp, agg, ffn, batch_norm=batch_norm, metrics=metric_list)

    trainer = pl.Trainer(
        accelerator="auto",
        devices=1,
        max_epochs=20, # number of epochs to train for
        # below are needed for Ray and Lightning integration
        strategy=RayDDPStrategy(),
        callbacks=[RayTrainReportCallback()],
        plugins=[RayLightningEnvironment()],
    )

    trainer = prepare_trainer(trainer)
    trainer.fit(mcmpnn, train_loader, val_loader)

Error (see attached screenshots)

shihchengli commented 1 month ago

The error message suggests that the results from each trial don't include val_loss. This issue is likely related to how you set up the Ray trainer.
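For each trial, Ray Tune only sees the metrics that get reported back to it. With Lightning, RayTrainReportCallback forwards the logged metrics (chemprop's models log val_loss during validation) to Ray, but only when the training function actually runs under Ray Train. As a rough illustration, a plain function trainable would have to report the metric itself (run_validation here is a hypothetical stand-in for your training/validation loop, not a real API):

from ray import train

def objective(config):
    val_loss = run_validation(config)  # hypothetical helper, not a real API
    train.report({"val_loss": val_loss})  # Tune reads this key from each trial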

DengFeng-Zuo commented 1 month ago

> The error message suggests that the results from each trial don't include val_loss. This issue is likely related to how you set up the Ray trainer.

Thank you for your suggestions. After studying the documentation for the Ray Tune library, I modified the code and successfully ran the multi-component hyperparameter optimization task. Below is the final code for setting up and running the Ray tuner, which I hope will be helpful to others.
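The search_space passed to the tuner is the same kind of dictionary used in hpopting.ipynb; for completeness, it looks along these lines (the exact ranges here are assumptions, adjust them to your problem):

from ray import tune

search_space = {
    "depth": tune.qrandint(lower=2, upper=6, q=1),
    "ffn_hidden_dim": tune.qrandint(lower=300, upper=2400, q=100),
    "ffn_num_layers": tune.qrandint(lower=1, upper=3, q=1),
    "message_hidden_dim": tune.qrandint(lower=300, upper=2400, q=100),
}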

import ray
from ray import tune
from ray.train import CheckpointConfig, RunConfig, ScalingConfig
from ray.train.torch import TorchTrainer
from ray.tune.schedulers import FIFOScheduler
from ray.tune.search.hyperopt import HyperOptSearch

ray.init()

scheduler = FIFOScheduler()

# Scaling config controls the resources used by Ray
scaling_config = ScalingConfig(
    num_workers=1,
    use_gpu=False, # change to True if you want to use GPU
) 

# Checkpoint config controls the checkpointing behavior of Ray
checkpoint_config = CheckpointConfig(
    num_to_keep=1, # number of checkpoints to keep
    checkpoint_score_attribute="val_loss", # Save the checkpoint based on this metric
    checkpoint_score_order="min", # Save the checkpoint with the lowest metric value
)

run_config = RunConfig(
    checkpoint_config=checkpoint_config,
    storage_path=hpopt_save_dir / "ray_results", # directory to save the results
)

ray_trainer = TorchTrainer(
    lambda config: train_model(
        # multicomponent datasets and target scaler come from the data setup
        config, train_mcdset, val_mcdset, num_workers, scaler
    ),
    scaling_config=scaling_config,
    run_config=run_config,
)

search_alg = HyperOptSearch(
    n_initial_points=1, # number of random evaluations before tree parzen estimators
    random_state_seed=42,
)

# OptunaSearch is another search algorithm that can be used
# search_alg = OptunaSearch() 

tune_config = tune.TuneConfig(
    metric="val_loss",
    mode="min",
    num_samples=2, # number of trials to run
    scheduler=scheduler,
    search_alg=search_alg,
    trial_dirname_creator=lambda trial: str(trial.trial_id), # shorten filepaths
)

tuner = tune.Tuner(
    ray_trainer,
    param_space={
        "train_loop_config": search_space,
    },
    tune_config=tune_config,
)

# Start the hyperparameter search
results = tuner.fit()
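
Once the search completes, the best trial's hyperparameters can be read back from the returned ResultGrid:

# inspect the best trial (lowest val_loss)
best_result = results.get_best_result()
print(best_result.config["train_loop_config"])
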
JacksonBurns commented 1 month ago

Thanks for sharing!