optuna / optuna-dashboard

Real-time Web Dashboard for Optuna.
https://optuna-dashboard.readthedocs.io/en/latest/
Other
521 stars, 86 forks

Dashboard is not updating and showing NaN's #955

Open CraigRichards opened 2 months ago

CraigRichards commented 2 months ago

Description

This is my second attempt at running Optuna, and this time I am getting strange behaviour. The first time, in another project, it worked a treat.

The dashboard is not updating: I need to close it and reopen it to see new studies. When I open a study, I see NaN values for hyperparameter importance.
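One thing that may be worth ruling out when importances show as NaN is whether the completed trials actually produced different objective values. Importance measures based on variance have nothing to attribute when every trial returns the same number. Whether that is the cause here is not confirmed in this thread. The helper below is a hypothetical, stdlib-only sketch (`describe_objective_values` is not part of Optuna) that summarises a list of objective values.

```python
import math
from statistics import pvariance


def describe_objective_values(values):
    """Summarise objective values and flag the degenerate (constant) case.

    `values` may contain None or NaN for trials that did not finish cleanly.
    This is an illustrative helper, not an Optuna API.
    """
    finite = [v for v in values if v is not None and math.isfinite(v)]
    distinct = set(finite)
    return {
        "n_total": len(values),
        "n_finite": len(finite),
        "n_distinct": len(distinct),
        # Population variance is only defined for at least one value.
        "variance": pvariance(finite) if finite else None,
        # With fewer than two distinct values there is no variation to explain.
        "degenerate": len(distinct) < 2,
    }


summary = describe_objective_values([0.0, 0.0, 0.0])
print(summary)
```

With a real study, the input could be built as `[t.value for t in study.trials]`, since each trial in `study.trials` exposes its objective through `value`.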

How to Reproduce

  1. Optuna's objective function is '...'.
  2. Run optuna-dashboard with '...'
  3. Open '...' page, then click '...'.
  4. An error occurs.

My code to run is:

def objective(trial):
    # Define hyperparameters to search
    img_size = trial.suggest_categorical('img_size', [224, 256, 384])
    models = ['convnext_tiny', 'resnet50', 'efficientnet_b1', 'efficientnet_b2', 'efficientnet_b3']
    selected_model = trial.suggest_categorical('model', models)
    resize_method = trial.suggest_categorical('resize_method', ['Resize']) #, 'RandomResizedCrop', 'ResizeSquish', 'ResizeCrop'])
    batch_transforms = trial.suggest_categorical('batch_transforms', [True, False])
    lr = trial.suggest_float('lr', 1e-5, 1e-1, log=True)
    dropout = trial.suggest_float('dropout', 0.1, 0.5)
    epochs = trial.suggest_int('epochs', 5, 10)
    weight_decay = trial.suggest_float('weight_decay', 1e-6, 1e-2, log=True)
    batch_size = trial.suggest_categorical('batch_size', [4, 8, 16, 32, 64])
    optimizer_name = trial.suggest_categorical('optimizer', ['Adam'])

    # Print the parameters for this trial
    print(f"Trial parameters: {trial.params}")

    # Define transformations based on suggested hyperparameters
    if resize_method == 'Resize':
        item_tfms = [Resize(img_size)]
    elif resize_method == 'RandomResizedCrop':
        item_tfms = [RandomResizedCrop(img_size, min_scale=0.9)]
    elif resize_method == 'ResizeSquish':
        item_tfms = [Resize(img_size, method='squish')]
    else:
        item_tfms = [Resize(img_size, method='crop')]

    batch_tfms = [
        *aug_transforms(  # Apply augmentations cautiously
            size=img_size,
            flip_vert=False,  # Avoid flipping vertically, important for medical data
            max_rotate=10,    # Limit rotations to ±10 degrees
            max_zoom=1.1,     # Slight zooming
            max_lighting=0.2, # Light contrast/brightness adjustment
            p_affine=0.5,     # Probability of affine transform
            p_lighting=0.3    # Probability of lighting transform
        ),
        Normalize.from_stats(*imagenet_stats)  # Normalize based on ImageNet mean/std
    ] if batch_transforms else [Normalize.from_stats(*imagenet_stats)]

    # Create DataLoaders
    dls = create_datablock(
        'Sagittal T1',
        train_data,
        label_map,
        item_tfms=item_tfms,
        batch_tfms=batch_tfms,
        bs=batch_size)

    # Check if the DataLoader is loaded correctly
    dls.show_batch(max_n=4, figsize=(8, 8))

    the_model = timm.create_model(selected_model, pretrained=True, num_classes=dls.c)

    # Define the model with dropout
    def create_model(the_model=the_model, dropout=0.5):
        # Get the number of features from the classifier layer
        num_ftrs = the_model.get_classifier().in_features

        # Replace the classifier with dropout and a linear layer
        # The classifier layer is replaced regardless of its internal name
        the_model.reset_classifier(num_classes=dls.c)  # Resets the classifier dynamically
        the_model.classifier = nn.Sequential(
            nn.Dropout(p=dropout),  # Add dropout with the rate defined in your hyperparameter search
            nn.Linear(num_ftrs, dls.c)
        )

        return the_model

    # Create the learner
    patience = 3
    learn = Learner(
        dls,
        create_model(the_model, dropout),
        metrics=accuracy,
        wd=weight_decay,
        # cbs=[EarlyStoppingCallback(monitor='accuracy', patience=patience)]
        ).to_bf16()

    # learn.lr_find(suggest_funcs=(slide, valley))

    # Choose optimizer
    if optimizer_name == 'Adam':
        learn.opt_func = Adam
    elif optimizer_name == 'SGD':
        learn.opt_func = SGD
    elif optimizer_name == 'RMSprop':
        learn.opt_func = RMSProp

    # Train the model
    learn.fit(epochs, lr)

    # Initialize accuracy
    accuracy_result = 0.0

    # Evaluate the model
    try:
        # Evaluate the model
        validation_results = learn.validate()
        if validation_results and len(validation_results) > 1:
            accuracy_result = validation_results[1]
    except Exception as e:
        print(f"Validation failed: {e}")
        accuracy_result = 0.0  # Default value if validation fails

    return accuracy_result

# Create a study and optimize the objective function
study = optuna.create_study(
    storage="sqlite:///db.sqlite3",
    direction='maximize',
    study_name='Sagittal T1 - model search - 2')
study.optimize(objective, n_trials=50)

# Print the best hyperparameters
print(study.best_params)
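The objective above falls back to 0.0 both when validation raises and when the result list is too short, so a failed trial is stored with the same value as a genuinely poor one. A minimal sketch of a stricter alternative follows, assuming the validation result is a sequence with the loss at index 0 and the metric at index 1, as the code above reads it. `extract_metric` is a hypothetical helper, not a fastai or Optuna function.

```python
from typing import Optional, Sequence


def extract_metric(results: Optional[Sequence[float]], index: int = 1) -> float:
    """Return results[index] as a float, or raise if it is unavailable.

    Hypothetical helper: it raises instead of substituting a default so that
    failures stay distinguishable from real scores.
    """
    if results is None or len(results) <= index:
        raise ValueError(f"no metric at index {index} in {results!r}")
    value = float(results[index])
    if value != value:  # NaN is the only float that is not equal to itself
        raise ValueError(f"metric at index {index} is NaN")
    return value


print(extract_metric([0.42, 0.91]))
```

Inside the objective this would replace the `try`/`except` block, for example `return extract_metric(learn.validate())`. Note that by default an exception raised in the objective propagates out of `study.optimize`; its `catch` argument can be used to keep the study running.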

image
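Two of the categorical parameters in the code above, `resize_method` and `optimizer`, are given a single choice, so they never vary across trials and cannot explain any difference in accuracy. Whether this affects how the dashboard computes importance is an assumption, not something confirmed in this report. The sketch below lists such parameters from a plain dictionary that mirrors the categorical choices in the objective; `single_choice_params` is a hypothetical helper.

```python
def single_choice_params(search_space):
    """Return the names of parameters that offer fewer than two distinct choices."""
    return sorted(
        name for name, choices in search_space.items() if len(set(choices)) < 2
    )


# Categorical choices copied from the objective in this report.
search_space = {
    "img_size": [224, 256, 384],
    "resize_method": ["Resize"],
    "batch_transforms": [True, False],
    "batch_size": [4, 8, 16, 32, 64],
    "optimizer": ["Adam"],
}

constant = single_choice_params(search_space)
print(constant)
```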

Python version

3.11.5

Optuna version

3.6.1

optuna-dashboard version or git revision

0.16.1

Web browser

Inside of vscode

github-actions[bot] commented 12 hours ago

This issue has not seen any recent activity.