Closed. AstroYuta closed this issue 3 years ago.
I haven't tried Optuna or other such frameworks before. In my opinion, modifying the config files is not the right approach: the config files are used to build models and other important things. Instead, you could try overriding values via cfg.xxx
in tools/train.py.
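A minimal, self-contained sketch of the cfg.xxx pattern suggested here: load the config once in tools/train.py and overwrite attributes per trial rather than editing the config files. `SimpleNamespace` stands in for mmcv's `Config` object and `FakeTrial` for an Optuna `Trial` (both stand-ins are hypothetical, used only so the sketch runs on its own):

```python
from types import SimpleNamespace

def apply_trial_to_cfg(cfg, trial):
    # Attribute-style overrides on the loaded config,
    # mirroring cfg.optimizer.lr = trial.suggest_float(...)
    cfg.optimizer.lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    cfg.optimizer.weight_decay = trial.suggest_float("weight_decay", 1e-6, 1e-2, log=True)
    return cfg

class FakeTrial:
    """Stand-in for optuna.trial.Trial that replays fixed values."""
    def __init__(self, values):
        self.values = values
    def suggest_float(self, name, low, high, log=False):
        return self.values[name]

cfg = SimpleNamespace(optimizer=SimpleNamespace(lr=0.02, weight_decay=1e-4))
cfg = apply_trial_to_cfg(cfg, FakeTrial({"lr": 1e-3, "weight_decay": 1e-5}))
print(cfg.optimizer.lr)  # 0.001
```

With the real libraries, the same function body works unchanged, since both mmcv's `Config` and Optuna's `Trial` expose these attribute/`suggest_float` interfaces.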
@BIGWangYuDong Great, thanks for the reply!
It took me several days to understand the scripts, and I finally found a solution: edit train_detector in mmdet/apis/train.py
to return the runner, so that I can get the dict of "loss" and "log_vars" via runner.outputs.
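The change described above can be sketched as follows. Everything here is a simplified stand-in rather than the real MMDetection code: upstream, train_detector builds a runner and calls runner.run(...) without returning anything, so the modification is simply adding a `return runner` at the end to expose runner.outputs to the caller.

```python
class StubRunner:
    """Minimal stand-in for the runner built inside train_detector."""
    def run(self, datasets):
        # The real training loop stores the last iteration's results here.
        self.outputs = {"loss": 0.5, "log_vars": {"s0.acc": 90.0}}

def train_detector(model, datasets, cfg):
    runner = StubRunner()
    runner.run(datasets)
    return runner  # <-- the added line; the upstream function returns None

runner = train_detector(model=None, datasets=[], cfg=None)
print(runner.outputs["log_vars"]["s0.acc"])  # 90.0
```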
In this way, my modifications in tools/train.py are the following (cascade rcnn):
```python
def objective(trial):  # edited from main()
    ...
    cfg.optimizer.lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    ...
    # add an attribute for visualization convenience
    model.CLASSES = datasets[0].CLASSES
    runner = train_detector(
        model,
        datasets,
        cfg,
        distributed=distributed,
        validate=(not args.no_validate),
        timestamp=timestamp,
        meta=meta)
    outputs = runner.outputs  # <-- HERE
    accuracy = (outputs["log_vars"]["s0.acc"]
                + outputs["log_vars"]["s1.acc"]
                + outputs["log_vars"]["s2.acc"]) / 3
    return accuracy


if __name__ == '__main__':
    # main()
    study_name = "study_name"
    study = optuna.create_study(
        direction="maximize",
        study_name=study_name,
        storage=f"sqlite:///{study_name}.db",
        load_if_exists=True,
        pruner=optuna.pruners.PercentilePruner(60))
    study.optimize(objective, n_trials=100)
```
I'm still testing, but it seems to work well for hyperparameter optimization so far.
Anyway, I would like to ask you one additional question.
When you get the dict of "loss" and "log_vars" from runner.outputs at the line marked HERE, are these losses computed on the training data? Specifically, is it possible to obtain losses on the validation data? I guess runner.outputs at HERE returns losses for the training data, but I'm not sure.
Well, I have managed to solve the problem. Thanks a lot, @BIGWangYuDong !!
This doesn't work with multiple GPUs (distributed training) for me. How did you do it?
@AstroYuta have you worked with Optuna and the newer version of MMPretrain? I guess Optuna can be integrated by a hook at the "after_test_epoch" location...
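The hook idea above could be sketched as follows for the MMEngine-based stack. This is a hedged, self-contained sketch, not the real integration: the base class and registration are stand-ins, and with mmengine installed you would instead subclass `mmengine.hooks.Hook` and register the hook via `custom_hooks` in the config.

```python
class OptunaReportHook:  # with mmengine: class OptunaReportHook(Hook)
    """Reports an evaluation metric to an Optuna trial after each test epoch."""
    def __init__(self, trial, metric_key="accuracy/top1"):
        self.trial = trial          # optuna Trial (None here for the sketch)
        self.metric_key = metric_key
        self.last = None

    def after_test_epoch(self, runner, metrics=None):
        # `metrics` is the dict produced by the evaluator for this epoch.
        if metrics and self.metric_key in metrics:
            self.last = metrics[self.metric_key]
            # With optuna available, one would additionally do:
            #   self.trial.report(self.last, step=...)
            #   if self.trial.should_prune():
            #       raise optuna.TrialPruned()

hook = OptunaReportHook(trial=None)
hook.after_test_epoch(runner=None, metrics={"accuracy/top1": 92.3})
print(hook.last)  # 92.3
```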
@AstroYuta sorry for reviving an old discussion, but I was wondering if you had a minimal working example for getting Optuna to work with MMDetection? I see above that you managed to solve your problem, though I'm struggling to follow fully how you achieved this based on the code snippets. Any help is greatly appreciated.
Hi folks! Thanks for developing this inspiring project!
I am currently working on instance segmentation using Mask R-CNN or Cascade R-CNN, and I am very new to this field. Note that my project tries to identify thousands (>2000) of rocks in a single image.
I would like to optimize hyperparameters using frameworks such as Optuna (https://github.com/optuna/optuna), but I cannot figure out how to implement them. In particular, how do you get losses (e.g. val/loss) or accuracy in each epoch?
My provisional approach, following some documents (e.g. https://medium.com/pytorch/using-optuna-to-optimize-pytorch-hyperparameters-990607385e36), is:
1. Edit hyperparameters in the config files using Optuna trial objects, in /mmdetection/configs/base/models/mask_rcnn_r50_fpn.py
before
after
2. Return loss values or accuracy in each epoch (how??), in tools/train.py
3. Run the trials and obtain the best hyperparameters, in tools/train.py
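The three steps above can be sketched end to end with stand-ins (no mmdet or optuna imports; all names here are hypothetical), just to show the control flow: sample hyperparameters (step 1), train and return a metric (step 2), and let the study pick the best trial (step 3).

```python
import math

def fake_train(lr):
    # Stand-in for the real training loop; the score peaks near lr = 1e-3.
    return 1.0 - abs(math.log10(lr) + 3.0) / 10.0

def objective(trial):
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)  # step 1: sample
    accuracy = fake_train(lr)                             # step 2: train + evaluate
    return accuracy                                       # step 3: study maximizes this

class GridTrial:
    """Stand-in for an optuna Trial that replays one fixed value."""
    def __init__(self, lr):
        self.lr = lr
    def suggest_float(self, name, low, high, log=False):
        return self.lr

# Stand-in for study.optimize: try a few learning rates, keep the best.
best = max((objective(GridTrial(lr)), lr) for lr in [1e-4, 1e-3, 1e-2])
print(best[1])  # 0.001
```

With Optuna installed, `objective` would be passed directly to `study.optimize`, which replaces the manual grid here with its own samplers.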
Finally, my questions are:
Thanks!