Epistimio / orion

Asynchronous Distributed Hyperparameter Optimization.
https://orion.readthedocs.io

Cannot delete corrupted experiments #894

Open bouthilx opened 2 years ago

bouthilx commented 2 years ago

If an experiment has an invalid configuration, the command orion db rm fails while loading it, which makes the experiment impossible to delete.

$ orion db rm --config orion-config.yaml HuntTest
Traceback (most recent call last):
  File "/home/ml/users/hhuang63/rl/ENV/bin/orion", line 8, in <module>
    sys.exit(main())
  File "/home/ml/users/hhuang63/rl/ENV/lib/python3.7/site-packages/orion/core/cli/__init__.py", line 37, in main
    return orion_parser.execute(argv)
  File "/home/ml/users/hhuang63/rl/ENV/lib/python3.7/site-packages/orion/core/cli/base.py", line 93, in execute
    returncode = function(args)
  File "/home/ml/users/hhuang63/rl/ENV/lib/python3.7/site-packages/orion/core/cli/db/rm.py", line 193, in main
    name=args["name"], version=args.get("version", None)
  File "/home/ml/users/hhuang63/rl/ENV/lib/python3.7/site-packages/orion/core/io/experiment_builder.py", line 349, in load
    return create_experiment(mode=mode, **db_config)
  File "/home/ml/users/hhuang63/rl/ENV/lib/python3.7/site-packages/orion/core/io/experiment_builder.py", line 391, in create_experiment
    experiment.space = _instantiate_space(space)
  File "/home/ml/users/hhuang63/rl/ENV/lib/python3.7/site-packages/orion/core/io/experiment_builder.py", line 486, in _instantiate_space
    return SpaceBuilder().build(config)
  File "/home/ml/users/hhuang63/rl/ENV/lib/python3.7/site-packages/orion/core/io/space_builder.py", line 310, in build
    dimension = self.dimbuilder.build(namespace, expression)
  File "/home/ml/users/hhuang63/rl/ENV/lib/python3.7/site-packages/orion/core/io/space_builder.py", line 243, in build
    dimension = self._build(name, expression)
  File "/home/ml/users/hhuang63/rl/ENV/lib/python3.7/site-packages/orion/core/io/space_builder.py", line 229, in _build
    name, prior
TypeError: Parameter '/learning_rate': 'fidelity' does not correspond to a supported distribution.

We should support deleting such experiments, possibly by skipping the experiment build and relying only on storage commands.
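A minimal sketch of that idea, using a toy in-memory store rather than Orion's actual classes. ToyStorage, build_space and SUPPORTED_PRIORS are hypothetical names made up for illustration, and the prior list is not Orion's real one. The point is only that deletion by query never parses the search space, so an invalid prior cannot block it:

```python
# Illustrative subset only; this is NOT Orion's real list of priors.
SUPPORTED_PRIORS = {"uniform", "loguniform", "normal"}


def build_space(space):
    """Toy stand-in for space building: rejects unknown priors."""
    for name, expression in space.items():
        prior = expression.split("(", 1)[0]
        if prior not in SUPPORTED_PRIORS:
            raise TypeError(
                f"Parameter '{name}': '{prior}' does not correspond "
                "to a supported distribution."
            )
    return dict(space)


class ToyStorage:
    """Hypothetical in-memory storage holding raw experiment documents."""

    def __init__(self, experiments):
        self.experiments = list(experiments)

    def delete_experiment(self, name, version=None):
        """Delete matching documents by query only; the space is never built."""

        def matches(doc):
            return doc["name"] == name and (
                version is None or doc["version"] == version
            )

        kept = [doc for doc in self.experiments if not matches(doc)]
        removed = len(self.experiments) - len(kept)
        self.experiments = kept
        return removed


storage = ToyStorage([
    {"name": "HuntTest", "version": 1,
     "space": {"/learning_rate": "fidelity(1, 10)"}},
    {"name": "other", "version": 1,
     "space": {"/learning_rate": "loguniform(1e-5, 1.0)"}},
])
```

Calling build_space on the first document raises the same kind of TypeError as in the traceback, while storage.delete_experiment("HuntTest") still succeeds because it only compares the name and version fields.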

bouthilx commented 1 year ago

Hi, I am sorry for the long delay. Are you still facing this issue? It should be possible to delete the experiment using the database object directly:

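from orion.storage.legacy import setup_database

# Connect to the database directly, without building the experiment.
db = setup_database(dict(type='pickleddb', host='db.pkl'))

experiment_name = 'some name...'
experiment_version = 1

# Remove the experiment document that matches this name and version.
db.remove('experiments', dict(name=experiment_name, version=experiment_version))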
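
The call above only removes the experiment document. If the trials should go as well, something along these lines might work. This is a sketch, not a tested recipe: it assumes the db object also exposes read(collection, query), that trials live in a 'trials' collection, and that each trial references its experiment through an 'experiment' field holding the experiment's '_id'. remove_experiment_and_trials is a hypothetical helper name. Check these assumptions against your Orion version before running it on real data.

```python
def remove_experiment_and_trials(db, name, version):
    """Delete an experiment document and its trials through a database object.

    Assumption: `db` exposes read(collection, query) and
    remove(collection, query), like the object returned by setup_database.
    Returns (number of experiments removed, number of trials removed).
    """
    query = dict(name=name, version=version)
    experiments = db.read('experiments', query)

    removed_trials = 0
    for experiment in experiments:
        # Assumption: trials point to their experiment via the 'experiment'
        # field, which stores the experiment's '_id'.
        trial_query = dict(experiment=experiment['_id'])
        removed_trials += len(db.read('trials', trial_query))
        db.remove('trials', trial_query)

    # Remove the experiment documents last, once their trials are gone.
    db.remove('experiments', query)
    return len(experiments), removed_trials
```

Usage would be remove_experiment_and_trials(db, experiment_name, experiment_version), reusing the db object created above. Backing up the database file first is prudent, since the removal is not reversible.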