Closed by owenyoung75 2 years ago
Does kge start examples/toy-complex-ax-search.yaml work for you?
You can diagnose the problem by running kge dump checkpoint in the folder of the search and by looking into its subfolders (which hold the individual training jobs). Perhaps the training jobs fail for some reason.
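The subfolder check suggested above can be sketched as follows. This is only an illustration: it assumes each training job lives in a direct subfolder of the search folder and writes files matching checkpoint_*.pt (a pattern inferred from the file names mentioned in this thread). list_job_checkpoints is a hypothetical helper, not part of libkge.

```python
from pathlib import Path
from typing import Dict, List


def list_job_checkpoints(search_dir: str) -> Dict[str, List[str]]:
    """Map each direct subfolder of search_dir to the checkpoint files it holds."""
    found: Dict[str, List[str]] = {}
    for sub in sorted(Path(search_dir).iterdir()):
        if sub.is_dir():
            # Assumed naming pattern: checkpoint_*.pt
            found[sub.name] = sorted(p.name for p in sub.glob("checkpoint_*.pt"))
    return found
```

Calling list_job_checkpoints on the search folder and looking for subfolders with an empty list is a quick way to spot training jobs that never wrote a checkpoint.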
Thanks for the quick response! Yes, kge start ... works for me. And when I check the checkpoint in each subfolder, e.g. modelname--1vsAll-bce/00001, it looks fine, with all parameters included as expected.
In the search folder itself, e.g. modelname--1vsAll-bce/, kge dump checkpoint doesn't work, though: the file checkpoint_00001.pt contains only a dict with the keys ['results', 'parameters', 'job_id'], and I get the following error:
Dump of checkpoint: ./checkpoint_00001.pt
Traceback (most recent call last):
  File "/path-to-libkge/kge", line 33, in <module>
    sys.exit(load_entry_point('libkge', 'console_scripts', 'kge')())
  File "/path-to-kge-folder/kge/kge/cli.py", line 168, in main
    dump(args)
  File "/path-to-kge-folder/kge/kge/util/dump.py", line 34, in dump
    _dump_checkpoint(args)
  File "/path-to-kge-folder/kge/kge/util/dump.py", line 93, in _dump_checkpoint
    print(f"parameter_names: {list(checkpoint['model'][0].keys())}")
KeyError: 'model'
Any hint about this?
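For anyone hitting the same KeyError, here is a minimal sketch for telling the two kinds of checkpoint apart by their keys before dumping them. It relies only on what is reported in this thread: training-job checkpoints carry a 'model' entry, while the search-level file holds 'results', 'parameters' and 'job_id'. describe_checkpoint and load_checkpoint are hypothetical helpers, not libkge functions. Recent PyTorch versions may also need weights_only=False passed to torch.load for files that contain arbitrary Python objects.

```python
def describe_checkpoint(checkpoint: dict) -> str:
    """Classify a loaded checkpoint dict by the keys it contains."""
    keys = sorted(checkpoint.keys())
    if "model" in checkpoint:
        return f"training checkpoint (has 'model'); keys: {keys}"
    if "results" in checkpoint:
        return f"search checkpoint (no 'model'); keys: {keys}"
    return f"unrecognized checkpoint; keys: {keys}"


def load_checkpoint(path: str) -> dict:
    """Load a .pt file onto the CPU."""
    # torch is imported lazily so describe_checkpoint works without it installed.
    import torch

    return torch.load(path, map_location="cpu")
```

For example, describe_checkpoint(load_checkpoint("./checkpoint_00001.pt")) run in the search folder should report a search checkpoint, which explains why code that indexes checkpoint['model'] fails there.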
Please update libkge; I pushed a fix for this problem this morning.
Solved. Thanks @rgemulla !!
I'm trying to use the provided hyperparameter search, but the results values in each checkpoint_00001.pt file are always a list of None values, which makes it impossible to identify the best model because all MRR values are missing. Is there something I'm doing wrong here?
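To confirm the symptom, a small sketch that counts the missing entries may help. It assumes that 'results' is a plain Python list with one entry per trial, as described above. summarize_results is a hypothetical helper, and the structure of the non-None entries is not specified here, so the code does not inspect them.

```python
def summarize_results(results: list) -> dict:
    """Count how many entries of a search checkpoint's results list are None."""
    missing = [i for i, r in enumerate(results) if r is None]
    return {
        "total": len(results),
        "missing": len(missing),
        "missing_indices": missing,
    }
```

If "missing" equals "total", no trial reported a result back to the search job, which matches the situation described in this question.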