gptune / GPTune

Database TL with categorical parameters #10

Closed jaehoonkoo closed 2 years ago

jaehoonkoo commented 2 years ago

Hi @younghyunc

I am implementing TL with the database.

When I include categorical parameters, I get the error "Exception: History database initialization failed". Are categorical parameters supported with the database TL? Any idea why the error occurs?

Thank you for your help.

...
{'name': 'p6', 'transformer': 'onehot', 'type': 'categorical', 'categories': ['cores', 'threads', 'sockets']}, {'name': 'p7', 'transformer': 'onehot', 'type': 'categorical', 'categories': ['compact', 'scatter', 'balanced', 'none', 'disabled', 'explicit']}], 'output_space': [{'name': 'y', 'type': 'real', 'transformer': 'identity'}], 'loadable_machine_configurations': {'mymachine': {'myprocessor': {'nodes': 1, 'cores': 28}}}}
...
task_parameters_given:  [[100000]]
Traceback (most recent call last):
  File "demo_dtla.py", line 491, in <module>
    main()
  File "demo_dtla.py", line 432, in main
    model_functions[tvalue_] = LoadSurrogateModelFunction(meta_path=None, meta_dict=meta_dict)               
  File "/gpfs/jlse-fs0/users/jkoo/code/gptune/GPTune/gptune.py", line 1149, in LoadSurrogateModelFunction
    model_data = LoadSurrogateModelData(meta_path, meta_dict, tuning_configuration)
  File "/gpfs/jlse-fs0/users/jkoo/code/gptune/GPTune/gptune.py", line 1049, in LoadSurrogateModelData
    historydb = HistoryDB()
  File "/gpfs/jlse-fs0/users/jkoo/code/gptune/GPTune/database.py", line 296, in __init__
    raise Exception("History database initialization failed")
Exception: History database initialization failed
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[50412,1],0]
  Exit code:    1
--------------------------------------------------------------------------
younghyunc commented 2 years ago

Hi @jke513

Our DB and TLA support categorical variables. Looking at the error message you sent, the error happens when reproducing the surrogate model from the DB (LoadSurrogateModelFunction). Could you share the meta description parameter ("meta_dict" or "meta_path", as JSON or a Python dict) that you pass to LoadSurrogateModelFunction, so that I can look into the issue? If you do not wish to upload it publicly, you can send it to me by email. Thank you!

jaehoonkoo commented 2 years ago

Thank you for the quick response. Let me send you an email with the details.

jaehoonkoo commented 2 years ago

@younghyunc,

I figured out the cause of the error. At least, the "History database initialization failed" exception no longer occurs. The problem was a mistake on my side in loading the JSON file.

Thank you.
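
For anyone who hits the same exception: a minimal sketch of checking the JSON meta description before handing it to LoadSurrogateModelFunction. The helper name load_meta_dict and the file name meta.json are hypothetical, and the key list only contains keys that appear in the printed dict earlier in this thread; the full set of keys GPTune requires is not shown here, so treat that list as an assumption. One common way for loading to go wrong is saving a printed Python dict (single quotes) as a .json file, which the json module rejects.

```python
import json
from pathlib import Path

# Keys visible in the meta description printed in this thread. This is an
# assumption for illustration, not GPTune's complete list of required keys.
EXPECTED_KEYS = ("output_space", "loadable_machine_configurations")


def load_meta_dict(path, expected_keys=EXPECTED_KEYS):
    """Hypothetical helper: read a JSON meta description and return a dict.

    Raises ValueError with a readable message if the file is not valid JSON,
    is not a JSON object, or lacks one of the expected keys.
    """
    text = Path(path).read_text(encoding="utf-8")
    try:
        meta = json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(f"{path} is not valid JSON: {err}") from err

    if not isinstance(meta, dict):
        raise ValueError(
            f"{path} must contain a JSON object, got {type(meta).__name__}"
        )

    missing = [key for key in expected_keys if key not in meta]
    if missing:
        raise ValueError(f"{path} is missing expected keys: {missing}")

    return meta
```

The result can then be passed the same way as in the traceback above, for example meta_dict = load_meta_dict("meta.json") followed by LoadSurrogateModelFunction(meta_path=None, meta_dict=meta_dict). Failing early with a message that names the file makes a loading mistake easier to spot than the generic database initialization error.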