hyper parameter selection error

coranholmes commented 6 years ago

I am trying to tune the hyperparameters for the whas dataset and get the following error:

Attaching to hyperparamsearch_hp_search_1
hp_search_1  | /usr/local/lib/python2.7/dist-packages/h5py/__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
hp_search_1  |   from ._conv import register_converters as _register_converters
hp_search_1  | 2018-01-19 08:11:29,163 - __main__ - DEBUG - Parameters: Namespace(box='/box_constraints.0.json', dataset='whas', logdir='/shared/logs', num_epochs=500, num_evals=100, num_folds=3, update_fn='adam')
hp_search_1  | 2018-01-19 08:11:29,163 - __main__ - DEBUG - Loading dataset: whas
hp_search_1  | Traceback (most recent call last):
hp_search_1  |   File "/hyperparam_search.py", line 196, in <module>
hp_search_1  |     x, y, strata = load_dataset(args.dataset)
hp_search_1  |   File "/hyperparam_search.py", line 105, in load_dataset
hp_search_1  |     ds = utils.load_datasets(dataset)['train']
hp_search_1  |   File "/deepsurv/utils.py", line 17, in load_datasets
hp_search_1  |     with h5py.File(dataset_file, 'r') as fp:
hp_search_1  |   File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/files.py", line 271, in __init__
hp_search_1  |     fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
hp_search_1  |   File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/files.py", line 101, in make_fid
hp_search_1  |     fid = h5f.open(name, flags, fapl=fapl)
hp_search_1  |   File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-nCYoKW-build/h5py/_objects.c:2840)
hp_search_1  |   File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-nCYoKW-build/h5py/_objects.c:2798)
hp_search_1  |   File "h5py/h5f.pyx", line 78, in h5py.h5f.open (/tmp/pip-nCYoKW-build/h5py/h5f.c:2117)
hp_search_1  | IOError: Unable to open file (Unable to open file: name = 'whas', errno = 2, error message = 'no such file or directory', flags = 0, o_flags = 0)
hyperparamsearch_hp_search_1 exited with code 1

My Dockerfile is as follows:

FROM floydhub/dl-docker:cpu

ADD . /tmp/pip
RUN pip install /tmp/pip/master.zip

RUN \
  echo "h5py==2.7.0\n\
        lifelines==0.9.4\n\
        logger==1.4\n\
        Optunity==1.1.1\n\
        tensorboard-logger==0.0.3\n\
        matplotlib==2.0.0" > /requirements.txt && \
  pip install -U pip --proxy=xxx && \
  pip install -U numpy --proxy=xxx && \
  pip install -r /requirements.txt --proxy=xxx

COPY . /

CMD [ "python", "-u", "/hyperparam_search.py", \
"/shared/logs", \
"whas", \
"/box_constraints.0.json", \
"100", \
"--update_fn", "adam", \
"--num_epochs", "500", \
"--num_fold", "3" ]

Can anyone tell me why I get the error?

jaredleekatzman commented 6 years ago

The data-loading API for the parameter search isn't that fleshed out. The dataset argument you provided, 'whas', isn't understood by the script as a dataset name; it is treated as a file path. You need to pass the location of the dataset file you want to load instead, as seen from within the Docker container.
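To illustrate the point above: the traceback shows the argument being handed straight to `h5py.File`, so a bare name like `whas` is looked up as a relative path and fails with errno 2. A minimal sketch of a pre-flight check, assuming you want a clearer error before the search starts (`resolve_dataset_path` is a hypothetical helper, not part of DeepSurv):

```python
import os


def resolve_dataset_path(dataset_arg):
    """Return the absolute path for the dataset argument, or raise a clear error.

    Hypothetical helper for illustration only; DeepSurv does not ship it.
    """
    path = os.path.abspath(dataset_arg)
    if not os.path.isfile(path):
        # Same failure the traceback reports, but with the resolved path spelled out.
        raise IOError(
            "Dataset file not found: %r (resolved to %r). "
            "Pass the path of the HDF5 file as seen inside the container."
            % (dataset_arg, path)
        )
    return path
```

In the Dockerfile's `CMD`, the `"whas"` entry would then be replaced by whatever path the HDF5 file has inside the container; that path depends on how the data is copied or mounted, so it is not spelled out here.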

coranholmes commented 6 years ago

@jaredleekatzman Hi, thank you for the reply. I can now run the hyperparameter selection code without errors. I have another small question: in the example box_constraints.0.json, why is the learning rate range set to [-7,-3]? Shouldn't it be something like [0.0001, 1]?

jaredleekatzman commented 6 years ago

The learning rate box constraint is actually on a log scale, so the search covers learning rates from 10^-7 to 10^-3 (that is, 1e-7 to 1e-3).
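A small sketch of what the log-scale constraint means in practice, assuming the exponent is base 10 (which matches the 10^-7 to 10^-3 range given above). The function name is made up for illustration:

```python
def learning_rate_from_exponent(exponent):
    """Map a base-10 exponent from the search box to an actual learning rate."""
    return 10.0 ** exponent


# Bounds of the learning-rate box as quoted in this thread.
low, high = -7, -3
print(learning_rate_from_exponent(low))   # smallest learning rate searched
print(learning_rate_from_exponent(high))  # largest learning rate searched
```

Under the same assumption, searching learning rates in [0.0001, 1] would correspond to a box of [-4, 0].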


coranholmes commented 6 years ago

@jaredleekatzman Thank you very much for your explanation. May I ask a further question? How should I interpret the results of the hyperparameter selection? I am running it on my own dataset and get the following result:

hp_search_1  | 2018-01-22 04:50:57,950 - __main__ - DEBUG - Optimal Parameters: {u'learning_rate': -1.1917578125000001, u'num_nodes': 29.041210937499997, u'num_layers': 2.2513671875, u'dropout': 0.133017578125, u'lr_decay': 0.00025443359375, u'momentum': 0.8799306640625, u'L2_reg': 1.1320117187499998} 
hp_search_1  | 2018-01-22 04:50:57,950 - __main__ - DEBUG - Saving Call log... 
hp_search_1  | OrderedDict([('optimum', **0.7068966357018306**), ...

So I set the hyperparameters for my dataset as follows:

{"L2_reg": 1.1320117187499998, "dropout": 0.133017578125, "learning_rate": 0.0643046216782325, "lr_decay": 0.00025443359375, "momentum": 0.8799306640625, "batch_norm": false, "activation": "selu", "standardize": true, "n_in": 11, "hidden_layers_sizes": [29,29]}
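The mapping from the logged optimum to the configuration above can be sketched as follows. This is only a reconstruction of what was done by hand in this comment, under two assumptions: the learning rate is a base-10 exponent, and `num_layers` and `num_nodes` are truncated to integers. `search_result_to_config` is a hypothetical name, and the non-searched settings (`batch_norm`, `activation`, `standardize`, `n_in`) are left out because they were chosen separately.

```python
def search_result_to_config(optimum):
    """Convert the raw 'Optimal Parameters' dict into network settings.

    Assumptions (not confirmed by the DeepSurv source): the learning rate
    is a base-10 exponent, and layer/node counts are truncated with int().
    """
    num_layers = int(optimum["num_layers"])
    num_nodes = int(optimum["num_nodes"])
    return {
        "L2_reg": optimum["L2_reg"],
        "dropout": optimum["dropout"],
        "learning_rate": 10.0 ** optimum["learning_rate"],
        "lr_decay": optimum["lr_decay"],
        "momentum": optimum["momentum"],
        "hidden_layers_sizes": [num_nodes] * num_layers,
    }


# Values copied from the log line in this comment.
optimum = {
    "learning_rate": -1.1917578125000001,
    "num_nodes": 29.041210937499997,
    "num_layers": 2.2513671875,
    "dropout": 0.133017578125,
    "lr_decay": 0.00025443359375,
    "momentum": 0.8799306640625,
    "L2_reg": 1.1320117187499998,
}
config = search_result_to_config(optimum)
print(config["hidden_layers_sizes"])
print(config["learning_rate"])
```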

Then I ran DeepSurv with these settings and got the following results:

deepsurv_1  | Test metrics: {'c_index_bootstrap': {'confidence_interval': (0.6047381184204099, 0.6135943363104468), 'mean': 0.6091662273654284}, 'c_index': **0.6081473364476998**}

The c-index on the test set is around 0.6, while the optimum reported by the hyperparameter selection is around 0.7. Why is there such a gap?

wwwfz commented 6 years ago

@coranholmes Are you Chinese? I would like to get in touch with you and discuss DeepSurv.

yangrussell commented 5 years ago

Do I have to use Docker to run the random hyperparameter search, or is there a way to do it in a Jupyter notebook? Thanks!

X1AOX1ONG commented 4 years ago

Hello, excuse me. Can you teach me how to use Docker to find the best hyperparameters? Thanks!

jaredleekatzman / DeepSurv

hyper parameter selection error #33