ARM-software / mango

Parallel Hyperparameter Tuning in Python
Apache License 2.0

Variable type issue in domain_space.py file #26

Closed swapnilsayansaha closed 3 years ago

swapnilsayansaha commented 3 years ago

System: Ubuntu 18.04, Python: 3.8.8 (via conda environment), TensorFlow: 2.4.0 (with GPU), numpy: 1.18.5. The error happens when using TensorFlow in conjunction with Mango for neural architecture search.

In https://github.com/ARM-software/mango/blob/master/mango/domain/domain_space.py, line 120:

...
                    # we need to see the index where: domain[x] appears in mapping[x]
                    index = mapping_categorical[x].index(domain[x])
...

The variable mapping_categorical[x] is a numpy array in some iterations and a list in others. When it is an array, the lookup fails with the error "'numpy.ndarray' object has no attribute 'index'". I am not sure why the variable would be a list for some iterations and a numpy array for others. As a workaround, I replaced the lines as follows:

...
                    # we need to see the index where: domain[x] appears in mapping[x]
                    if isinstance(mapping_categorical[x], list):
                        index = mapping_categorical[x].index(domain[x])
                    else:
                        # e.g. a numpy array, which has no .index() method
                        index = mapping_categorical[x].tolist().index(domain[x])
...
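
A more general variant of this workaround, as a minimal sketch: the helper below (categorical_index is a hypothetical name, not part of mango) converts whatever sequence it receives to a plain list before searching, so lists, ranges and numpy arrays all take the same path. It illustrates the idea only and is not the fix that was adopted upstream.

```python
import numpy as np


def categorical_index(options, value):
    """Return the position of `value` in `options`.

    Hypothetical helper, not part of mango.
    """
    if isinstance(options, np.ndarray):
        # tolist() converts numpy scalars to native Python values
        options = options.tolist()
    return list(options).index(value)


print(categorical_index([True, False], False))   # 1
print(categorical_index(range(2, 64), 10))       # 8
print(categorical_index(np.arange(0, 5), 3))     # 3
```

Copying to a list on every lookup costs time proportional to the number of options, which is likely acceptable for small categorical domains.
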
sandeep-iitr commented 3 years ago

I think this issue is due to a different version of numpy. Closing for now.

sandeep-iitr commented 3 years ago

This issue happens sometimes and needs more investigation. The hypothesis is: Same value from the objective is returned for most of the initial cases and this causes GP to fail.

swapnilsayansaha commented 3 years ago

The temporary fix I gave works for me. Maybe we can use it for the time being.

tihom commented 3 years ago

> This issue happens sometimes and needs more investigation. The hypothesis is: Same value from the objective is returned for most of the initial cases and this causes GP to fail.

@sandeep-iitr could you send an example where this happens? We assume that the categorical variables are of type list and therefore have an index method available, so if the param_dict given by the user contains a numpy.ndarray, the lookup would fail. However, I do not see how it can work for some iterations and not for others, so it would be good to see an example of that.

I have a fix in mind to make it more general but would like to understand the failure mode completely first.
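
The assumption described above can be checked in isolation. The following is a small sketch, independent of mango, showing that a Python list (and a range) provides an index method while a numpy array does not. The variable names are illustrative only.

```python
import numpy as np

as_list = [0.0, 0.1, 0.2]
as_range = range(2, 16)
as_array = np.array(as_list)

# list and range both implement .index(); numpy.ndarray does not
print(hasattr(as_list, "index"))    # True
print(hasattr(as_range, "index"))   # True
print(hasattr(as_array, "index"))   # False

try:
    as_array.index(0.1)
except AttributeError as exc:
    message = str(exc)
    print(message)
```

This reproduces the AttributeError only. It does not explain why the same variable would be a list in some iterations and an array in others.
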

swapnilsayansaha commented 3 years ago

Example of param_dict. The values for the keys are a mixture of lists, range objects and numpy arrays.

import numpy as np

# dil_list is built by the snippet further down
param_dict = {
    'nb_filters': range(2,64),
    'kernel_size': range(2,16),
    'dropout_rate': np.arange(0.0,0.5,0.1),
    'use_skip_connections': [True, False],
    'norm_flag': np.arange(0,1),
    'dil_list': dil_list
}

dil_list in param_dict given as:

import itertools

min_layer = 3
max_layer = 8
a_list = [1,2,4,8,16,32,64,128,256]
all_combinations = []
dil_list = []
# collect every combination of a_list, for each size from 0 to len(a_list)
for r in range(len(a_list) + 1):
    all_combinations += list(itertools.combinations(a_list, r))
all_combinations = all_combinations[1:]  # drop the empty combination
# keep only combinations with min_layer to max_layer entries
for item in all_combinations:
    if min_layer <= len(item) <= max_layer:
        dil_list.append(list(item))

I usually get the error when the score does not change between successive epochs. The temporary fix solves the problem.
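
A possible user-side precaution, sketched under the assumption that numpy arrays in param_dict are what trigger the failing lookup: convert them to plain lists before handing the dictionary to the tuner. The helper name arrays_to_lists is hypothetical and not part of mango. Whether this avoids the intermittent error has not been confirmed in this thread.

```python
import numpy as np


def arrays_to_lists(param_dict):
    """Return a copy of param_dict with numpy arrays replaced by plain lists.

    Hypothetical helper, not part of mango; other value types are kept as-is.
    """
    return {
        name: values.tolist() if isinstance(values, np.ndarray) else values
        for name, values in param_dict.items()
    }


example = {
    'dropout_rate': np.arange(0.0, 0.5, 0.1),
    'use_skip_connections': [True, False],
    'nb_filters': range(2, 64),
}
cleaned = arrays_to_lists(example)
print(type(cleaned['dropout_rate']).__name__)  # list
```

Only numpy arrays are converted. Ranges, lists and any other value types pass through untouched.
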

tihom commented 3 years ago

@swapnilsayansaha thanks for sharing the example. Since you have a workaround, you probably do not need it yourself, but we are working on a fix so that other users do not face the same issue.

We would appreciate it if you could test the fix on your local setup by installing the fix-issue-26 branch:

git clone -b fix-issue-26 https://github.com/ARM-software/mango.git
cd mango
pip install .