ARM-software / mango

Parallel Hyperparameter Tuning in Python

Python int too large to convert to C long #109

Closed AJFeng closed 10 months ago

AJFeng commented 10 months ago

param_dict = {"s": range(50,600), "Ts": range(20,100), "Ts2": range(20,100), "c":range(1,100), "n_hidden1":range(100,1000), "n_hidden2":range(10,100), "n_hidden3":range(5,30), "selected_range": [0.5,0.6,0.7,0.8,0.9]}

conf_Dict = dict() conf_Dict['batch_size'] = 1 conf_Dict['num_iteration'] = 100 conf_Dict['domain_size'] = 50000 conf_Dict['initial_random'] = 1

@scheduler.parallel(n_jobs=2) def objective(s,Ts,Ts2,c,n_hidden1,n_hidden2,n_hidden3,selected_range):

global X, Y, N, p

f1s=[]
accs=[]
all_common_numbers=[]
all_loss=[]

return random.randint(1, 100000)
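For completeness, the Tuner construction is not shown in the report; with mango's API it would presumably be:

tuner = Tuner(param_dict, objective, conf_Dict)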

Error raised by mango_results = tuner.minimize():

File ~\AppData\Local\anaconda3\envs\Ecoli\Lib\site-packages\mango\tuner.py:160 in minimize
    return self.run()

File ~\AppData\Local\anaconda3\envs\Ecoli\Lib\site-packages\mango\tuner.py:147 in run
    self.results = self.runBayesianOptimizer()

File ~\AppData\Local\anaconda3\envs\Ecoli\Lib\site-packages\mango\tuner.py:208 in runBayesianOptimizer
    X_list, Y_list, X_tried = self.run_initial()

File ~\AppData\Local\anaconda3\envs\Ecoli\Lib\site-packages\mango\domain\domain_space.py:45 in get_random_sample
    return self._get_random_sample(size)

File ~\AppData\Local\anaconda3\envs\Ecoli\Lib\site-packages\mango\domain\domain_space.py:63 in _get_random_sample
    domain_list = list(BatchParameterSampler(self.param_dict, n_iter=size))

File ~\AppData\Local\anaconda3\envs\Ecoli\Lib\site-packages\mango\domain\batch_parameter_sampler.py:49 in __iter__
    for i in sample_without_replacement(grid_size, n_iter,

File sklearn\utils\_random.pyx:218 in sklearn.utils._random.sample_without_replacement

OverflowError: Python int too large to convert to C long
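A note on the likely root cause (my reading of the traceback, not stated in the thread): the sampler computes the full Cartesian grid size and passes it to scikit-learn's sample_without_replacement, which casts it to a C long. On Windows a C long is 32-bit, so a grid size above 2**31 - 1 overflows there, even though it would fit in the 64-bit C long used on Linux, which may be why the Colab run below works. A quick sketch of the magnitudes:

import math

# Per-dimension sizes of the original param_dict
dim_sizes = [550, 80, 80, 99, 900, 90, 25, 5]
grid_size = math.prod(dim_sizes)

print(f"{grid_size:.2e}")     # ~3.53e+15 candidate configurations
print(grid_size > 2**31 - 1)  # True: overflows a 32-bit C long (Windows)
print(grid_size > 2**63 - 1)  # False: fits a 64-bit C long (Linux/Colab)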

sandeep-iitr commented 10 months ago

Hi, I was able to start training, as you can see in the Colab notebook below: https://colab.research.google.com/drive/1rxwzXMPIFHEcOU6dz_gev4rmiQhcrHZC?usp=sharing

But my suggestion would be to reduce the search-space complexity. This is an extremely large space: 550 × 80 × 80 × 99 × 900 × 90 × 25 × 5 ≈ 3.5 × 10^15 values, which is far too large to sample from and to get good results with.

Something like the following:

param_dict = {"s": range(50,600, 30),
"Ts": range(20,100, 10),
"Ts2": range(20,100, 10),
"c":range(1,100, 10),
"n_hidden1":range(100,1000, 30),
"n_hidden2":range(10,100, 10),
"n_hidden3":range(5,30, 5),
"selected_range": [0.5,0.6,0.7,0.8,0.9]}

You can start from a small space and then fine-tune the model near the best parameters found previously, as in the sketch below.
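As an illustration of that refinement step (the window widths here are my own choice, not from the thread), take the best parameters from the coarse run and search a narrower range around each:

# Hypothetical second stage: search a tight window around the coarse optimum
best = results['best_params']
refined_param_dict = {
    "s": range(max(50, best['s'] - 30), min(600, best['s'] + 30)),
    "Ts": range(max(20, best['Ts'] - 10), min(100, best['Ts'] + 10)),
    "Ts2": range(max(20, best['Ts2'] - 10), min(100, best['Ts2'] + 10)),
    "c": range(max(1, best['c'] - 10), min(100, best['c'] + 10)),
    "n_hidden1": range(max(100, best['n_hidden1'] - 30), min(1000, best['n_hidden1'] + 30)),
    "n_hidden2": range(max(10, best['n_hidden2'] - 10), min(100, best['n_hidden2'] + 10)),
    "n_hidden3": range(max(5, best['n_hidden3'] - 5), min(30, best['n_hidden3'] + 5)),
    "selected_range": [best['selected_range']],
}
tuner2 = Tuner(refined_param_dict, objective, conf_dict)
refined_results = tuner2.minimize()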

AJFeng commented 10 months ago

Thank you so much for the quick response. Yes, this one works and really makes sense to me.