SimonBlanke / Hyperactive

An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models.
https://simonblanke.github.io/hyperactive-documentation
MIT License

ValueError: assignment destination is read-only #48

Closed mlittmanabbvie closed 2 years ago

mlittmanabbvie commented 2 years ago

Look into the FAQ of the readme. Can the bug be resolved by one of those solutions?
No

Describe the bug

When using joblib as the parallel distributor, if the number of processes or their size gets too big, an error is thrown: ValueError: assignment destination is read-only

This issue is described here

Code to reproduce the behavior
https://github.com/scikit-learn/scikit-learn/issues/5956

Error message from command line
ValueError: assignment destination is read-only

System information:

Additional context
The issue is not actually with Hyperactive. However, the fix for the ValueError is to add max_nbytes='50M' to the Parallel instantiation. The problem is that, when instantiating Hyperactive, there is no way to pass this argument through to joblib without changing the underlying Hyperactive package.
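
For reference, the behavior described above can be sketched with joblib alone. This is a minimal sketch, not Hyperactive code: `write_first_element` and `run` are hypothetical names, and the array size is an arbitrary choice above joblib's default `max_nbytes` threshold of '1M'. With the default threshold, large NumPy inputs are memory-mapped read-only in the worker processes, so an in-place write is expected to fail there. Raising the threshold makes joblib pickle the array instead, so each worker receives its own writable copy.

```python
import numpy as np
from joblib import Parallel, delayed


def write_first_element(arr):
    """Hypothetical worker that mutates its input in place."""
    arr[0] = 1.0
    return float(arr[0])


def run(max_nbytes):
    """Run two tasks in worker processes with the given memmapping threshold."""
    # About 4 MB of float64, which is above joblib's default '1M' threshold.
    data = np.zeros(500_000)
    tasks = [delayed(write_first_element)(data) for _ in range(2)]
    # n_jobs=2 uses worker processes; max_nbytes decides when inputs get memmapped.
    return Parallel(n_jobs=2, max_nbytes=max_nbytes)(tasks)


if __name__ == "__main__":
    try:
        run("1M")  # default threshold: workers are expected to see a read-only memmap
    except ValueError as err:
        print("default threshold:", err)

    # Higher threshold: the array is pickled, so the in-place write succeeds.
    print("raised threshold:", run("50M"))
```

Passing `max_nbytes=None` disables the memory mapping entirely. Either option trades the error for extra memory, since every worker then holds its own copy of the data.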

SimonBlanke commented 2 years ago

Hello @mlittmanabbvie,

thank you for opening this issue and providing detailed information about the problem! :-)

Could you additionally show the code that generated the error with Hyperactive? Hyperactive might already be equipped to solve this problem, but I would like to try it out beforehand.

mlittmanabbvie commented 2 years ago

Hey Simon,

I tried to come up with code (other than my own) that could replicate the issue, and I am struggling for some reason. Perhaps the cause of the issue is not what I think it is? Here is the stack trace; maybe you can figure out why.


_RemoteTraceback                          Traceback (most recent call last)
_RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/cdsw/.local/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 436, in _process_worker
    r = call_item()
  File "/home/cdsw/.local/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 288, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/home/cdsw/.local/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 595, in __call__
    return self.func(*args, **kwargs)
  File "/home/cdsw/.local/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
    return [func(*args, **kwargs)
  File "/home/cdsw/.local/lib/python3.8/site-packages/joblib/parallel.py", line 262, in <listcomp>
    return [func(*args, **kwargs)
  File "/home/cdsw/.local/lib/python3.8/site-packages/hyperactive/process.py", line 25, in process
    optimizer.search(
  File "/home/cdsw/.local/lib/python3.8/site-packages/hyperactive/optimizers/gfo_wrapper.py", line 123, in search
    self._optimizer.search(
  File "/home/cdsw/.local/lib/python3.8/site-packages/gradient_free_optimizers/search.py", line 207, in search
    self._initialization(init_pos, nth_iter)
  File "/home/cdsw/.local/lib/python3.8/site-packages/gradient_free_optimizers/times_tracker.py", line 27, in wrapper
    res = func(self, *args, **kwargs)
  File "/home/cdsw/.local/lib/python3.8/site-packages/gradient_free_optimizers/search.py", line 107, in _initialization
    score_new = self._score(init_pos)
  File "/home/cdsw/.local/lib/python3.8/site-packages/gradient_free_optimizers/times_tracker.py", line 18, in wrapper
    res = func(self, *args, **kwargs)
  File "/home/cdsw/.local/lib/python3.8/site-packages/gradient_free_optimizers/search.py", line 98, in _score
    return self.score(pos)
  File "/home/cdsw/.local/lib/python3.8/site-packages/gradient_free_optimizers/results_manager.py", line 31, in _wrapper
    results_dict = self._obj_func_results(objective_function, para)
  File "/home/cdsw/.local/lib/python3.8/site-packages/gradient_free_optimizers/results_manager.py", line 14, in _obj_func_results
    results = objective_function(para)
  File "/home/cdsw/.local/lib/python3.8/site-packages/hyperactive/optimizers/objective_function.py", line 47, in _model
    results = self.objective_function(self)
  File "/home/cdsw/ClusteringAttempthyperlagrangianscoreinonedf.py", line 503, in func_mina
    total,p = core_func_min(df,exponent,slope,freq_multiplier,c,p,ascending,discount_type,use_pca,last,patid,events_repeat_often,score_col,fin_score_column)
  File "/home/cdsw/ClusteringAttempthyperlagrangianscoreinonedf.py", line 186, in core_func_min
    fin2,davies_bouldin,silhouette_avg,calinski_harabasz = score_and_cluster(df=df,exponent=exponent,clust_num=c,freq_multiplier=freq_multiplier,slope=slope,ascending=ascending,discount_type=discount_type,use_pca = use_pca,plot = False, p = p,events_repeat_often = events_repeat_often,score_col = score_col)
  File "/home/cdsw/ClusteringAttempthyperlagrangianscoreinonedf.py", line 359, in score_and_cluster
    cluster_prepped = prep_for_cluster(df,exponent,clust_num,freq_multiplier,slope,ascending,discount_type,patid = patid,event_column=event_column, event_or_eventtrans = event_or_eventtrans, times_reweight = times_reweight, score_col = score_col)
  File "/home/cdsw/ClusteringAttempthyperlagrangianscoreinonedf.py", line 340, in prep_for_cluster
    df.loc[:,'score'+ str(times_reweight)] = df1.loc[:,'score'+ str(times_reweight)]
  File "/home/cdsw/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 692, in __setitem__
    iloc._setitem_with_indexer(indexer, value, self.name)
  File "/home/cdsw/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1635, in _setitem_with_indexer
    self._setitem_with_indexer_split_path(indexer, value, name)
  File "/home/cdsw/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1676, in _setitem_with_indexer_split_path
    self._setitem_single_column(ilocs[0], value, pi)
  File "/home/cdsw/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1817, in _setitem_single_column
    self.obj._iset_item(loc, ser)
  File "/home/cdsw/.local/lib/python3.8/site-packages/pandas/core/frame.py", line 3223, in _iset_item
    NDFrame._iset_item(self, loc, value)
  File "/home/cdsw/.local/lib/python3.8/site-packages/pandas/core/generic.py", line 3821, in _iset_item
    self._mgr.iset(loc, value)
  File "/home/cdsw/.local/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 1110, in iset
    blk.set_inplace(blk_locs, value_getitem(val_locs))
  File "/home/cdsw/.local/lib/python3.8/site-packages/pandas/core/internals/blocks.py", line 363, in set_inplace
    self.values[locs] = values
ValueError: assignment destination is read-only
"""

The above exception was the direct cause of the following exception:

ValueError Traceback (most recent call last)

in
    254 score_col = None)
    255
--> 256 results, resultscopy = main(**kwargs)
    257
    258 #results = main(**kwargs)

~/ClusteringAttempthyperlagrangianscoreinonedf.py in main(**kwargs)
    728 start = datetime.now()
    729 print(f'START CLOCK: {start}\n________________________________\n')
--> 730 df,transition_limit = dockkes_clustering(**kwargs)
    731 print(type(df_copy))
    732 df_copy = df.copy(deep = True)

~/ClusteringAttempthyperlagrangianscoreinonedf.py in dockkes_clustering(df, )
    709 else:
    710 tried_everything = False
--> 711 prev = perform_clustering(df,

~/ClusteringAttempthyperlagrangianscoreinonedf.py in perform_clustering(df, )
    915 return
    916
--> 917 finp = multiprocess_clusters(df,ascending,,

~/ClusteringAttempthyperlagrangianscoreinonedf.py in multiprocess_clusters(df, )
    604 print(f'\nBEST NUMBER OF CLUSTERS = {finp.max_cluster} Clusters:\n----------------------------\nOptimizing now...\n')
    605
--> 606 return optimize(df)
    607
    608

~/ClusteringAttempthyperlagrangianscoreinonedf.py in optimize(df, )
    527
--> 528 h.run()
    529
    530 print('THE BEST PARAMETERS:\n',{k:v for k,v in h.best_para(func_mina).items() if k!= 'df'})

~/.local/lib/python3.8/site-packages/hyperactive/hyperactive.py in run(self, max_time, _test_st_backend)
    178 progress_board.open_dashboard()
    179
--> 180 self.results_list = run_search(
    181 self.process_infos, self.distribution, self.n_processes
    182 )

~/.local/lib/python3.8/site-packages/hyperactive/run_search.py in run_search(search_processes_infos, distribution, n_processes)
     49 (distribution, process_func), dist_paras = _get_distribution(distribution)
     50
---> 51 results_list = distribution(
     52 process_func, process_infos, n_processes, **dist_paras
     53 )

~/.local/lib/python3.8/site-packages/hyperactive/distribution.py in joblib_wrapper(process_func, search_processes_paras, n_processes, **kwargs)
     32 def joblib_wrapper(process_func, search_processes_paras, n_processes, **kwargs):
     33 jobs = [delayed(process_func)(**info_dict) for info_dict in search_processes_paras]
---> 34 results = Parallel(n_jobs=n_processes, **kwargs)(jobs)
     35
     36 return results

~/.local/lib/python3.8/site-packages/joblib/parallel.py in __call__(self, iterable)
   1054
   1055 with self._backend.retrieval_context():
-> 1056 self.retrieve()
   1057 # Make sure that we get a last message telling us we are done
   1058 elapsed_time = time.time() - self._start_time

~/.local/lib/python3.8/site-packages/joblib/parallel.py in retrieve(self)
    933 try:
    934 if getattr(self._backend, 'supports_timeout', False):
--> 935 self._output.extend(job.get(timeout=self.timeout))
    936 else:
    937 self._output.extend(job.get())

~/.local/lib/python3.8/site-packages/joblib/_parallel_backends.py in wrap_future_result(future, timeout)
    540 AsyncResults.get from multiprocessing."""
    541 try:
--> 542 return future.result(timeout=timeout)
    543 except CfTimeoutError as e:
    544 raise TimeoutError from e

/usr/local/lib/python3.8/concurrent/futures/_base.py in result(self, timeout)
    442 raise CancelledError()
    443 elif self._state == FINISHED:
--> 444 return self.__get_result()
    445 else:
    446 raise TimeoutError()

/usr/local/lib/python3.8/concurrent/futures/_base.py in __get_result(self)
    387 if self._exception:
    388 try:
--> 389 raise self._exception
    390 finally:
    391 # Break a reference cycle with the exception in self._exception

ValueError: assignment destination is read-only

mlittmanabbvie commented 2 years ago

Doesn't seem to be an issue with Hyperactive. Apologies!
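
Since the stack trace ends in an in-place write inside the objective function (`self.values[locs] = values`), one way to sidestep the error without touching joblib settings is to work on a copy of the input. The following is a minimal sketch using NumPy only; `add_score` is a hypothetical stand-in for the user's function, and the read-only flag is set by hand to mimic the kind of buffer a worker process may receive.

```python
import numpy as np


def add_score(values):
    """Hypothetical objective-function step that avoids writing into its input."""
    local = np.array(values, copy=True)  # a fresh copy is writable even if `values` is not
    local[0] = 42.0
    return local


shared = np.zeros(4)
shared.setflags(write=False)  # mimic a read-only buffer handed to a worker

try:
    shared[0] = 42.0  # writing in place fails on a read-only array
except ValueError as err:
    print(err)

result = add_score(shared)  # works, and leaves `shared` untouched
print(result[0], shared[0])
```

For a DataFrame, the analogous step would presumably be calling `df.copy(deep=True)` at the top of the objective function before assigning new columns, at the cost of one extra copy per evaluation.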