aims-umich / neorl

NeuroEvolution Optimization with Reinforcement Learning
MIT License

Use of pathos for multiprocessing within python -- unknown error #41

Open pseur opened 1 year ago

pseur commented 1 year ago

joblib does not preserve global variables during multiprocessing, and my environment relies on some. As a workaround, I fall back to pathos, which did not seem to have this issue. See for instance within ES:

        core_list=[]
        for key in pop:
            core_list.append(pop[key][0])

        #with joblib.Parallel(n_jobs=self.ncores) as parallel:
        #    fitness=parallel(joblib.delayed(self.fit_worker)(item) for item in core_list)
        try:
            with joblib.Parallel(n_jobs=self.ncores) as parallel:#, prefer="threads" , require='sharedmem'
                fitness=parallel(joblib.delayed(self.fit_worker)(item) for item in core_list)
        except:
            p=pathos.multiprocessing.Pool(processes = self.ncores)
            fitness = p.map(self.fit_worker, core_list)
            p.close()
            p.join()  

However, after some number of samples has been generated (sometimes around 10,000, sometimes around 30,000), the optimization stops running without raising any error. This happened with ES, SA, TS, and the multi-objective variants I implemented.

My workaround right now is to re-initialize all global variables before each candidate evaluation, but this costs some computing time.
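
One possible way to cut that cost, sketched here with the standard-library `multiprocessing.Pool` rather than joblib or pathos, is to set the globals once per worker process through the pool's `initializer` argument instead of once per candidate. The names `_ENV`, `init_worker`, `evaluate`, and the `settings` dictionary are hypothetical placeholders, and `fit_worker` below is a stand-in for the real fitness function, not NEORL's. This has not been tested against the hang described above.

```python
import multiprocessing as mp

# Hypothetical module-level state standing in for the environment's globals.
_ENV = None


def init_worker(settings):
    """Runs once in each worker process when the pool starts."""
    global _ENV
    # Rebuild whatever global state the environment needs (assumed cheap to copy here).
    _ENV = dict(settings)


def fit_worker(x):
    """Stand-in fitness function that reads the per-process global."""
    return sum(xi ** 2 for xi in x) + _ENV["offset"]


def evaluate(core_list, ncores, settings):
    """Evaluate all candidates, initializing globals once per worker."""
    # initializer and initargs are standard multiprocessing.Pool arguments.
    with mp.Pool(processes=ncores,
                 initializer=init_worker,
                 initargs=(settings,)) as pool:
        return pool.map(fit_worker, core_list)
```

In a script, `evaluate(core_list, ncores, settings)` would be called under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. The initializer runs once per worker, so the setup cost scales with the number of cores rather than the number of candidates.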