Problem when defining high number of biomarkers

Hello!

I'm trying to run pySuStaIn on my data (1129 observations and 38 biomarkers). Possibly because of the high number of biomarkers, the algorithm never gets past the message "Finding ML solution to 1 cluster problem", even after 10 hours. By adding print statements to the code, I found that the expensive part is in _find_ml() in AbstractSustain.py, specifically these lines:

partial_iter = partial(self._find_ml_iteration, sustainData)
pool_output_list = self.pool.map(partial_iter, range(self.N_startpoints))
# `not` instead of `~`: ~True is -2 and ~False is -1, both truthy, so `~isinstance(...)` is always true
if not isinstance(pool_output_list, list):
    pool_output_list = list(pool_output_list)

I think the map is very slow: execution hangs on "list(pool_output_list)". Do you have any idea how to resolve this problem? I also tried generating simulated data (with 1129 observations and 38 biomarkers), but nothing changed.
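
One possible explanation for the hang showing up on the list(...) line, sketched here as an assumption rather than a confirmed diagnosis: if self.pool.map behaves like Python's built-in map (for example when no parallel pool is in use), it returns a lazy iterator immediately, and all the calls to _find_ml_iteration only run when list(...) consumes it. In that case the list(...) call is where the time is spent, not what makes it slow. The snippet below illustrates this with a hypothetical stand-in function (slow_iteration); it does not use pySuStaIn itself.

```python
import time
from functools import partial


def slow_iteration(data, seed):
    """Hypothetical stand-in for self._find_ml_iteration (not pySuStaIn code)."""
    time.sleep(0.05)  # pretend each start point takes a while
    return seed * len(data)


def time_stages(n_startpoints=4):
    data = [0.0] * 10  # placeholder for sustainData
    partial_iter = partial(slow_iteration, data)

    # Built-in map is lazy: this line returns at once without doing any work.
    t0 = time.perf_counter()
    lazy_output = map(partial_iter, range(n_startpoints))
    t_map = time.perf_counter() - t0

    # Consuming the iterator runs every iteration, so the time is spent here.
    t0 = time.perf_counter()
    results = list(lazy_output)
    t_list = time.perf_counter() - t0

    return t_map, t_list, results


t_map, t_list, results = time_stages()
print(f"map call: {t_map:.4f} s, list call: {t_list:.4f} s")
```

Timing a single call of the per-start-point function in the same way, then multiplying by the number of start points, gives a rough estimate of how long the whole step would take in a serial run.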

Thank you in advance.

ucl-pond / pySuStaIn, issue #12