Model fitting error: " Can't pickle local object 'make_subject_model.<locals>.lda_logp' "

pdmadeira commented 3 years ago

Hey, I am working on my MSc thesis and found your toolbox, which looks very useful for my study. First of all, congratulations on developing the tool. I built a simple pipeline (following your documentation) to apply it to my data. Unfortunately, I get the following error when I try to fit the model to the data.

~\Documents\GitHub\MScMonkeyGaze\apply_glambox.py in run_glambox()
    201 
    202     # perform MCMC sampling
--> 203     model.fit()
    204     print('DEBUG | 4')
    205 

c:\users\boss\anaconda3\envs\env37\lib\site-packages\glambox\_model\model.py in fit(self, method, **kwargs)
    289             to GLAM model object
    290         """
--> 291         self.trace = fit_models(self.model, method=method, **kwargs)
    292         self.estimates = get_estimates(self)
    293 

c:\users\boss\anaconda3\envs\env37\lib\site-packages\glambox\_model\fit.py in fit_models(models, method, verbose, draws, n_vi, step, **kwargs)
     72                 else:
     73                     step_method = None
---> 74                 trace = pm.sample(draws=draws, step=step_method, **kwargs)
     75             elif method == 'VI':
     76                 vi_est = pm.fit(n=n_vi, **kwargs)

c:\users\boss\anaconda3\envs\env37\lib\site-packages\pymc3\sampling.py in sample(draws, step, init, n_init, start, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, **kwargs)
    435             _print_step_hierarchy(step)
    436             try:
--> 437                 trace = _mp_sample(**sample_args)
    438             except pickle.PickleError:
    439                 _log.warning("Could not pickle model, sampling singlethreaded.")

c:\users\boss\anaconda3\envs\env37\lib\site-packages\pymc3\sampling.py in _mp_sample(draws, tune, step, chains, cores, chain, random_seed, start, progressbar, trace, model, **kwargs)
    963     sampler = ps.ParallelSampler(
    964         draws, tune, chains, cores, random_seed, start, step,
--> 965         chain, progressbar)
    966     try:
    967         try:

c:\users\boss\anaconda3\envs\env37\lib\site-packages\pymc3\parallel_sampling.py in __init__(self, draws, tune, chains, cores, seeds, start_points, step_method, start_chain_num, progressbar)
    359                 draws, tune, step_method, chain + start_chain_num, seed, start
    360             )
--> 361             for chain, seed, start in zip(range(chains), seeds, start_points)
    362         ]
    363 

c:\users\boss\anaconda3\envs\env37\lib\site-packages\pymc3\parallel_sampling.py in <listcomp>(.0)
    359                 draws, tune, step_method, chain + start_chain_num, seed, start
    360             )
--> 361             for chain, seed, start in zip(range(chains), seeds, start_points)
    362         ]
    363 

c:\users\boss\anaconda3\envs\env37\lib\site-packages\pymc3\parallel_sampling.py in __init__(self, draws, tune, step_method, chain, seed, start)
    240         # We fork right away, so that the main process can start tqdm threads
    241         try:
--> 242             self._process.start()
    243         except IOError as e:
    244             # Something may have gone wrong during the fork / spawn

c:\users\boss\anaconda3\envs\env37\lib\multiprocessing\process.py in start(self)
    110                'daemonic processes are not allowed to have children'
    111         _cleanup()
--> 112         self._popen = self._Popen(self)
    113         self._sentinel = self._popen.sentinel
    114         # Avoid a refcycle if the target function holds an indirect

c:\users\boss\anaconda3\envs\env37\lib\multiprocessing\context.py in _Popen(process_obj)
    221     @staticmethod
    222     def _Popen(process_obj):
--> 223         return _default_context.get_context().Process._Popen(process_obj)
    224 
    225 class DefaultContext(BaseContext):

c:\users\boss\anaconda3\envs\env37\lib\multiprocessing\context.py in _Popen(process_obj)
    320         def _Popen(process_obj):
    321             from .popen_spawn_win32 import Popen
--> 322             return Popen(process_obj)
    323 
    324     class SpawnContext(BaseContext):

c:\users\boss\anaconda3\envs\env37\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
     87             try:
     88                 reduction.dump(prep_data, to_child)
---> 89                 reduction.dump(process_obj, to_child)
     90             finally:
     91                 set_spawning_popen(None)

c:\users\boss\anaconda3\envs\env37\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
     58 def dump(obj, file, protocol=None):
     59     '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60     ForkingPickler(file, protocol).dump(obj)
     61 
     62 #

AttributeError: Can't pickle local object 'make_subject_model.<locals>.lda_logp'

My code is identical to what you show in the "Quick Start" section, but with my data as input.

Thanks in advance for all the help!

athms commented 3 years ago

Hey, thank you for your interest in GLAMbox. Could you tell us which versions of Python, PyMC, and Theano you are using? You can find these with pip list. Thank you! Armin

pdmadeira commented 3 years ago

Thanks for the answer! The versions are:

python: 3.7.11
pymc3: 3.7
Theano: 1.0.4

moltaire commented 3 years ago

Hi @pdmadeira! Thank you for reporting this and apologies for the delay. These pickling errors are sometimes associated with running the model on multiple cores. Could you try running the fit with the number of cores set to 1 (model.fit(cores=1)) and see if this helps?
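
For illustration only, here is a minimal standard-library sketch of the kind of failure shown in the traceback above. On Windows, multiprocessing starts worker processes with the spawn method and has to pickle what it sends to them, and pickle cannot serialize a function that is defined inside another function. The names make_logp, local_logp, module_level_logp, and can_pickle are hypothetical and are not part of GLAMbox or PyMC3.

```python
import pickle


def make_logp():
    """Return a function defined inside another function (a "local object")."""
    def local_logp(x):
        return -0.5 * x ** 2
    return local_logp


def module_level_logp(x):
    """Same computation, but defined at module level."""
    return -0.5 * x ** 2


def can_pickle(obj):
    """Report whether the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        # Local functions cannot be looked up by name, so pickling them fails
        return False


local_ok = can_pickle(make_logp())
module_ok = can_pickle(module_level_logp)
print("local function picklable:", local_ok)
print("module-level function picklable:", module_ok)
```

With a single core, no worker process is started, so nothing needs to be pickled, which is presumably why cores=1 can sidestep the error.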

pdmadeira commented 3 years ago

Hey @moltaire, thanks for your suggestion! The algorithm runs fine with single-core model fitting. It just takes a lot of time: it has been running for 75 hours now. Please tell me whether that is expected, and whether I can adjust something to make it faster.

athms commented 3 years ago

Hey @pdmadeira, to make sure that the original error is not specific to your dataset, could you try simulating data (as described here: https://glambox.readthedocs.io/en/latest/examples/Example_1_Individual_estimation.html) and check whether you get the same pickling error when fitting these data with multiple cores? Regarding your other question, you can generally shorten the sampling time by reducing the number of MCMC tuning and draw steps during model fitting (see the documentation for model.fit). However, if you do this, make sure that your parameter traces converge within the number of steps that you sample (see, e.g., the convergence_check function in https://glambox.readthedocs.io/en/latest/examples/Example_1_Individual_estimation.html). Hope this helps!
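
As a rough sketch of the suggestion above, the helper below calls fit with fewer tuning and draw steps on a single core. It assumes, based on the traceback, that model.fit forwards its keyword arguments to pm.sample, and that method="MCMC" selects the sampling branch. The helper name fit_single_core and the default values are hypothetical placeholders, not recommendations.

```python
def fit_single_core(model, draws=2000, tune=2000, chains=2):
    """Fit a GLAM-like model with MCMC on one core, using shorter sampling runs.

    The values for draws, tune, and chains are placeholders, not recommendations.
    """
    # cores=1 avoids starting worker processes, so nothing has to be pickled
    model.fit(method="MCMC", draws=draws, tune=tune, chains=chains, cores=1)
    return model


# Hypothetical usage (requires glambox and a prepared data frame `data`):
# import glambox as gb
# model = gb.GLAM(data=data)
# model.make_model(kind="individual")
# fit_single_core(model, draws=2000, tune=2000)
```

Shorter runs only help if the traces still converge, so any reduced setting would need to be checked before the estimates are used.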

pdmadeira commented 3 years ago

@athms thanks for your intel. I just tried the input data you use in Example_1. The error still occurs with multi-core fitting; fitting only works with the single-core approach.

athms commented 3 years ago

Hi @pdmadeira, great! Thank you for running these tests for us. It is likely that this specific pickling error in multi-core fitting results from the combination of the given Theano and PyMC versions with Windows. We are looking into this and hope to fix it soon. Thanks for raising this issue. Is everything else working fine for you?

pdmadeira commented 3 years ago

I'm exploring the examples and the tuning of the algorithm now, but if you could spare some advice: for a dataset with 2 subjects and 13k/14k samples for each one, which values do you suggest for the MCMC tune and draws parameters?

glamlab / glambox

Model fitting error: " Can't pickle local object 'make_subject_model.<locals>.lda_logp' " #26