gelles-brandeis / tapqir

Bayesian analysis of co-localization single-molecule microscopy image data.
Apache License 2.0

COSMOS+HMM errors #354

Status: Closed (jc-brandeis closed 2 years ago)

jc-brandeis commented 2 years ago

Hi Yerdos,

I still get errors when trying to fit data using Cosmos+HMM:

Tapqir model: cosmos+hmm
Run computations on GPU?
AOI batch size: 10
Frame batch size: 512
Learning rate: 0.005
Number of iterations: 0
Save parameters in matlab format?
Priors
    background_mean_std: 1000
    background_std_std: 100
    lamda_rate: 1
    height_std: 10000
    width_min: 0.75
    width_max: 2.25
    proximity_rate: 1
    gain_std: 50

Fitting the data ...
0% 0/100000 [00:00<?, ?it/s]

TypeError                                 Traceback (most recent call last)
File ~/anaconda3/envs/tapqir-env/lib/python3.8/site-packages/tapqir/gui.py:502, in fitCmd(b, layout, out, DEFAULTS)
    500 DEFAULTS["priors"].update(layout["priors"].children[0].kwargs)
    501 with out:
--> 502     fit(
    503         **layout.kwargs,
    504         k_max=2,
    505         funsor=False,
    506         pykeops=True,
    507         no_input=True,
    508         progress_bar=tqdm_notebook,
    509     )
    511 out.clear_output(wait=True)

File ~/anaconda3/envs/tapqir-env/lib/python3.8/site-packages/tapqir/main.py:462, in fit(model, cuda, nbatch_size, fbatch_size, learning_rate, num_iter, k_max, matlab, funsor, pykeops, overwrite, no_input, progress_bar)
    460 model.init(learning_rate, nbatch_size, fbatch_size)
    461 try:
--> 462     model.run(num_iter, progress_bar=progress_bar)
    463 except CudaOutOfMemoryError:
    464     logger.exception("Failed to fit the data")

File ~/anaconda3/envs/tapqir-env/lib/python3.8/site-packages/tapqir/models/model.py:220, in Model.run(self, num_iter, progress_bar)
    218 for i in progress_bar(range(num_iter)):
    219     try:
--> 220         self.iter_loss = self.svi.step()
    221         # save a checkpoint every 200 iterations
    222         if not self.iter % 200:

File ~/anaconda3/envs/tapqir-env/lib/python3.8/site-packages/pyro/infer/svi.py:145, in SVI.step(self, *args, **kwargs)
    143 # get loss and compute gradients
    144 with poutine.trace(param_only=True) as param_capture:
--> 145     loss = self.loss_and_grads(self.model, self.guide, *args, **kwargs)
    147 params = set(
    148     site["value"].unconstrained() for site in param_capture.trace.nodes.values()
    149 )
    151 # actually perform gradient steps
    152 # torch.optim objects gets instantiated for any params that haven't been seen yet

File ~/anaconda3/envs/tapqir-env/lib/python3.8/site-packages/pyro/contrib/funsor/infer/elbo.py:20, in ELBO.loss_and_grads(self, model, guide, *args, **kwargs)
     19 def loss_and_grads(self, model, guide, *args, **kwargs):
---> 20     loss = self.differentiable_loss(model, guide, *args, **kwargs)
     21     loss.backward()
     22     return loss.item()

File ~/anaconda3/envs/tapqir-env/lib/python3.8/site-packages/pyro/contrib/funsor/infer/traceenum_elbo.py:104, in TraceMarkovEnum_ELBO.differentiable_loss(self, model, guide, *args, **kwargs)
     94 def differentiable_loss(self, model, guide, *args, **kwargs):
     95
     96     # get batched, enumerated, to_funsor-ed traces from the guide and model
     97     with plate(
     98         size=self.num_particles
     99     ) if self.num_particles > 1 else contextlib.ExitStack(), enum(
   (...)
    102         else None
    103     ):
--> 104         guide_tr = trace(guide).get_trace(*args, **kwargs)
    105         model_tr = trace(replay(model, trace=guide_tr)).get_trace(*args, **kwargs)
    107     # extract from traces all metadata that we will need to compute the elbo

File ~/anaconda3/envs/tapqir-env/lib/python3.8/site-packages/pyro/poutine/trace_messenger.py:198, in TraceHandler.get_trace(self, *args, **kwargs)
    190 def get_trace(self, *args, **kwargs):
    191     """
    192     :returns: data structure
    193     :rtype: pyro.poutine.Trace
   (...)
    196     Calls this poutine and returns its trace instead of the function's return value.
    197     """
--> 198     self(*args, **kwargs)
    199     return self.msngr.get_trace()

File ~/anaconda3/envs/tapqir-env/lib/python3.8/site-packages/pyro/poutine/trace_messenger.py:174, in TraceHandler.__call__(self, *args, **kwargs)
    170 self.msngr.trace.add_node(
    171     "_INPUT", name="_INPUT", type="args", args=args, kwargs=kwargs
    172 )
    173 try:
--> 174     ret = self.fn(*args, **kwargs)
    175 except (ValueError, RuntimeError) as e:
    176     exc_type, exc_value, traceback = sys.exc_info()

File ~/anaconda3/envs/tapqir-env/lib/python3.8/site-packages/tapqir/models/hmm.py:336, in hmm.guide(self)
    334 for fdx in frames:
    335     if self.vectorized:
--> 336         fsx, fdx = fdx
    337         fdx = torch.as_tensor(fdx)
    338         fdx = fdx.unsqueeze(-1)

TypeError: cannot unpack non-iterable int object
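
For readers following the traceback: the last frame fails because `fsx, fdx = fdx` expects every element of `frames` to be a pair, but it received a plain integer. The sketch below reproduces the same error outside Tapqir. The names `unpack_frames`, `plain_frames`, and `paired_frames` are hypothetical stand-ins for illustration, not Tapqir code.

```python
def unpack_frames(frames):
    """Unpack every element as a (fsx, fdx) pair, mirroring line 336 of hmm.py."""
    pairs = []
    for fdx in frames:
        fsx, fdx = fdx  # raises TypeError when fdx is a plain int
        pairs.append((fsx, fdx))
    return pairs


# Hypothetical inputs: range() yields ints, zip() yields 2-tuples.
plain_frames = range(3)
paired_frames = zip(["a", "b", "c"], range(3))

try:
    unpack_frames(plain_frames)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

unpacked = unpack_frames(paired_frames)
print(error_message)
print(unpacked)
```

This suggests that the vectorized branch was iterating over something that yields plain frame indices instead of pairs, although the thread does not confirm the root cause.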

ordabayevy commented 2 years ago

Hi @jc-brandeis. Can you try upgrading to release v1.1.4 and see if that fixes your issue?

jc-brandeis commented 2 years ago

v1.1.4 has different errors when running cosmos-hmm:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
File ~/anaconda3/envs/tapqir-env/lib/python3.8/site-packages/tapqir/gui.py:502, in fitCmd(b, layout, out, DEFAULTS)
    500 DEFAULTS["priors"].update(layout["priors"].children[0].kwargs)
    501 with out:
--> 502     fit(
    503         **layout.kwargs,
    504         k_max=2,
    505         funsor=False,
    506         pykeops=True,
    507         no_input=True,
    508         progress_bar=tqdm_notebook,
    509     )
    511 out.clear_output(wait=True)

File ~/anaconda3/envs/tapqir-env/lib/python3.8/site-packages/tapqir/main.py:411, in fit(model, cuda, nbatch_size, fbatch_size, learning_rate, num_iter, k_max, matlab, funsor, pykeops, overwrite, no_input, progress_bar)
    407     progress_bar = tqdm
    409 from pyroapi import pyro_backend
--> 411 from tapqir.models import models
    413 logger = logging.getLogger("tapqir")
    415 settings = {}

File ~/anaconda3/envs/tapqir-env/lib/python3.8/site-packages/tapqir/models/__init__.py:6, in <module>
      4 from tapqir.models.cosmos import cosmos
      5 from tapqir.models.crosstalk import crosstalk
----> 6 from tapqir.models.hmm import hmm
      7 from tapqir.models.model import Model
      9 __all__ = [
     10     "models",
     11     "Model",
   (...)
     14     "hmm",
     15 ]

File ~/anaconda3/envs/tapqir-env/lib/python3.8/site-packages/tapqir/models/hmm.py:24, in <module>
     22 from tapqir.distributions.util import expand_offtarget, probs_m, probs_theta
     23 from tapqir.handlers import trace, vectorized_markov
---> 24 from tapqir.infer.elbo import TraceMarkovEnum_ELBO
     25 from tapqir.models.cosmos import cosmos
     28 class hmm(cosmos):

ModuleNotFoundError: No module named 'tapqir.infer'
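
A quick way to check whether an installed release actually ships a given submodule, such as `tapqir.infer` here, is to look it up with the standard library before running a fit. This is a minimal sketch using `importlib.util.find_spec`; the helper name `has_module` is hypothetical, and the module names in the loop are taken from the traceback above.

```python
import importlib.util


def has_module(name):
    """Return True if `name` can be located in the current environment.

    find_spec imports parent packages of a dotted name, but not the
    named module itself.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "tapqir") is itself missing.
        return False


for name in ("tapqir", "tapqir.models", "tapqir.infer"):
    print(name, has_module(name))
```

If `tapqir` and `tapqir.models` are found but `tapqir.infer` is not, the installed package is likely missing that subpackage, which would be consistent with the error above.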

ordabayevy commented 2 years ago

Can you try the newest version v1.1.5?

jc-brandeis commented 2 years ago

v1.1.5 seems to be working with the following warning:

/home/jchung/anaconda3/envs/tapqir-env/lib/python3.8/site-packages/torch/distributions/gamma.py:71: UserWarning: Specified kernel cache directory could not be created! This disables kernel caching. Specified directory is /home/jchung/.cache/torch/kernels. This warning will appear only once per process. (Triggered internally at  ../aten/src/ATen/native/cuda/jit_utils.cpp:860.)
  self.rate * value - torch.lgamma(self.concentration))

ordabayevy commented 2 years ago

You can ignore this warning. I'm not sure how to suppress it.
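
One possible approach, not verified against Tapqir or this PyTorch build, is a standard-library warning filter installed before fitting. Since the message is reported as a `UserWarning`, a filter matching the start of its text may hide it. The helper name `ignore_kernel_cache_warning` is hypothetical; the pattern is copied from the warning text above.

```python
import warnings

# Pattern copied from the start of the warning message quoted above.
KERNEL_CACHE_PATTERN = r"Specified kernel cache directory could not be created"


def ignore_kernel_cache_warning():
    """Install a filter that ignores only the kernel-cache UserWarning."""
    warnings.filterwarnings(
        "ignore",
        message=KERNEL_CACHE_PATTERN,  # regex matched against the message start
        category=UserWarning,
    )


ignore_kernel_cache_warning()
```

The filter is deliberately narrow, so other `UserWarning` messages still appear. Because the warning is shown only once per process, leaving it unfiltered is also reasonable.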