Open philastrophist opened 5 years ago
The cutoff argument is meant to allow for speed-ups in cases of many components. It is unlikely that you will need all components for every sample, so setting e.g. cutoff=3 doesn't even attempt to fit samples outside of the 3-sigma region of a component. This works very well for data that are spread out a lot, and it also helps break degeneracies for many strongly overlapping components.
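To illustrate the speed-up argument, here is a minimal 1-D sketch of what a sigma cutoff does to the number of (sample, component) pairs that need to be evaluated. It is not the pygmmis implementation: the helper name `neighborhood` and the convention that `None` or `inf` keeps every sample are assumptions made for this example only.

```python
import math

def neighborhood(data, mean, sigma, cutoff=None):
    """Indices of samples within `cutoff` standard deviations of a 1-D component.

    Hypothetical helper: cutoff=None or math.inf keeps every sample.
    """
    if cutoff is None or math.isinf(cutoff):
        return list(range(len(data)))
    return [i for i, x in enumerate(data) if abs(x - mean) <= cutoff * sigma]

# Three well-separated 1-D components, given as (mean, sigma), and a few samples.
components = [(0.0, 1.0), (10.0, 1.0), (20.0, 1.0)]
data = [-0.5, 0.2, 1.1, 9.4, 10.3, 19.8, 20.9, 50.0]

# Number of (sample, component) pairs that would be evaluated in each case.
pairs_all = sum(len(neighborhood(data, m, s, cutoff=None)) for m, s in components)
pairs_cut = sum(len(neighborhood(data, m, s, cutoff=3)) for m, s in components)
print(pairs_all, pairs_cut)
```

With spread-out data most pairs fall outside every 3-sigma region, so far fewer evaluations are needed. Note that the sample at 50.0 lies in no neighborhood at all when cutoff=3.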
I realize that I should document this parameter better; you're not the first person to ask.
Ah ok, that makes sense. cutoff=None raises errors though, so I guess for now it's easier to just set cutoff=inf for my purposes.
There shouldn't be errors with cutoff=None. Can you post the error and the traceback, please?
It's an attribute error, trying to copy a None:
ITER SAMPLES LOG_L STABLE
0 5000 -2.383 3
Traceback (most recent call last):
File "/local/home/sread/Apps/anaconda/envs/pymc3-uptodate/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-2-f61dddb7d343>", line 4, in <module>
runfile('/local/home/sread/Dropbox/pygmmis/models.py', wdir='/local/home/sread/Dropbox/pygmmis')
File "/local/home/sread/Apps/jetbrains-toolbox-1.4.2492/install_location/apps/PyCharm-P/ch-0/182.4129.5/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "/local/home/sread/Apps/jetbrains-toolbox-1.4.2492/install_location/apps/PyCharm-P/ch-0/182.4129.5/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/local/home/sread/Dropbox/pygmmis/models.py", line 122, in <module>
split_n_merge=gmm.K * (gmm.K - 1) * (gmm.K - 2) / 2)
File "/local/home/sread/Dropbox/pygmmis/pygmmis.py", line 689, in fit
U_ = [U[k].copy() for k in xrange(gmm.K)]
File "/local/home/sread/Dropbox/pygmmis/pygmmis.py", line 689, in <listcomp>
U_ = [U[k].copy() for k in xrange(gmm.K)]
AttributeError: 'NoneType' object has no attribute 'copy'
Can you post the call of pygmmis.fit as well, please?
Sure. It is here:
logL, U = pygmmis.fit(gmm, data, init_method='kmeans', w=0.01, cutoff=None, tol=1e-6, rng=rng, maxiter=1)
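The traceback shows the list comprehension calling .copy() on an entry of U that is None. A possible guard, sketched here under the assumption (not confirmed from the pygmmis source) that a None entry simply means "no neighborhood restriction for this component", would copy only the entries that are actually set. The function name `copy_neighborhoods` is hypothetical, and plain lists stand in for whatever index arrays the library stores.

```python
def copy_neighborhoods(U):
    """Copy each per-component neighborhood, passing None entries through.

    Assumption for this sketch: None means 'no restriction' for that component.
    """
    return [None if u is None else u.copy() for u in U]

# Stand-in data: component 1 has a neighborhood, components 0 and 2 do not.
U = [None, [0, 3, 4], None]
U_ = copy_neighborhoods(U)
print(U_)
```

Whether this is the right fix depends on how the rest of fit treats None entries. Until that is settled, cutoff=inf remains the workaround mentioned above.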
In performing some tests of pygmmis I have found that varying the cutoff argument drastically changes the end result of fitting, even with split-and-merge turned on (and exhaustive).
My understanding of EM is that the responsibilities r_ik are calculated for all data and all components. Why, then, does pygmmis use a cutoff to fit only to those data in the neighbourhood of each component? As far as I can understand, cutoff!=inf simply means that it will be labelling some data as not belonging to any component.
Is the reason something to do with the background, or is it just to avoid outliers?
Thanks
P.S. This code is very cool!
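To make the question about r_ik concrete, here is a small 1-D sketch of the E-step with and without a cutoff. It follows the reading given in the question (samples outside a component's cutoff region get zero responsibility for it) and is not taken from the pygmmis source. The function names and the choice to return an all-zero row for unassigned samples are assumptions of this example.

```python
import math

def normal_pdf(x, mean, sigma):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def responsibilities(data, weights, means, sigmas, cutoff=None):
    """E-step responsibilities r_ik; cutoff=None uses all data for all components."""
    rows = []
    for x in data:
        p = []
        for a, m, s in zip(weights, means, sigmas):
            # With a cutoff, a component only sees samples in its sigma region.
            inside = cutoff is None or abs(x - m) <= cutoff * s
            p.append(a * normal_pdf(x, m, s) if inside else 0.0)
        total = sum(p)
        # A sample outside every region is left unassigned (all-zero row).
        rows.append([v / total for v in p] if total > 0.0 else [0.0] * len(p))
    return rows

data = [0.5, 5.0, 30.0]
weights, means, sigmas = [0.5, 0.5], [0.0, 10.0], [1.0, 1.0]

full = responsibilities(data, weights, means, sigmas)           # standard EM
cut = responsibilities(data, weights, means, sigmas, cutoff=3)  # 3-sigma cutoff
print(full)
print(cut)
```

Without a cutoff every row sums to one, so every sample influences some component. With cutoff=3 the samples at 5.0 and 30.0 belong to no component, which is one way the fit result can change with the cutoff value.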