hddm-devs / hddm

HDDM is a Python module that implements hierarchical Bayesian parameter estimation of drift diffusion models (via PyMC).
http://ski.clps.brown.edu/hddm_docs/

HDDMRegression.approx_map() throws key error #34

Closed: twiecki closed this issue 10 years ago

twiecki commented 11 years ago

When I tested it with a within-subject model (HDDMRegressor), I got the optimisation results, followed by this error:

Traceback (most recent call last):
  File "<pyshell#55>", line 1, in <module>
    m.find_starting_values()
  File "/Library/Python/2.7/site-packages/kabuki/hierarchical.py", line 775, in find_starting_values
    self.approximate_map()
  File "/Library/Python/2.7/site-packages/kabuki/hierarchical.py", line 835, in approximate_map
    for name, value in self.values.iteritems():
  File "/Library/Python/2.7/site-packages/kabuki/hierarchical.py", line 759, in values
    return {name: node['node'].value[()] for (name, node) in self.iter_non_observeds()}
  File "/Library/Python/2.7/site-packages/kabuki/hierarchical.py", line 759, in <dictcomp>
    return {name: node['node'].value[()] for (name, node) in self.iter_non_observeds()}
  File "/Library/Python/2.7/site-packages/pandas/core/frame.py", line 1928, in __getitem__
    return self._get_item_cache(key)
  File "/Library/Python/2.7/site-packages/pandas/core/generic.py", line 570, in _get_item_cache
    values = self._data.get(item)
  File "/Library/Python/2.7/site-packages/pandas/core/internals.py", line 1383, in get
    _, block = self._find_block(item)
  File "/Library/Python/2.7/site-packages/pandas/core/internals.py", line 1525, in _find_block
    self._check_have(item)
  File "/Library/Python/2.7/site-packages/pandas/core/internals.py", line 1532, in _check_have
    raise KeyError('no item named %s' % com.pprint_thing(item))
KeyError: u'no item named ()'

AngelosPsy commented 11 years ago

Hi Thomas,

I was looking at the issue again after updating to the development versions of kabuki and hddm. It appears that, for the regression model, the attribute 'values' is not properly defined, so the problem is not in .approx_map() per se. Specifically, for a regression model:

m = hddm.HDDMRegressor(data, "v ~ C(coninc, Treatment('TestValue'))", group_only_regressors=False)

the attribute 'values' is listed when I type

dir(m)

but an error similar to the earlier one appears when I type

m.values()

Traceback (most recent call last):
  File "", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/kabuki/hierarchical.py", line 759, in values
    return {name: node['node'].value[()] for (name, node) in self.iter_non_observeds()}
  File "/usr/local/lib/python2.7/dist-packages/kabuki/hierarchical.py", line 759, in <dictcomp>
    return {name: node['node'].value[()] for (name, node) in self.iter_non_observeds()}
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 2003, in __getitem__
    return self._get_item_cache(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 667, in _get_item_cache
    values = self._data.get(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 1655, in get
    _, block = self._find_block(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 1935, in _find_block
    self._check_have(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 1942, in _check_have
    raise KeyError('no item named %s' % com.pprint_thing(item))
KeyError: u'no item named ()'

Is there any chance that the slicing of 'values' is not properly defined? My apologies for not being much help, but I am struggling to find a solution and am not sure whether this is a problem with my code or a bug.
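
A note on the `[()]` in the traceback: indexing with an empty tuple is the usual way to pull the scalar out of a zero-dimensional NumPy array, but the frames above show the same expression ending up in pandas' `DataFrame.__getitem__`, where `()` is treated as a column label. The sketch below imitates that difference with two hypothetical stand-in classes (`ZeroDimValue` and `FrameLikeValue` are invented for illustration and are not part of kabuki, PyMC, NumPy, or pandas). It only illustrates why the same index can succeed for one kind of node value and raise `KeyError` for another; it does not claim to reproduce the library internals.

```python
class ZeroDimValue:
    """Hypothetical stand-in for a 0-d array: value[()] returns the scalar."""

    def __init__(self, scalar):
        self._scalar = scalar

    def __getitem__(self, key):
        if key == ():
            return self._scalar
        raise IndexError("too many indices for a 0-d value")


class FrameLikeValue:
    """Hypothetical stand-in for a table: every key is looked up as a column label."""

    def __init__(self, columns):
        self._columns = dict(columns)

    def __getitem__(self, key):
        if key not in self._columns:
            # Mirrors the wording of the message in the traceback above.
            raise KeyError('no item named %s' % (key,))
        return self._columns[key]


def extract(value):
    # Same indexing expression as in the dict comprehension from the traceback.
    return value[()]


scalar_result = extract(ZeroDimValue(0.5))

try:
    extract(FrameLikeValue({"v_Intercept": [0.1, 0.2]}))
    frame_error = None
except KeyError as err:
    frame_error = err
```

Here `scalar_result` holds the scalar, while the table-like value raises a `KeyError` because it has no column labelled `()`.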

Best, Angelos

twiecki commented 11 years ago

Hi,

Sorry, I'm not sure what you are trying to achieve. Why are you calling .values()?

Thomas

Thomas Wiecki PhD candidate, Brown University Quantitative Researcher, Quantopian Inc, Boston

AngelosPsy commented 11 years ago

Hi, and apologies for being so unclear. I hope this message is a bit better. My initial goal was to run multiple chains with different starting values. That is why I wanted to call find_starting_values(), which uses approximate_map(), to find good starting values. The optimization appeared to run fine, but it failed at the very end with the error above.

That is why I went to the definition of .approximate_map():

    def approximate_map(self, fall_to_simplex = True):
        m = pm.MCMC(self.nodes_db.node)
        generations = m.generations
        generations.append(self.get_observeds().node)

        for i in range(len(generations)-1, 0, -1):
            # Optimize the generation at i-1 evaluated over the generation at i
            self._partial_optimize(generations[i-1], generations[i], fall_to_simplex)

        #update map in nodes_db
        self.nodes_db['map'] = np.NaN
        for name, value in self.values.iteritems():
            try:
                self.nodes_db['map'].ix[name] = value
            # Some values can be series which we'll just ignore
            except (AttributeError, ValueError):
                pass

and it appears that the problem occurs where the for loop starts:

 for name, value in self.values.iteritems(): # HERE

That is because the .values property, and consequently .values.iteritems(), somehow does not work for the regression model. I went to the definition of values but could not figure out what was wrong there; I thought I could at least point toward a solution.
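
One detail that fits the traceback: `self.values` is evaluated once, before the first loop iteration, so an exception raised while that dictionary is being built never reaches the `try` block inside the loop, and `KeyError` is not among the exception types that block catches anyway. A minimal sketch of the control flow, using a hypothetical `Model` class and `update_map` function (neither is kabuki's code, and `.items()` replaces the Python 2 `.iteritems()`):

```python
class Model:
    """Hypothetical model whose `values` property fails while being built."""

    @property
    def values(self):
        raise KeyError("no item named ()")


def update_map(model):
    visited = []
    # model.values is evaluated here, before the loop body (and its try) runs.
    for name, value in model.values.items():
        try:
            visited.append((name, value))
        # Same exception types as in the quoted method; KeyError is not listed.
        except (AttributeError, ValueError):
            pass
    return visited


try:
    update_map(Model())
    outcome = "completed"
except KeyError:
    outcome = "KeyError escaped the loop"
```

In this sketch `outcome` ends up as `"KeyError escaped the loop"`, which matches the shape of the failure: the optimization finishes, then the error surfaces on the `for` line.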

Hope this is a bit clearer and potentially helps. If not, I am truly sorry for the confusion...

Best,

Angelos

twiecki commented 11 years ago

Hi,

What's the error that .find_starting_values() raises?

Also, to check whether .values raises an error, access it as an attribute rather than calling it as a function, i.e. m.values. What does that produce?
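
To illustrate the attribute-versus-call distinction: a Python property runs its getter on plain attribute access, so `m.values` already returns the dictionary, and `m.values()` would then try to call that dictionary. The sketch uses a hypothetical `Demo` class with made-up node names, not kabuki's model class.

```python
class Demo:
    """Hypothetical class exposing `values` as a property, not a method."""

    def __init__(self):
        self._nodes = {"a": 1.5, "v": 0.3}

    @property
    def values(self):
        # The getter runs on attribute access and returns a plain dict.
        return dict(self._nodes)


m = Demo()

# Attribute access: the getter runs and the dict comes back.
as_attribute = m.values

# Call syntax: the getter runs first, then Python tries to call the dict.
try:
    m.values()
    call_error = None
except TypeError as err:
    call_error = err
```

If the getter itself raises, both spellings fail with the getter's exception, because the getter runs before any call is attempted.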

Thomas

AngelosPsy commented 10 years ago

Hi Thomas,

My apologies for the late reply. I realized that I was running an older version of HDDM, and updating it on the work computers took time because my request had to go through the computer administrators. I have now updated both HDDM and kabuki to the most recent versions, and everything seems to be fine. I think the whole issue is solved. My apologies again.

Best,

Angelos

twiecki commented 10 years ago

Good to hear it's working. Enjoy the modeling :).

Thomas
