lnccbrown / HSSM

Development of HSSM package

why hssm.HSSM() model can't handle response variable with range 0 and 1 #539

Closed Hellobamboobamboo closed 3 months ago

Hellobamboobamboo commented 3 months ago


HSSM version 0.2.3

I was running my model like this:

    basic_model = hssm.HSSM(
        model="ddm",  # Specify the model type as drift diffusion model (DDM)
        data=data,
        include=[
            {
                "name": "v",
                "formula": "v ~ current + diff + condition + status + gender",
                "prior": {
                    "Intercept": {"name": "Normal", "mu": 0.0, "sigma": 1.0},
                    "current": {"name": "Normal", "mu": 0.0, "sigma": 1.0},
                    "diff": {"name": "Normal", "mu": 0.0, "sigma": 1.0},
                    "condition": {"name": "Normal", "mu": 0.0, "sigma": 1.0},
                    "status": {"name": "Normal", "mu": 0.0, "sigma": 1.0},
                    "gender": {"name": "Normal", "mu": 0.0, "sigma": 1.0},
                },
                "link": "identity",
            },
            {
                "name": "a",
                "formula": "a ~ current + diff + condition + status + gender",
                "prior": {
                    "Intercept": {"name": "Normal", "mu": 0.5, "sigma": 1.0},
                    "current": {"name": "Normal", "mu": 0.0, "sigma": 1.0},
                    "diff": {"name": "Normal", "mu": 0.0, "sigma": 1.0},
                    "condition": {"name": "Normal", "mu": 0.0, "sigma": 1.0},
                    "status": {"name": "Normal", "mu": 0.0, "sigma": 1.0},
                    "gender": {"name": "Normal", "mu": 0.0, "sigma": 1.0},
                },
                "link": "identity",
            },
        ],
    )

Then I get an error like this:

    File ~\AppData\Local\anaconda3\envs\hssm_env\Lib\site-packages\hssm\hssm.py:383, in HSSM.__init__(self, data, model, choices, include, model_config, loglik, loglik_kind, p_outlier, lapse, hierarchical, link_settings, prior_settings, extra_namespace, missing_data, deadline, loglik_missing_data, process_initvals, **kwargs)
        380 self.p_outlier = self.params.get("p_outlier")
        381 self.lapse = lapse if self.has_lapse else None
    --> 383 self._post_check_data_sanity()
        385 self.model_distribution = self._make_model_distribution()
        387 self.family = make_family(
        388     self.model_distribution,
        389     self.list_params,
        390     self.link,
        391     self._parent,
        392 )

    File ~\AppData\Local\anaconda3\envs\hssm_env\Lib\site-packages\hssm\hssm.py:1683, in HSSM._post_check_data_sanity(self)
       1679 if np.any(~np.isin(unique_responses, self.choices)):
       1680     invalid_responses = sorted(
       1681         unique_responses[~np.isin(unique_responses, self.choices)]
       1682     )
    -> 1683     raise ValueError(
       1684         f"Invalid responses found in your dataset: {invalid_responses}"
       1685     )
       1687 if len(unique_responses) != self.n_choices:
       1688     missing_responses = sorted(np.setdiff1d(self.choices, unique_responses))

ValueError: Invalid responses found in your dataset: [0]

What values should my response variable take? It is usually coded as just 0 and 1. Also, do I still need to manually flip the RTs to negative for plotting, as in HDDM? HSSM doesn't accept negative values as valid this time. What values can it take?

digicosmos86 commented 3 months ago

Hi @Hellobamboobamboo

By convention, we only accept responses coded as -1 and 1 in the binary case, to distinguish it from cases with more than two responses, where we accept 0, 1, 2, and so on. Once the coding is changed, I think you'll be able to proceed to plotting.
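For anyone landing here with the same error, the recoding can be sketched with pandas as below. This is a minimal sketch, not part of HSSM: `recode_binary_response` is a hypothetical helper, the column names `rt` and `response` and the toy data are assumptions, and mapping 0 to -1 (rather than 1 to -1) is a choice you should check against how your own data is coded.

```python
import pandas as pd


def recode_binary_response(data: pd.DataFrame, column: str = "response") -> pd.DataFrame:
    """Return a copy of `data` with 0/1 responses recoded to -1/1.

    Hypothetical helper for illustration; the default column name is an assumption.
    """
    recoded = data.copy()
    # Map 0 -> -1 and keep 1 as 1; any other value becomes NaN.
    recoded[column] = recoded[column].map({0: -1, 1: 1})
    if recoded[column].isna().any():
        raise ValueError(f"Column {column!r} contains values other than 0 and 1")
    recoded[column] = recoded[column].astype(int)
    return recoded


# Toy data standing in for the real dataset (values are made up).
data = pd.DataFrame(
    {
        "rt": [0.52, 0.71, 0.63, 0.48],
        "response": [0, 1, 1, 0],
    }
)

recoded = recode_binary_response(data)
print(sorted(recoded["response"].unique()))
```

The helper only touches the response column and leaves the reaction times as they are. The recoded frame can then be passed as `data=` in the `hssm.HSSM(...)` call from the original post.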

I am also going to move this to Discussions, since this is not a bug.