Simulation-Decomposition / simdec-python

Sensitivity analysis using simulation decomposition

https://simdec.readthedocs.io

BSD 3-Clause "New" or "Revised" License

States division #24

Closed: gnopik closed this issue 3 months ago

gnopik commented 6 months ago

The default procedure does not return states with an equal number of observations. A screenshot (tested in the dashboard) and the data are attached: case1_data.csv

tupui commented 5 months ago

I actually know what is happening: NaN...

If I load the dataset, run the decomposition, and fill the NaNs in the bins, then I get an equal count for all scenarios.

I need to dig more to understand why we have NaNs. I don't remember the details there.

I have the feeling binned_statistic_dd is not doing exactly what I think it is 🤔 I know, for a SciPy maintainer... 😅

Maybe I need to calculate the bins for each axis beforehand instead. That way I am sure the binning is done on the number of samples and not on the values. Need to check that hypothesis 😮‍💨
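To illustrate the hypothesis above: with an integer `bins` argument, `scipy.stats.binned_statistic_dd` splits each axis into equal-width intervals over the value range, so skewed data can end up with very uneven counts per bin. The sketch below uses only the standard library and made-up data; the helper names (`equal_width_edges`, `equal_count_edges`, `bin_counts`) are hypothetical and are not part of simdec or SciPy. It only shows the difference between binning on values and binning on sample ranks, not how the fix was actually implemented.

```python
from bisect import bisect_right
from collections import Counter


def equal_width_edges(values, n_bins):
    """Inner edges that split the value range into n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    return [lo + i * step for i in range(1, n_bins)]


def equal_count_edges(values, n_bins):
    """Inner edges taken at sample ranks, so each bin holds about the same
    number of samples (exactly the same when there are no ties and the
    sample size is a multiple of n_bins)."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def bin_counts(values, edges):
    """Count samples per bin; a value equal to an edge goes to the upper bin."""
    counts = Counter(bisect_right(edges, v) for v in values)
    return [counts.get(i, 0) for i in range(len(edges) + 1)]


# Hypothetical skewed input: many small values and a few large ones.
data = [x ** 3 for x in range(1, 13)]  # 1, 8, 27, ..., 1728

width_counts = bin_counts(data, equal_width_edges(data, 3))
rank_counts = bin_counts(data, equal_count_edges(data, 3))

print(width_counts)  # [8, 2, 2]
print(rank_counts)   # [4, 4, 4]
```

Edges computed this way could then be passed explicitly as the bin edges for each axis instead of an integer bin count.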

gnopik commented 5 months ago

NaNs in bins - what do you mean, like this? This is the way to communicate that we want particular boundaries between states (==bins), and in this case, just for the second & third input variables out of four. If the whole thing is not supplied (at least in the MATLAB package), the state boundaries are defined automatically:

either by categories, if there are 5 or fewer unique values, or
by an equal number of observations (highlighted)
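The automatic rule described above can be sketched roughly as follows. This is a minimal illustration built from the description in this comment, not the MATLAB package's or simdec's actual code; the function name `auto_states`, the default of 3 states, and the tie handling are all assumptions.

```python
def auto_states(values, n_states=3, max_categories=5):
    """Assign a state index to every observation of one input variable.

    Assumed rule, taken from the description above:
    - at most `max_categories` unique values: one state per unique value;
    - otherwise: `n_states` states with (nearly) equal numbers of observations.
    """
    unique = sorted(set(values))
    if len(unique) <= max_categories:
        # Categorical input: each distinct value is its own state.
        index = {u: i for i, u in enumerate(unique)}
        return [index[v] for v in values]

    # Continuous input: split on sample rank, not on the values themselves.
    # Note that ranking splits tied values arbitrarily across a boundary.
    order = sorted(range(len(values)), key=values.__getitem__)
    states = [0] * len(values)
    for rank, idx in enumerate(order):
        states[idx] = rank * n_states // len(values)
    return states


print(auto_states([10, 20, 10, 30]))                # [0, 1, 0, 2]
print(auto_states([5.0, 1.0, 9.0, 3.0, 7.0, 2.0]))  # [1, 0, 2, 1, 2, 0]
```

Working on ranks guarantees state sizes that differ by at most one observation, whatever the shape of the distribution.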

tupui commented 5 months ago

Yep, we can provide bounds for the bins. I just thought that was the normal behavior. I have to check that in SciPy's code and do some poking around.

So worst case, I can do as you do and construct my own bounds; it's not hard 👍

tupui commented 5 months ago

For the NaNs I don't remember why we have them, need to check as well.

tupui commented 3 months ago

Should be fixed in a81bf18b9d2e756eb46bbc807f9159e679803c2a

gnopik commented 3 months ago

> For the NaNs I don't remember why we have them, need to check as well.

Easier to discuss over a call.