janezd / baycomp

MIT License
71 stars · 15 forks

ValueError: could not broadcast input array from shape (2) into shape (3) #6

Open jorjasso opened 4 years ago

jorjasso commented 4 years ago

Hi,

I am running the hierarchical-model version of two_on_multiple on two cross-validation matrices x and y. I have run the test many times without problems, but for some matrices I get a ValueError.

Reproducing code:


```python
import re
from ast import literal_eval

import numpy as np

from baycomp import two_on_multiple

x="""[[0.82080925 0.95953757 0.89595376 0.97109827 0.90116279]
 [0.90667808 0.91780822 0.90239726 0.89974293 0.9143102 ]
 [0.88695652 0.89565217 0.92982456 0.92982456 0.92105263]
 [0.84782609 0.86956522 0.7826087  0.84444444 0.91111111]
 [0.8        0.8        0.7        0.88333333 0.86666667]
 [0.92222222 0.92222222 0.93333333 0.95555556 0.94444444]
 [1.         0.96666667 0.9        0.96666667 0.93333333]
 [0.97369068 0.98180477 0.9771274  0.97909493 0.97442204]
 [1.         0.97777778 0.97777778 0.97777778 0.97777778]
 [0.99213287 0.98688811 0.98951049 0.99125874 0.98776224]
 [0.99388112 0.99344406 0.99562937 0.99606643 0.99519231]
 [0.93337731 0.91029024 0.94327177 0.88778878 0.88976898]
 [0.77       0.765      0.765      0.755      0.765     ]
 [0.75250836 0.77725753 0.79451138 0.74364123 0.76639893]
 [0.94782609 0.94891304 0.95271739 0.95541055 0.94453507]
 [0.95       0.96428571 0.97142857 0.9547619  0.96428571]
 [0.82222222 0.86666667 0.81818182 0.88636364 0.88636364]
 [0.81176471 0.90588235 0.89411765 0.81176471 0.82142857]
 [0.99361314 0.99635036 0.99178832 0.99726027 1.        ]
 [0.97368421 0.97974342 0.97501688 0.96623903 0.98109386]
 [0.77142857 0.8        0.79885057 0.82758621 0.7816092 ]
 [0.95804196 0.98601399 0.97902098 0.97902098 0.94366197]
 [0.95348837 0.95348837 0.95348837 0.97619048 1.        ]
 [0.72294372 0.64718615 0.69264069 0.72017354 0.72234273]
 [0.82025678 0.78601997 0.81597718 0.82881598 0.80571429]] """
y="""[[0.9132948  0.87283237 0.87861272 0.83815029 0.84883721]
 [0.89554795 0.8989726  0.89297945 0.88860326 0.90488432]
 [0.92173913 0.88695652 0.9122807  0.93859649 0.9122807 ]
 [0.91304348 0.82608696 0.89130435 0.84444444 0.93333333]
 [0.81666667 0.85       0.78333333 0.86666667 0.88333333]
 [0.94444444 0.95555556 0.91111111 0.91111111 0.93333333]
 [1.         0.96666667 0.9        0.96666667 0.93333333]
 [0.95279075 0.95918367 0.95696016 0.94564683 0.96261682]
 [0.98888889 0.98888889 0.97777778 0.98888889 0.98888889]
 [0.97902098 0.96853147 0.9798951  0.98164336 0.9798951 ]
 [0.99038462 0.98339161 0.98907343 0.9916958  0.9881993 ]
 [0.91358839 0.92612137 0.93139842 0.90825083 0.91881188]
 [0.775      0.805      0.85       0.755      0.81      ]
 [0.94715719 0.96120401 0.96385542 0.95046854 0.95448461]
 [0.92880435 0.93858696 0.93695652 0.94888526 0.94127243]
 [0.98571429 0.96666667 0.97619048 0.94285714 0.98571429]
 [0.82222222 0.93333333 0.90909091 0.79545455 0.86363636]
 [0.91764706 0.92941176 0.91764706 0.91764706 0.91666667]
 [0.97718978 0.97718978 0.96624088 0.97077626 0.9716895 ]
 [0.97300945 0.97366644 0.96556381 0.96961512 0.972316  ]
 [0.74857143 0.73714286 0.7183908  0.72413793 0.75287356]
 [0.94405594 0.97202797 0.97902098 0.95104895 0.95774648]
 [0.90697674 1.         1.         1.         1.        ]
 [0.77056277 0.71212121 0.73809524 0.72885033 0.7462039 ]
 [0.81740371 0.77746077 0.80884451 0.82738944 0.83571429]] """
x = re.sub(r"([^[])\s+([^]])", r"\1, \2", x)
y = re.sub(r"([^[])\s+([^]])", r"\1, \2", y)
x = np.array(literal_eval(x))
y = np.array(literal_eval(y))

print(x)
print(y)

probs = two_on_multiple(x, y, rope=0, plot=False, names=['x', 'y'])
```

# Output

```
ValueError                                Traceback (most recent call last)
<ipython-input-15-71d98cb2e0f9> in <module>
     60 print(y)
     61 
---> 62 probs= two_on_multiple(x, y, rope=0, plot=False, names=['x', 'y'])

~/anaconda3/envs/bayesian/lib/python3.7/site-packages/baycomp/multiple.py in two_on_multiple(x, y, rope, runs, names, plot, **kwargs)
    485     else:
    486         test = SignedRankTest
--> 487     return call_shortcut(test, x, y, rope, names=names, plot=plot, **kwargs)

~/anaconda3/envs/bayesian/lib/python3.7/site-packages/baycomp/utils.py in call_shortcut(test, x, y, rope, plot, names, *args, **kwargs)
     18 
     19 def call_shortcut(test, x, y, rope, *args, plot=False, names=None, **kwargs):
---> 20     sample = test(x, y, rope, *args, **kwargs)
     21     if plot:
     22         return sample.probs(), sample.plot(names)

~/anaconda3/envs/bayesian/lib/python3.7/site-packages/baycomp/multiple.py in __new__(cls, x, y, rope, nsamples, **kwargs)
    151 
    152     def __new__(cls, x, y, rope=0, *, nsamples=50000, **kwargs):
--> 153         return Posterior(cls.sample(x, y, rope, nsamples=nsamples, **kwargs))
    154 
    155     @classmethod

~/anaconda3/envs/bayesian/lib/python3.7/site-packages/baycomp/multiple.py in sample(cls, x, y, rope, runs, lower_alpha, upper_alpha, lower_beta, upper_beta, upper_sigma, chains, nsamples)
    443 
    444         rope, diff = scaled_data(x, y, rope)
--> 445         mu, stdh, nu = run_stan(diff)
    446         samples = np.empty((len(nu), 3))
    447         for mui, std, df, sample_row in zip(mu, stdh, nu, samples):

~/anaconda3/envs/bayesian/lib/python3.7/site-packages/baycomp/multiple.py in run_stan(diff)
    426 
    427         def run_stan(diff):
--> 428             stan_data = prepare_stan_data(diff)
    429 
    430             # check if the last pickled result can be reused

~/anaconda3/envs/bayesian/lib/python3.7/site-packages/baycomp/multiple.py in prepare_stan_data(diff)
    401                 if np.var(sample) == 0:
    402                     sample[:nscores_2] = np.random.uniform(-rope, rope, nscores_2)
--> 403                     sample[nscores_2:] = -sample[:nscores_2]
    404 
    405             std_within = np.mean(np.std(diff, axis=1))  # may be different from std_diff!

ValueError: could not broadcast input array from shape (2) into shape (3)
```
jorjasso commented 4 years ago

I think it is related to a zero-variance problem. Would adding a small value to the indices causing the problem work? Something like this:

```python
def avoid_zero_var(x, y):
    diff = y - x
    idx = diff == 0
    x[idx] = x[idx] + 0.0000001
    return x
```
dpaetzel commented 2 years ago

I just investigated this. The problem seems to occur when some row of diff has zero variance and diff.shape[1] is odd. In that case, the code meant to avert numerical problems fails because sample[:nscores_2] has length k whereas sample[nscores_2:] has length k + 1.

The problematic code in multiple.py currently looks like this:

```python
nscores_2 = nscores // 2
# (…)
sample[:nscores_2] = _random_state.uniform(-rope, rope, nscores_2)
sample[nscores_2:] = -sample[:nscores_2]
```
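The mismatch is easy to reproduce in isolation; the following standalone sketch (not baycomp code) mimics the two assignments with an odd number of scores:

```python
import numpy as np

nscores = 5                       # odd number of scores per row
nscores_2 = nscores // 2          # k = 2
rope = 0.01
sample = np.zeros(nscores)

sample[:nscores_2] = np.random.uniform(-rope, rope, nscores_2)  # length 2
try:
    # tries to write 2 values into 3 slots
    sample[nscores_2:] = -sample[:nscores_2]
except ValueError as e:
    print(e)  # e.g. could not broadcast input array from shape (2,) into shape (3,)
```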

I'm not entirely sure how to handle the situation when nscores is odd, as I can't right now find the statistical argument that motivated the currently used heuristic of “fixing” the variance by sampling from a uniform distribution. Maybe one of the original authors has an idea?

janezd commented 2 years ago

Thanks for finding this. @benavoli, @gcorani, any suggestions? One option is to just set the middle element to 0, but I have no idea whether this is OK (that is, whether it is more statistically broken than the existing solution).

dpaetzel commented 2 years ago

Also, shouldn't the mean of the sample be kept? Right now it is discarded, which may lead to significant distortions. Consider, e.g., a learning task where performance has zero variance but one of the algorithms under consideration performs well whereas the other does not; in that case, substituting the sample with a uniform draw from the rope seems rather inappropriate.
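To make the concern concrete, here is a standalone sketch (not baycomp code) of what the current substitution does to a constant row with a large mean difference:

```python
import numpy as np

rope = 0.01
row = np.full(4, 0.3)   # one algorithm wins by 0.3 on every fold: zero variance
n2 = len(row) // 2

# The current heuristic: uniform draws from inside the rope, mirrored around 0.
row[:n2] = np.random.uniform(-rope, rope, n2)
row[n2:] = -row[:n2]

print(np.mean(row))     # ~0: the 0.3 advantage has been discarded
```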

dpaetzel commented 2 years ago

Idea: draw nscores // 2 samples from a shifted half-normal with a certain (minimal) variance and then mirror them at the mean of the original sample. If nscores is odd, set the central element to the mean.

A rough draft:

```python
import numpy as np
import scipy.stats as sst

# _random_state is baycomp's module-level RandomState (see multiple.py).

def fix(diff, scale=1e-5):
    n_instances, n_scores = diff.shape
    n_scores_2 = n_scores // 2
    for sample in diff:
        if np.var(sample) == 0:
            mean = np.mean(sample)
            # Draw the first half from a half-normal shifted to the mean ...
            sample[:n_scores_2] = sst.halfnorm.rvs(
                loc=mean,
                scale=scale,
                size=n_scores_2,
                random_state=_random_state)
            # ... and mirror it at the mean; for odd n_scores, the central
            # element is set to the mean itself.
            if n_scores % 2 == 0:
                sample[n_scores_2:] = mean - (sample[:n_scores_2] - mean)
            else:
                sample[n_scores_2] = mean
                sample[n_scores_2 + 1:] = mean - (sample[:n_scores_2] - mean)
    return diff
```

The question is then how to choose the scale parameter sensibly. Any suggestions? Is this overall idea viable/better than the fix implemented right now? If you like this (and make some suggestion as to how to choose the scale parameter), I may create a PR.
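FWIW, here is a quick standalone sanity check of that draft (a sketch: a local, seeded RandomState stands in for baycomp's module-level _random_state). It confirms that the mirrored construction preserves the row mean and yields nonzero variance, for both even and odd row lengths:

```python
import numpy as np
import scipy.stats as sst

rng = np.random.RandomState(0)  # stand-in for baycomp's _random_state

def mirror_fix(sample, scale=1e-5):
    """Replace a zero-variance row by half-normal draws mirrored at its mean."""
    n = len(sample)
    n2 = n // 2
    mean = np.mean(sample)
    out = np.empty(n)
    out[:n2] = sst.halfnorm.rvs(loc=mean, scale=scale, size=n2, random_state=rng)
    if n % 2 == 0:
        out[n2:] = mean - (out[:n2] - mean)
    else:
        out[n2] = mean
        out[n2 + 1:] = mean - (out[:n2] - mean)
    return out

for n in (4, 5):                              # even and odd row lengths
    fixed = mirror_fix(np.full(n, 0.25))
    assert np.isclose(np.mean(fixed), 0.25)   # the row mean is preserved
    assert np.var(fixed) > 0                  # the variance is no longer zero
```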

janezd commented 2 years ago

@gcorani, thanks for answering previous comments. You're welcome to continue. :)

@dpaetzel, this library is an adaptation of (mostly) R and Julia code by @gcorani and @benavoli. I do not feel confident at changing anything (except, of course, inconsequential refactoring). Therefore, I actually wouldn't mind transferring this repository to more capable hands. I can stay on board as "Python consultant". :)

gcorani commented 2 years ago

@dpaetzel @janezd I don't really have enough time to understand the comment about the shifted half-normal in detail or to implement it. I guess the best is that @dpaetzel implements the modification in his own copy of the code. Perhaps sooner or later we could have a chat so that he shows us which kind of work he is doing with the Bayesian tests.

dpaetzel commented 2 years ago

Thank you both for responding!

I do not feel confident at changing anything (except, of course, inconsequential refactoring).

@janezd, I understand that all too well. :slightly_smiling_face: If this issue arises again some time and you need me to elaborate on the idea I described above, feel free to get in touch.

I guess the best is that @dpaetzel implements the modification in his own copy of the code.

@gcorani, OK, I guess, I'll do that.

which kind of work he is doing with the Bayesian tests.

To be honest, I just wanted to properly evaluate some ML experiments I performed (for which I did not use CV; hence the other issues I opened earlier). I then noticed that this issue had been open for quite some time and thought it might be worth a quick investigation. :slightly_smiling_face: