Test failing randomly - Githubissues

Bachibouzouk commented 11 months ago

This test passes and fails randomly, which can be confusing

__________ TestRandSwitchOnWindow.test_coincidence_normality_on_peak ___________

self = <tests.test_switch_on.TestRandSwitchOnWindow object at 0x7f739cb05ff0>

    def test_coincidence_normality_on_peak(self):
        # Create an instance of the Appliance class with the desired parameters
        appliance = Appliance(user=None, number=10, fixed='no')

        # Generate a sample of 'coincidence' values
        sample_size = 30
        coincidence_sample = []
        for _ in range(sample_size):
            coincidence = appliance.calc_coincident_switch_on(inside_peak_window=True)
            coincidence_sample.append(coincidence)

        # Perform the Shapiro-Wilk test for normality
        _, p_value = stats.shapiro(coincidence_sample)

        # Assert that the p-value is greater than a chosen significance level
>       assert p_value > 0.05, "The 'coincidence' values are not normally distributed."
E       AssertionError: The 'coincidence' values are not normally distributed.
E       assert 0.021894052624702454 > 0.05

tests/test_switch_on.py:40: AssertionError

Bachibouzouk commented 11 months ago

I did look at the distribution for various appliance numbers with the following code

from ramp.core.core import Appliance
import matplotlib.pyplot as plt

for N in [5,10,20,30]:

    appliance = Appliance(user=None, number=N, fixed='no')

    # Generate a sample of 'coincidence' values
    sample_size = 10000
    coincidence_sample = []
    for _ in range(sample_size):
        coincidence = appliance.calc_coincident_switch_on(inside_peak_window=True)
        coincidence_sample.append(coincidence)

    plt.hist(coincidence_sample, bins=[i+1 for i in range(N)], label=f"{N}")

plt.xlabel("Coincidence number")
plt.ylabel("Number of occurences")
plt.title(f"Sample size={sample_size}")
plt.legend(title="Appliance number")
plt.show()

And here is the result graph

coincidences_vs_number

It seems to me that when the value of appliance number is too small the distribution is not normal at all and when it is larger, than there are two larger tails for coincidence values of 1 and appliance.number (in red a gaussian with the same parameters as provided in method calc_coincident_switch_on)

coincidence_number10

coincidence_number50

In this test the method calc_coincident_switch_on has been provided arguments such that the coincidence should be calculated the following way

coincidence = min(self.number, max(1, math.ceil(random.gauss(mu=(self.number * mu_peak + 0.5), sigma=(s_peak * self.number * mu_peak)))))

The call of min and max explains the tails at 1 and appliance.number in the distribution. What I could not get my head around was the fact the distribution was not gaussian for small values of appliance.number. I removed the math.ceil in the expression above and computed again, I now get gaussian distibutions also for small values of appliance.number

coincidences_vs_number

Was this behavior intended @FLomb ? If not, thank to @ClaudiaLSS test we could spot it :) !

Bachibouzouk commented 11 months ago

TestRandSwitchOnWindow::test_coincidence_normality_on_peak seems to also fail randomly

Bachibouzouk commented 7 months ago

https://doi.org/10.1145/1146238.1146263

Might be a good reference for stochastic testing, their repo is located https://github.com/UDST/urbansim

Bachibouzouk commented 7 months ago

@FLomb - a comment on https://github.com/RAMP-project/RAMP/issues/99#issuecomment-1802782163 would help me to address this failing test :)

FLomb commented 7 months ago

Hi,

This issue is related to something we dealt with quite some time ago (in PR #7). Unfortunately, most of the discussion with the user who proposed that PR happened via email. So I took the time to look back at those very lengthy email discussions to get my head around this issue. Here are my thoughts.

As you correctly noted, the calls of min and max explain the tails at 1 and appliance.number. Those are not ideal, and it would be nice to find a solution for them, but they are not unexpected.

The fact that the distribution is skewed to the right, especially when simulating few appliances, is not due to the math.ceil that needs to remain in place to ensure we only get integer numbers. Instead, it is due to the hard-coded +0.5 bit. This was initially introduced as part of the above-referenced past PR as a way to skew on-peak distributions to the right on purpose. Nonetheless, looking back at this with some perspective, and having made additional tests today, I think that was a mistake. In fact, the mu_peak parameter is what is supposed to allow users to skew the distribution more towards the right or the left (by making it larger or smaller than the default 0.5). The parameter works perfectly for this aim, and the +0.5 is unnecessary. If we agree that on-peak switch-on events shouldn't be perfectly Gaussian by default, but slightly more in favour of a higher number of appliances switched-on coincidentally, we should just modify the default value for this parameter, not add a +0.5 to the code.

Regarding the test, all of the above means the following:

The test should know that the tails are there and not complain (until we improve on that front)
The test should also be ready to accommodate users who decide they do not want a normal distribution after all, i.e. users that want to use mu_peak to skew the distribution towards the right or the left, for instance, to simulate extreme scenarios for system robustness.

In addition, regardless of what we do with this test, I propose we scrap the +0.5 parameter.

FLomb commented 6 months ago

HI @Bachibouzouk, thanks for the update! As a side note, was it on purpose that you edited my comment with your updates instead of replying to it? I find it less straightforward to follow the discussion this way.

Anyhow, I tried replicating your original code, but I couldn't, due to an AttributeError in the appliance. So I've kept working with the code I had made for my own testing, which bypasses the appliance and just tests the coincidence functions you have pasted above. What I get is different from what you do; in my case, removing the +0.5 factor leds to perfectly Gaussian results (see below)

Adding the -0.5 that you suggest, instead, visibly skews the distribution towards the left (see below).

Bachibouzouk commented 6 months ago

As a side note, was it on purpose that you edited my comment with your updates instead of replying to it?

No this was a manipulation mistake, sorry for that! Here is my message (now not in the right order of the timeline)

@FLomb - I think actually the 0.5 factor is needed but not at the same place

The following picture is the distribution where I removed the +0.5 factor coincidence_number10_correction0

One can see that the distribution is shifted to the right compared to gaussian distribution. If I add a correction shift of -0.5 within the ceil function, then we have a better match between the distribution and the gaussian distribution

coincidence_number10_correction-0 5

So I suggest we change the current code:

            coincidence = min(
                self.number,
                max(
                    1,
                    math.ceil(
                        random.gauss(
                            mu=(self.number * mu_peak + 0.5),
                            sigma=(s_peak * self.number * mu_peak),
                        )
                    ),
                ),
            )

to

            coincidence = min(
                self.number,
                max(
                    1,
                    math.ceil(
                        random.gauss(
                            mu=(self.number * mu_peak),
                            sigma=(s_peak * self.number * mu_peak),
                        ) - 0.5
                    ),
                ),
            )

Bachibouzouk commented 6 months ago

I tried replicating your original code, but I couldn't, due to an AttributeError in the appliance

This is peculiar, which branch were you on? Maybe we can have a bilateral call on this?

RAMP-project / RAMP

Test failing randomly #99