Open Bachibouzouk opened 11 months ago
I did look at the distribution for various appliance numbers with the following code
from ramp.core.core import Appliance
import matplotlib.pyplot as plt
for N in [5,10,20,30]:
appliance = Appliance(user=None, number=N, fixed='no')
# Generate a sample of 'coincidence' values
sample_size = 10000
coincidence_sample = []
for _ in range(sample_size):
coincidence = appliance.calc_coincident_switch_on(inside_peak_window=True)
coincidence_sample.append(coincidence)
plt.hist(coincidence_sample, bins=[i+1 for i in range(N)], label=f"{N}")
plt.xlabel("Coincidence number")
plt.ylabel("Number of occurences")
plt.title(f"Sample size={sample_size}")
plt.legend(title="Appliance number")
plt.show()
And here is the result graph
It seems to me that when the value of appliance number is too small the distribution is not normal at all and when it is larger, than there are two larger tails for coincidence values of 1 and appliance.number
(in red a gaussian with the same parameters as provided in method calc_coincident_switch_on
)
In this test the method calc_coincident_switch_on
has been provided arguments such that the coincidence should be calculated the following way
coincidence = min(self.number, max(1, math.ceil(random.gauss(mu=(self.number * mu_peak + 0.5), sigma=(s_peak * self.number * mu_peak)))))
The call of min and max explains the tails at 1 and appliance.number
in the distribution. What I could not get my head around was the fact the distribution was not gaussian for small values of appliance.number
. I removed the math.ceil
in the expression above and computed again, I now get gaussian distibutions also for small values of appliance.number
Was this behavior intended @FLomb ? If not, thank to @ClaudiaLSS test we could spot it :) !
TestRandSwitchOnWindow::test_coincidence_normality_on_peak
seems to also fail randomly
https://doi.org/10.1145/1146238.1146263
Might be a good reference for stochastic testing, their repo is located https://github.com/UDST/urbansim
@FLomb - a comment on https://github.com/RAMP-project/RAMP/issues/99#issuecomment-1802782163 would help me to address this failing test :)
Hi,
This issue is related to something we dealt with quite some time ago (in PR #7). Unfortunately, most of the discussion with the user who proposed that PR happened via email. So I took the time to look back at those very lengthy email discussions to get my head around this issue. Here are my thoughts.
As you correctly noted, the calls of min and max explain the tails at 1 and appliance.number. Those are not ideal, and it would be nice to find a solution for them, but they are not unexpected.
The fact that the distribution is skewed to the right, especially when simulating few appliances, is not due to the math.ceil that needs to remain in place to ensure we only get integer numbers. Instead, it is due to the hard-coded +0.5 bit. This was initially introduced as part of the above-referenced past PR as a way to skew on-peak distributions to the right on purpose. Nonetheless, looking back at this with some perspective, and having made additional tests today, I think that was a mistake. In fact, the mu_peak parameter is what is supposed to allow users to skew the distribution more towards the right or the left (by making it larger or smaller than the default 0.5). The parameter works perfectly for this aim, and the +0.5 is unnecessary. If we agree that on-peak switch-on events shouldn't be perfectly Gaussian by default, but slightly more in favour of a higher number of appliances switched-on coincidentally, we should just modify the default value for this parameter, not add a +0.5 to the code.
Regarding the test, all of the above means the following:
In addition, regardless of what we do with this test, I propose we scrap the +0.5 parameter.
HI @Bachibouzouk, thanks for the update! As a side note, was it on purpose that you edited my comment with your updates instead of replying to it? I find it less straightforward to follow the discussion this way.
Anyhow, I tried replicating your original code, but I couldn't, due to an AttributeError
in the appliance
. So I've kept working with the code I had made for my own testing, which bypasses the appliance
and just tests the coincidence
functions you have pasted above. What I get is different from what you do; in my case, removing the +0.5
factor leds to perfectly Gaussian results (see below)
Adding the -0.5
that you suggest, instead, visibly skews the distribution towards the left (see below).
As a side note, was it on purpose that you edited my comment with your updates instead of replying to it?
No this was a manipulation mistake, sorry for that! Here is my message (now not in the right order of the timeline)
@FLomb - I think actually the 0.5 factor is needed but not at the same place
The following picture is the distribution where I removed the +0.5 factor
One can see that the distribution is shifted to the right compared to gaussian distribution. If I add a correction shift of -0.5 within the ceil
function, then we have a better match between the distribution and the gaussian distribution
So I suggest we change the current code:
coincidence = min(
self.number,
max(
1,
math.ceil(
random.gauss(
mu=(self.number * mu_peak + 0.5),
sigma=(s_peak * self.number * mu_peak),
)
),
),
)
to
coincidence = min(
self.number,
max(
1,
math.ceil(
random.gauss(
mu=(self.number * mu_peak),
sigma=(s_peak * self.number * mu_peak),
) - 0.5
),
),
)
I tried replicating your original code, but I couldn't, due to an
AttributeError
in theappliance
This is peculiar, which branch were you on? Maybe we can have a bilateral call on this?
This test passes and fails randomly, which can be confusing