Closed: rishasurana closed this pull request 1 year ago
This should actually never happen, right? We are always cutting off the largest outliers. Since you are passing in the same reporting units, the only other place there is randomness is when we are creating the conformalization set. Can you check whether these sets are the same in both runs?
Just checked - the conformalization set is the same for the winsorized and non_winsorized runs.
I remember getting this error more often (every other run) when I had the winsorize limits at 1%. I just pushed a commit that reverts the previous changes and raises the winsorize limits from 5% to 6%. After this change there is no error regardless of how many times I run the test, since we have cut off all the possible outliers for that particular test. 5.9% works well, but 5.5% eventually resulted in an error. Could this mean that there's an optimal winsorize limit depending on the dataset?
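For intuition, here is a minimal sketch of how the limit interacts with the sample size. This assumes a scipy-style winsorize (scipy.stats.mstats.winsorize) and made-up data; the exact call our results handler uses may differ:

```python
import numpy as np
from scipy.stats.mstats import winsorize

# Illustrative data only: 19 similar values plus one extreme outlier.
rng = np.random.default_rng(0)
turnout = np.append(rng.integers(95, 105, size=19), 10_000).astype(float)

# With limits=(0.06, 0.06) and n=20, int(0.06 * 20) = 1 value is clipped on
# each tail, so the 10_000 outlier is replaced by the next-largest value.
clipped = winsorize(turnout, limits=(0.06, 0.06))
print(clipped.max())

# With limits=(0.01, 0.01), int(0.01 * 20) = 0, so nothing is clipped at all,
# which is consistent with the 1% limit failing more often.
unchanged = winsorize(turnout, limits=(0.01, 0.01))
print(unchanged.max())
```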
The error was that the test was comparing state_data bounds rather than unit_data bounds. Only unit_data is modified by winsorizing in the results handler.
I'm not sure I understand what the final conclusion/decision here is?
lower_0.9_turnout and upper_0.9_turnout are only modified for unit_data, not state_data, by the results handler when we call add_unit_intervals(). The test was comparing these values for state_data, which caused the issue. Changing the test to look at unit_data, which is actually affected by the winsorizing, fixes the error.
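Roughly, the corrected check on unit_data looks like this (a sketch only: the column names come from above, but the DataFrame access pattern and the helper name are assumptions):

```python
import pandas as pd

def check_winsorized_unit_intervals(unit_data: pd.DataFrame, unit_data_winsorized: pd.DataFrame) -> None:
    """Hypothetical helper: winsorizing trims the extreme residuals, so the
    unit-level intervals should be no wider than the non-winsorized ones."""
    assert (unit_data_winsorized["lower_0.9_turnout"] >= unit_data["lower_0.9_turnout"]).all()
    assert (unit_data_winsorized["upper_0.9_turnout"] <= unit_data["upper_0.9_turnout"]).all()
```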
Ok cool! This might be a stupid question, but why would the state data change if the unit data is the same?
The bootstrap function is causing the randomness. We repeat the calculation 10,000 times with random sampling, so we occasionally end up with samples that give smaller/larger results for both state and unit data. Setting the random_state parameter to a prime integer solves this issue without being too invasive. The calculation is still done several times, but when random_state is an int, a new RandomState instance seeded with that value is used to generate the resamples, which keeps them consistent across runs.
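As a sketch of what seeding the bootstrap means here, assuming a scipy.stats.bootstrap-style API (our bootstrap call may differ): with an integer random_state, the resamples, and therefore the interval bounds, are identical across runs.

```python
import numpy as np
from scipy.stats import bootstrap

# Illustrative sample only.
data = (np.random.default_rng(0).normal(loc=100, scale=10, size=500),)

# With random_state set to an int, a fresh RandomState seeded with that value
# draws the resamples, so two runs produce identical confidence intervals.
res_a = bootstrap(data, np.mean, n_resamples=10_000, random_state=17)
res_b = bootstrap(data, np.mean, n_resamples=10_000, random_state=17)

assert res_a.confidence_interval == res_b.confidence_interval
```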
There are two other alternatives:
Also had to update pre-commit here for the F811 error (redefinition of unused name from line n) in a file that wasn't modified by this PR; it would pass locally but was causing an error with GitHub Actions.
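For context, F811 is flake8's "redefinition of unused name" check; a minimal trigger looks like this (function name is made up):

```python
def parse_results():
    ...

def parse_results():  # F811: redefinition of unused 'parse_results' from line 1
    ...
```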
I verified this and have some example data I can share if you want to see it Lenny.
I'm not sure if having a set seed will cause any issues? I would think not but I could be missing something.
This makes sense, thank you for investigating @rishasurana! We want to set a seed in the test only.
@lennybronner Thanks! Requesting a re-review because I had to make some updates to test_client.py after merging the generalize parameter PR.
Description
In test_client.py::test_winsorize_intervals, the check to ensure that the interval bounds are <=/>= when winsorized compared to when not winsorized would fail occasionally (about once every 10 runs), for a difference of about 0.04% of the predicted value (e.g. 2441554.0 vs 2442544.0). This would sometimes break the GitHub Actions tests for other PRs.
Jira Ticket
https://arcpublishing.atlassian.net/browse/ELEX-2857
Test Steps
run test_client.py::test_winsorize_intervals