Closed heyitsvic closed 5 years ago
Hmm. Slow initialisation is excluded from that measurement, so I think the problem really is that drawing an example is very slow.
Looking at the profile (very very helpful, thanks!), it's spending a lot of time in proto_strategies
in my_strategies.py
. Can you share this file and a test that raises this error?
@Zac-HD The initialization is done lazily, so it shows up below my code in the stack, but it's in fact initialization of hypothesis. Note that the slow code is contained in hypothesis.internal.charmap.charmap
, which does lazy, memoized initialization from (what appears to be) a gzipped pickle of unicodedata.
I'm unable to reproduce the slowdown except on some corporate hardware, but if you put a sleep 10 in that function (above) you'll see the same behavior.
I still can't reproduce this problem, but I suspect that forcing wrapped strategies to be initialised before we start the timer might help.
diff --git a/src/hypothesis/internal/conjecture/data.py b/src/hypothesis/internal/conjecture/data.py
index 361a73d4..1b6a8827 100644
--- a/src/hypothesis/internal/conjecture/data.py
+++ b/src/hypothesis/internal/conjecture/data.py
@@ -169,6 +169,10 @@ class ConjectureData(object):
at_top_level = self.depth == 0
if label is None:
label = strategy.label
+ else:
+ # initialise wrapped strategy *before* timing draws. See:
+ # https://github.com/HypothesisWorks/hypothesis-python/issues/1153
+ strategy.label
self.start_example(label=label)
try:
if not at_top_level:
Can you try this and let me know if it helps?
Also: while it's not a solution, you can work around this problem by calling hypothesis.internal.charmap.charmap()
in your conftest.py
or other setup code, or st.text().example()
if you don't want to touch internals. I'm not claiming that's a solution, but it would let you keep working while we work this out.
Unfortunately, initializing wrapped strategies before starting the timer did not help.
I ended up implementing a strategy for generating text fields in the meantime, using primitive hypothesis strategies as a temporary work around.
Is strategy.label
meant to cause charmap to be called? How would that be?
@bukzor - accessing the attribute forces reification of a LazyStrategy
, which in turn means that __init__
of the underlying strategy (containing calls to charmap()
) actually gets called. It's a complicated scheme, but allows errors in use of Hypothesis to show up as localised test failures rather than as an error collecting the module.
@heyitsvic - I really want to get to the bottom of this, but I just can't reproduce it. Can you share an example test which fails if you add a slowdown to charmap()
? Here's a simple one that should have the same problem, but doesn't:
@given(characters(blacklist_categories=['Cs', 'Cc']))
def test_charmap_slowdown(t):
pass
I can make this take as long as I like by inserting time.sleep
calls in charmap
, but I can't get any of the time-related health checks to fail :confused:)
As-is, all I know is that it's happening with a test you won't show me, using custom strategies you also won't show me, and I can't reproduce the problem. That suggests the problem is in your strategy, not Hypothesis internals (ref: #434), so unless we can make some progress I'm inclined to close the issue.
(note that if NDAs etc are required, @DRMacIver and I both offer consulting services for corporate users)
The st.text().example()
you suggested is a fine work around. I'm working on getting a repro back to you.
Sorry for the delay!
I managed to reproduce the error by adding a time.sleep(2)
in charmap's initialization. In the error message, we can see that the single generated example has a total draw time of 2.1x, suggesting that charmap startup time is being bundled in with the strategy's draw time.
The interesting thing is that the 'too slow' health check seems to only be active after a certain number of draws.
import hypothesis
from hypothesis import strategies as st
draw_count = 0
@st.composite
def MyStrategy(draw):
"""The 'extremely slow' health check is only active on the 10th through 18th draws."""
global draw_count
print('DRAW!')
draw_count += 1
if draw_count == 12:
result = draw(st.text())
else:
result = draw(st.integers())
print('drew %r' % result)
return result
@hypothesis.given(MyStrategy())
def slowdown(text):
print('TEST!')
del text
diff --git a"hypothesis/internal/charmap.py
--- a"hypothesis/internal/charmap.py
+++ b"hypothesis/internal/charmap.py
@@ -51,6 +51,9 @@ def charmap():
# filesystems is fairly simple: check if loaded, else try loading,
# else calculate and try writing the cache.
if _charmap is None:
+ print('SLEEPING...')
+ import time
+ time.sleep(2)
f = charmap_file()
try:
with gzip.GzipFile(f, 'rb') as i:
diff --git a"hypothesis/internal/conjecture/engine.py
--- a"hypothesis/internal/conjecture/engine.py
+++ b"hypothesis/internal/conjecture/engine.py
@@ -341,6 +341,7 @@ class ConjectureRunner(object):
), HealthCheck.filter_too_much)
draw_time = sum(state.draw_times)
+ print("DRAW_TIME: ", draw_time)
if draw_time > 1.0:
fail_health_check(self.settings, (
Traceback (most recent call last):
File "slowdown.py", line 41, in main
slowdown()
File "slowdown.py", line 33, in slowdown
def slowdown(text):
File "hypothesis/core.py", line 863, in wrapped_test
state.run()
File "hypothesis/core.py", line 655, in run
runner.run()
File "hypothesis/internal/conjecture/engine.py", line 480, in run
self._run()
File "hypothesis/internal/conjecture/engine.py", line 876, in _run
self.generate_new_examples()
File "hypothesis/internal/conjecture/engine.py", line 789, in generate_new_examples
self.test_function(self.last_data)
File "hypothesis/internal/conjecture/engine.py", line 288, in test_function
self.record_for_health_check(data)
File "hypothesis/internal/conjecture/engine.py", line 355, in record_for_health_check
state.overrun_examples), HealthCheck.too_slow,)
File "hypothesis/internal/healthcheck.py", line 38, in fail_health_check
raise FailedHealthCheck(message, label)
FailedHealthCheck: Data generation is extremely slow: Only produced 1 valid examples in 2.10 seconds (0 invalid ones and 0 exceeded maximum size). Try decreasing size of the data you're generating (with e.g.max_size or max_leaves parameters).
See https://hypothesis.readthedocs.io/en/latest/healthchecks.html for more information about this. If you want to disable just this health check, add HealthCheck.too_slow to the suppress_health_check settings for this test.
@DRMacIver - having investigated again, I think the problem here is somewhere in the interaction between cachable strategies and the recording of draw_times. I'm stuck, but maybe you'll have better luck?
@heyitsvic - I still can't reproduce your exact issue, but there's clearly something weird going on here. If you run with the --hypothesis-show-statistics
option, what comes out?
I think the problem here is somewhere in the interaction between cachable strategies and the recording of draw_times. I'm stuck, but maybe you'll have better luck?
I'm not sure there's anything mysterious going on. I think this is everything working as intended, it's just that the way it's currently intended to work is bad. 😞
I think it's become easier to hit this recently since my changes that integrated health checks into the core engine, but I'm pretty sure this has always been a problem that we were just able to mostly ignore before.
I think we might want/need to put in a hack (a hack? Us? No! We don't do that do we?) somewhere that globally forces the charmap to be loaded before the first @given
is run.
I still can't reproduce your exact issue
Is it possible you're trying to reproduce the issue inside the Hypothesis test suite where we both mess with time and also pre-warm the cache?
That being said, I also can't reproduce the exact issue. The following variant reliably fails for me if I put a time.sleep(2)
in charmap as @heyitsvic suggested and delete the .hypothesis directory before running it:
import hypothesis
from hypothesis import strategies as st
@st.composite
def MyStrategy(draw):
if draw(st.booleans()):
draw(st.text())
@hypothesis.given(MyStrategy())
def slowdown(text):
pass
if __name__ == '__main__':
slowdown()
The reason for the draw(st.booleans())
is that we initially draw the minimal example outside of the health checked region. If that minimal example were to draw from text
this would cause us to avoid the problem, but in this case it doesn't because we only draw when the initial boolean is non-zero.
I was definitely running outside our test suite, in a new project directory. (the same problem occurred to me :smile:)
I also stumbled across the minimal example not being health-checked. Is there a principled reason for this? Because it seems to me that if health checks would fail for the minimal example, we ought to fail right then - in fact I thought that this was part of the point (for eg large buffers)!
Is there a principled reason for this? Because it seems to me that if health checks would fail for the minimal example, we ought to fail right then - in fact I thought that this was part of the point (for eg large buffers)!
No principled reason that I can recall. Moving the minimal example inside the scope of other health checks seems totally reasonable to me.
@Zac-HD How come deletion? I was trying to gather opinion on maintainers' preferred solution, so that I can submit a patch that has a chance of being merged.
Because we already pause the draw_times timer for this - the issue is a fairly subtle problem with health checks in the engine that I don't personally understand well and strongly recommend leaving to David.
If you're looking for a PR, can I suggest the bug
label? :smile: In particular I gave up on 966, 961, and 944, but you've shown some great analysis before and might have a better chance.
I don't understand yet what causes this slowness, but shouldn't slowness in the initialization of hypothesis itself be excluded from timing of my strategy? When this happens, I get this failure:
See the profile here:
https://gist.github.com/heyitsvic/8baf5410b23b16f4679766c8b1657470