pschanely / hypothesis-crosshair

Level-up your Hypothesis tests with CrossHair
MIT License
8 stars 0 forks source link

Integrating with observability features so maintainers and users can tell what's happening #22

Closed Zac-HD closed 1 month ago

Zac-HD commented 3 months ago

Hypothesis' observability features were originally designed for the benefit of users (https://github.com/HypothesisWorks/hypothesis/pull/3797), but have also turned out to be useful for developers - e.g. this discussion led to bugfixes and performance improvements.

Recent work on https://github.com/HypothesisWorks/hypothesis/pull/4034 has me thinking that observability information from Crosshair would actually be pretty valuable for us as maintainers, to answer questions like:


Practically speaking, what do we actually need to do here?

  1. decide what to measure (schema here)

    • status_reason, if something on the crosshair side was responsible for a Status.INVALID result, as if for assume(False). I considered whether we need custom statuses for e.g. #21 and concluded that the status_reason metadata was a better fit.
    • not features; they're intended to be about the runtime behavior of the code under test.
      • while I'm thinking about it, event() is currently disabled under Crosshair to reduce premature realization, but we could support it nicely by e.g. deferring it to the end of the test case where we realize anyway.
    • timing observations are meant to be disjoint, so that the sum is total runtime. We therefore probably want to put internal timings in the metadata instead.
    • metadata: the catch-all put-anything-here section.
  2. connect it up

    • metadata is easy, just add an optional method to the PrimitiveProvider protocol and call it here
    • we could handle status_reason similarly, with an extra clause here, but I'm not sure what the provider would actually want to say.
  3. use the new information to understand what's going on, improve hypothesis-crosshair, iterate on observability, etc.


Would you find this useful? If so, setting up metadata passthrough would be pretty easy 🙂

pschanely commented 3 months ago

So, the plugin will dump CrossHair's internal debugging if I set DEBUG_CROSSHAIR=1 in the environment, and that already has most (but not all) of what we'd get from this integration. Coverage data, in particular, would be a really nice addition.

Even if I was a user interested in crosshair+hypothesis, I'm not sure it's practical without this kind of observability. Otherwise, I really have no clue whether I should be running any given test under CrossHair or not.

So, I'm very much +1. Some early thoughts:

Zac-HD commented 3 months ago

Sweet - let's do the MVP thing for now: you provide a new method (see below), and I'll hook it up to our observability output 🙂 If collecting the data is at all expensive, you can make it conditional on if TESTCASE_CALLBACKS:, that being a list defined in hypothesis.internal.observability.

Zac-HD commented 3 months ago

https://github.com/HypothesisWorks/hypothesis/pull/4083 is merged! New methods for you to (optionally) implement:

class _BackendInfoMsg(TypedDict):
    type: Literal["info", "alert", "error"]
    title: str
    content: Union[str, Dict[str, Any]]

class PrimitiveProvider:
    ...
    def observe_test_case(self) -> Dict[str, Any]:
        """Called at the end of the test case when observability mode is active.
        The return value should be a non-symbolic json-encodable dictionary,
        and will be included as `observation["metadata"]["backend"]`.
        """
        return {}

    def observe_information_messages(self, *, lifetime) -> Iterable[_BackendInfoMsg]:
        """Called at the end of each test case and again at end of the test function.
        Return an iterable of `{type: info/alert/error, title: str, content: str|dict}`
        dictionaries to be delivered as individual information messages.
        (Hypothesis adds the `run_start` timestamp and `property` name for you.)
        """
        assert lifetime in ("test_case", "test_function")
        yield from []
pschanely commented 3 months ago

This looks great. My next round of releases will include some useful output here.

Zac-HD commented 1 month ago

I think this is done! If we want to track possible additional information, let's make that a separate issue?

pschanely commented 1 month ago

Agree!