alexander-held opened this issue 3 years ago
Here is an example illustrating how the `object` typing can help:

```python
from typing import Any

import numpy as np
import pyhf


def expected_data(model: pyhf.pdf.Model) -> object:
    return model.expected_data(model.config.suggested_init())


def to_numpy(data: Any) -> np.ndarray:
    return pyhf.tensorlib.to_numpy(data)


model = pyhf.simplemodels.uncorrelated_background(
    signal=[12.0, 11.0], bkg=[50.0, 52.0], bkg_uncertainty=[3.0, 7.0]
)

pyhf.set_backend("jax")
expected_data(model).sort()  # code runs, but flagged in mypy

pyhf.set_backend("tensorflow")
expected_data(model).sort()  # code fails, and flagged in mypy

to_numpy(expected_data(model)).sort()  # no issues
```
This wraps `expected_data` to illustrate its behavior with an `object` return type. `mypy` catches the `.sort` call in the two cases where no explicit `numpy` conversion took place. The first call happens to work, but this is not guaranteed for all backends. The second one fails with TF, and `mypy` caught this issue without having to run the code. Everything is fine after the explicit `numpy` conversion.

Assuming that the user forgot to do the conversion, the `object` type would have alerted them to its necessity via `mypy`, and automated tests for all backends would not have been needed.
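For reference, the diagnostic `mypy` emits for the two flagged lines would look roughly like the following (file name and line numbers are illustrative):

```
example.py:21: error: "object" has no attribute "sort"  [attr-defined]
example.py:24: error: "object" has no attribute "sort"  [attr-defined]
```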
Some more information from discussions with @kratsg and @phinate: imagine a user writing a library that uses `pyhf`, and the user wants their library to be typed. Unless they need to keep using tensors in their library (e.g. for autodiff reasons), typing in that library becomes significantly easier if all 0-d tensors are immediately converted to floats, and all other tensors to `numpy` arrays or similar. They may want to implement something like the following:
```python
from typing import List

import pyhf


def significance(model: pyhf.pdf.Model, data: List[float]) -> float:
    return float(pyhf.infer.hypotest(0.0, data, model, test_stat="q0"))
```
At the moment, this function is perfectly fine. When running `mypy`, there is no information about what `hypotest` returns, so the float-wrapping is no issue.
Let's imagine now that the `pyhf` return was typed. `hypotest` returns a tensor, so it is not obvious that the `float()` call works (unless the shape information is also provided). It would therefore be useful to have a `pyhf.tensorlib.to_float` function that converts its argument if it can, and raises an exception otherwise.
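A minimal sketch of such a `to_float` helper, written standalone here since this is proposed behavior rather than an existing `pyhf` API:

```python
import numpy as np


def to_float(value: object) -> float:
    """Convert a 0-d (or size-1) tensor to a float, or raise."""
    array = np.asarray(value)  # works for numpy, jax, and eager TF tensors
    if array.size != 1:
        raise ValueError(f"expected a scalar, got shape {array.shape}")
    return float(array.item())
```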
Why is it useful to have the return from `pyhf` typed, and to provide wrapping utilities? Imagine the user implements the following function:
```python
from typing import List

import pyhf


def get_data(model: pyhf.pdf.Model) -> List[float]:
    return model.expected_data(model.config.suggested_init()).tolist()
```
Without type annotations for `expected_data` and when using the `numpy` backend, everything is fine. `mypy` does not know that the `expected_data` return value does not have the `tolist` method for all backends, and with the `numpy` backend everything works out.
A tensor type annotation (or even `collections.abc.Iterable`) for `expected_data` would flag the issue: `tolist` will not always work. Without having to deal with runtime testing of different backends at all, `mypy` will flag the problem, and the user can solve it by wrapping the return value in `pyhf.tensorlib.to_numpy`, which should take a tensor and return a `numpy` array (or raise an exception if it cannot).
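For concreteness, the wrapped version of `get_data` would look like this (using the `pyhf.tensorlib.to_numpy` conversion as in the example above):

```python
from typing import List

import pyhf


def get_data(model: pyhf.pdf.Model) -> List[float]:
    # the explicit conversion guarantees a numpy array for every backend,
    # so .tolist() (and the List[float] annotation) is always valid
    prediction = pyhf.tensorlib.to_numpy(
        model.expected_data(model.config.suggested_init())
    )
    return prediction.tolist()
```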
As a user, this pattern is very powerful. If my code is fully typed and I check types at runtime as well (e.g. with `typeguard`), I do not have to think about the different `pyhf` backends. If I make a call that returns a tensor and forget to wrap it in `to_numpy` / `to_float`, then I would either

- get an error from `mypy`, since the annotated tensor return type does not match the types declared in my code,¹ or
- get a runtime type check failure, since the returned object is not a `numpy` array.

This error will inform me that I forgot to wrap the tensor to get a guarantee about its type, and it crucially works without testing all backends. If my code is fully typed, runtime type checks succeed, and my code passes all tests for a single backend, then I am guaranteed that it will work for the remaining backends as well. Moving the consistency checks for backends fully to `pyhf`, instead of having to repeat them in other libraries, can save a lot of overhead.
¹ This happens because the return type will presumably be no more precise than a union of the four tensor types. If dynamic backend switching resulted in the type being narrowed down further (to e.g. a `numpy` array when using that backend), this pattern would no longer work.
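A minimal sketch of this runtime-checking pattern, using `typeguard`'s `@typechecked` decorator (the function name is illustrative):

```python
import numpy as np
import pyhf
from typeguard import typechecked


@typechecked
def get_prediction(model: pyhf.pdf.Model) -> np.ndarray:
    # forgetting the to_numpy wrapping here would make this call fail at
    # runtime for non-numpy backends, instead of silently propagating a tensor
    return pyhf.tensorlib.to_numpy(
        model.expected_data(model.config.suggested_init())
    )
```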
I do think some typing would be useful, but I would like the typing to be "semantic" and backend-independent, i.e.

```python
def logpdf(self: pyhf.Model, pars: pyhf.typing.Tensor, data: pyhf.typing.Tensor) -> pyhf.typing.Scalar:
    return pyhf.tensorlib.to_numpy(data)
```

Would this be sufficient?
I think this may depend on the implementation of `pyhf.typing.Scalar`, but the typing does not need to be particularly precise. Making it anything other than `Any` (which will never cause issues downstream in `mypy`) should provide most/all of the benefits mentioned above, I think.
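To illustrate one possible shape of such "semantic" aliases (purely an assumption about what `pyhf.typing` could contain, not its actual contents):

```python
from typing import Union

import numpy as np

# backend-independent aliases; the non-numpy entries are forward-reference
# strings so this sketch does not require the optional backends installed
Tensor = Union[np.ndarray, "jax.Array", "tensorflow.Tensor", "torch.Tensor"]
Scalar = Union[float, np.floating]
```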
Note for me in the future if I'm working on this: there is a video by Anthony Sottile on gradually applying levels of typing with `mypy` (no captions): "gradual typing python (and my approach) (beginner - intermediate)", anthony explains #308.
Description
I would like to propose the addition of typed return values for the parts of the `pyhf` API that return objects with backend-dependent types. See the following example as a motivation:
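A minimal version of that example, reconstructed here from the references to `f` and `prediction` below (the exact original snippet may have differed):

```python
import pyhf

model = pyhf.simplemodels.uncorrelated_background(
    signal=[12.0, 11.0], bkg=[50.0, 52.0], bkg_uncertainty=[3.0, 7.0]
)


def f():
    prediction = model.expected_data(model.config.suggested_init())
    return prediction


f().sort()  # fine with the default numpy backend
```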
`mypy` does not know anything about the return of `expected_data` and assumes the type is `Any`. The call of `.sort()` therefore is no problem for `mypy`. `mypy` passes for the above, and there are no issues at runtime either. For a user who never switches the `pyhf` backend, everything is fine.

Imagine now a user who calls `pyhf.set_backend("jax")` before `f().sort()`. `mypy` still has no issue with that, and the code runs fine too. Type checking at runtime could catch a problem here: the return type of `f` is no longer a `numpy` array.

A bigger issue appears with `pyhf.set_backend("tensorflow")`: `mypy` passes, but at runtime this fails. The reason is that the TF tensor does not implement `.sort`.

One solution is wrapping `prediction` via `pyhf.tensorlib.to_numpy` before returning it. That guarantees the return type of `f` to be a `numpy` array. While this is a solution, the user would have to remember it. If they do not, it would be desirable to automatically catch the potential issue.

How could the issue with TF possibly have been caught? I see two options:
1. testing the code with all `pyhf` backends,
2. typing the return of `expected_data`.

Option 1: testing all backends

The first option is already possible now. The major downside in my opinion is that it can add a lot of computational overhead for potentially computationally expensive integration tests. These tests for all backends fit better into the `pyhf` test suite in my opinion, where they are already performed.
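To make option 1 concrete, downstream multi-backend tests typically take this shape (a sketch using `pytest`, assuming `f` was already fixed to wrap its return in `pyhf.tensorlib.to_numpy`):

```python
import numpy as np
import pytest

import pyhf


@pytest.mark.parametrize("backend", ["numpy", "jax", "tensorflow", "pytorch"])
def test_f_returns_numpy_array(backend):
    # every backend must be exercised at runtime to catch problems like the
    # .sort failure above; this is the overhead that typing would avoid
    pyhf.set_backend(backend)
    assert isinstance(f(), np.ndarray)
```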
Option 2: typing

The issue with the code above could have been caught by static analysis if the return type of `expected_data` was not `Any`. For the most general case, assume that the function would be annotated with an `object` return type (see: `Any` vs `object`). `mypy` would then catch the `sort` call statically, and the user could know that they forgot to wrap the return value of `f` with `pyhf.tensorlib.to_numpy`. This is computationally much cheaper and does not add overhead to tests in libraries that depend on `pyhf`.
`object` of course is too general, and the tensor return type definition could be narrowed down further in the future. The most general `object` already would help catch the above. A possible downside of this is that people who strictly use the `numpy` backend would still get a `mypy` error even though the return type of `expected_data` in that case actually is a `numpy` array. This can be avoided via `# type: ignore` comments, but it seems more tricky to communicate that the return type can dynamically change depending on the backend. This however does not seem crucial to me, but rather a nice-to-have feature.
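For someone strictly on the `numpy` backend, such a suppression could look like this (the specific error code is an assumption about what `mypy` would report for an `object` return type):

```python
prediction = model.expected_data(model.config.suggested_init())
prediction.sort()  # type: ignore[attr-defined]  # safe: numpy backend only
```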
Is your feature request related to a problem? Please describe.

See above: static analysis could catch issues without having to run computationally expensive multi-backend tests.
Describe the solution you'd like
Basic typing of the API with backend-dependent return types.
Describe alternatives you've considered
Running tests for all backends of `pyhf`.

Relevant Issues and Pull Requests
#1284 for general typing, #948 for the accompanying PR
this came up via https://github.com/alexander-held/cabinetry/pull/228