jklaise opened this issue 3 years ago
> How could we add an explainer to this inference graph?
> Point directly to the model component

This capability to point the explainer to start at a particular node in an inference pipeline is envisioned for SCV2.

> Non-compliant models within an inference graph
> Not all inference graphs contain a model node that would be Alibi-compliant, so in the general case the above would not work and it would be necessary to either:
> * Extend Alibi compliant data types to support a wide variety of use cases / inference graphs

For the various transformation functions of this section and the last one, I would see these probably being functions in Alibi if possible, i.e. callables passed to the init of the explainer.
We could investigate the `explainer` custom resource in SCV2 being an inference graph in itself, to allow images to be used which would run in separate containers, with `pre`, `post` and `explainer` sections: `pre` has the compliant transformer, `post` has the compliant post-transformer, and `explainer` has the core explainer?
@cliveseldon if transformation functions for common use cases are built into `alibi`, then we also have the choice of not having a very general API that accepts custom callables to do custom transformation and inverse transformation (although we may want to support this for genericity). Rather, we could dispatch to pre-defined `alibi` transformations given some information about the data.
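As a rough sketch of what such dispatching could look like (all names here are hypothetical, not an existing alibi API):

```python
import numpy as np

def make_input_transform(data_kind: str, **data_info):
    """Hypothetical dispatcher: pick a canned transformation based on metadata about the data."""
    if data_kind == "image":
        channel_axis = data_info.get("channel_axis")
        if channel_axis is None:
            return lambda x: x[..., np.newaxis]            # add a trailing channel axis
        return lambda x: np.moveaxis(x, channel_axis, -1)  # standardize to channels-last
    if data_kind == "tabular":
        categories = data_info["categories"]               # {column index: [category names]}
        def transform(x: np.ndarray) -> np.ndarray:
            x = x.copy()
            for col, names in categories.items():
                x[:, col] = [names.index(v) for v in x[:, col]]  # string -> integer encoding
            return x.astype(float)
        return transform
    raise ValueError(f"Unknown data kind: {data_kind}")
```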
This brings me to the data aspect. In order to do this, we will need more information about the data from the user. A good example is actually #487, which is not a tabular use case, but the principle is the same. An "image" may have a channel dimension in different places, or it may not have one at all (e.g. a grayscale image with no explicit channel axis). These should all be valid inputs to an image predictor that's supported by `AnchorImage`, but the user has to tell us what the data is (in this case via the `image_shape` and the proposed `channel_axis` kwargs).
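For instance, all of the following can represent the same batch of grayscale images, and only the user can tell us which layout their predictor expects (plain numpy for illustration, not alibi code):

```python
import numpy as np

batch = np.random.rand(10, 28, 28)           # grayscale, no explicit channel axis
channels_last = batch[..., np.newaxis]       # shape (10, 28, 28, 1), channel_axis=-1
channels_first = batch[:, np.newaxis, :, :]  # shape (10, 1, 28, 28), channel_axis=1
```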
If we go with the model of "ask the user about the data and pick an appropriate transformation function", then we will need to carefully design what it is we need to know about the data and in what format (e.g. for the `AnchorImage` example it's already at least 2 kwargs; for `AnchorTabular` it may be more - how do you express the concept of "a numpy array with strings in categorical columns"?).
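To make the tabular case concrete, such an array and the minimal metadata we would need about it might look like this (a sketch; the metadata mirrors the `feature_names` and `categorical_names` arguments that `AnchorTabular` already takes):

```python
import numpy as np

# heterogeneous "tabular" data: a numerical column and a string-encoded categorical column
x = np.array([[49.5, 'Male'], [32.0, 'Female']], dtype=object)

# metadata the user would have to supply to pick/build the right transformation
feature_names = ['Age', 'Sex']
category_map = {1: ['Female', 'Male']}  # column index -> category names
```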
If we go with the other model of "allow custom transformation and inverse transformation functions" (this is similar to the `alibi-detect` `preprocessing_fn`; we would provide implementations of common ones), then it's a lot more general and we don't tie ourselves into a fixed API. The trade-off is having these extra callables "floating around" (but maybe that's not such a big issue given that we already have a similar design in `alibi-detect`).
I feel in general that we would need to start thinking about how a user can inject `predict_fn`, `pre` and `post` transformations into explainers, as opposed to trying to have everything done in alibi. There are a few advantages of this approach in my mind:
Can tempo help with enabling the user to supply these transformation boxes?
On the other hand, if in deployment these transformations are separate pods, there is going to be extra overhead from crossing process boundaries, and for simple transformations this might be significant.
So we probably need to find the right balance, and personally I think we have to provide some support for both cases.
@sakoush possibly a versatile option would be to allow `alibi` explainers to take in custom callables for those transformations. That way it's still the user's responsibility to define and pass these, but the application layer doesn't have to worry about how to inject them as it would be done by `alibi`. You also then don't have the issue of these pieces of code ending up having to communicate over the network, which I agree would likely cause big slowdowns.
With regard to supporting heterogeneous `numpy` arrays, this is actually a bit of an oxymoron since `numpy` arrays are by definition containers of homogeneous data: https://numpy.org/doc/stable/reference/arrays.ndarray.html. This is readily apparent when creating arrays containing both e.g. strings and integers: the entries get coerced to a common (string) dtype unless the array is created with dtype `object`. Another example is that data consisting of strings will default to a dtype that is roughly "the width of the largest string entry in the array".
Given this, it seems that use cases requiring heterogeneous `numpy` arrays may be using the wrong tool for the job and likely shouldn't be seen as good practice, because you can't do much with a heterogeneous array without extra metadata and further transformations into something homogeneous. A `pandas` dataframe would be much better suited for such heterogeneous data.
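A quick demonstration of this coercion behaviour:

```python
import numpy as np

print(np.array([49.5, 'Male']).dtype)                # a fixed-width string dtype such as <U32
print(np.array([49.5, 'Male'], dtype=object).dtype)  # object - entries are arbitrary Python objects
print(np.array(['a', 'abcdef']).dtype)               # <U6 - width of the longest string
```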
A fruitful first exploration would be to take some lessons from libraries like `sklearn-pandas`: https://github.com/scikit-learn-contrib/sklearn-pandas.
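For reference, the pattern there is to declare per-column transformers on a dataframe and get a homogeneous array out; a minimal sketch along the lines of the `sklearn-pandas` README:

```python
import pandas as pd
from sklearn.preprocessing import LabelBinarizer, StandardScaler
from sklearn_pandas import DataFrameMapper

df = pd.DataFrame({'age': [49.5, 32.0, 19.0], 'sex': ['Male', 'Female', 'Male']})

mapper = DataFrameMapper([
    (['age'], StandardScaler()),  # numerical column (passed as a 2-D frame)
    ('sex', LabelBinarizer()),    # categorical column (passed as a 1-D series)
])
Z = mapper.fit_transform(df)      # homogeneous float array, e.g. shape (3, 2)
```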
Will we one day see this issue solved by the Seldon dev team, or what can we do to develop this feature?
I really need the flexibility to use models in an inference graph like "input-transformer --> model --> explainer -->" or "input-transformer --> model --> ... --> explainer".
And the Seldon Deploy UI is actually put together in such a way that it looks like this can be done with Seldon Deployments.
@thorsteen thanks for following up. Just to clarify, I think there are several things going on with your use case. The one related to wiring explainers to inference graphs with transformers is more of a question about the Seldon Deploy side of things. That being said, I understand that part of the issue is on the Alibi side, specifically with the data type you're using for your models. It would be great to get more context on exactly the issues you're facing on the Alibi side so we can resolve them first.
A couple of questions from my side:

* What data type do your models consume: heterogeneous `numpy` arrays (e.g. `np.array([49.5, 'some_string'], dtype=object)`) or something else (e.g. `pandas` dataframes?)
* Which (if any) components of your pipeline are `alibi`-compatible, and also whether you're looking to explain single components (e.g. just the model) or parts of the whole pipeline (e.g. transformer + model). This will inform whether we can make changes to your pipeline to make it `alibi`-compatible as a stop-gap, or inform what extensions we need to make to `alibi` itself to allow a broader range of input data types (this could likely be implemented as custom user-passed callables to map from the user data type to the `alibi`-compatible data type and back).

I see there are different use cases for this, but I'm hoping that solving this issue will solve my issues and make explainers more flexible with regard to data types.
What I, @RafalSkolasinski and @FarrandTom discovered is that one is actually not able to deploy an explainer if it's getting non-compliant data. Locally, I can work with an ndarray with `object` dtype if I ordinal-encode my input data and then use OHE in a preprocessor (so I guess the ndarray dtype is actually float64) and run something like

```python
predict_fn = lambda x: clf.predict(preprocessor.transform(x))
explainer = AnchorTabular(predict_fn, feature_names, categorical_names=category_map, ohe=True)
```

like in your example, but this cannot be deployed to production.
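(For illustration, a self-contained sketch of this kind of local setup; the toy data, the one-hot preprocessor and the classifier are placeholders, not the actual pipeline:)

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder

# toy data: column 0 numerical, column 1 an ordinal-encoded categorical feature (dtype float64)
x_train = np.array([[49.5, 0.], [32.0, 1.], [19.0, 1.], [61.0, 0.]])
y_train = np.array([0, 1, 1, 0])
category_map = {1: ['Male', 'Female']}  # column index -> category names
feature_names = ['Age', 'Sex']

preprocessor = ColumnTransformer(
    [('ohe', OneHotEncoder(handle_unknown='ignore'), list(category_map.keys()))],
    remainder='passthrough',
)
clf = LogisticRegression().fit(preprocessor.fit_transform(x_train), y_train)
# predict_fn and AnchorTabular are then set up exactly as in the snippet above
```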
It would be interesting to have an explainer on the input-transformer side of things, but my issue is that if I do not post only numerical / compliant explainer data to the Seldon Deployment, one cannot use an explainer for the model. As mentioned, I would like to be able to use explainers in deployments where there is an input-transformer. Right now this is not covered for production use cases in Seldon Deploy or other deployments, I would guess.
I have heard that with the Seldon V2 API one might be able to detach the explainer from the model - is this also your understanding?
This might help solve the issue of the `predict_fn`, because then one could do the flow "request --> input transformer --> explainer" instead of the flow which is currently "request --> explainer --> model --> explainer" in Seldon Deploy. But this would still require Alibi explainers to handle some mixed data type / ndarray object, or a preprocessor.
@thorsteen thanks for the context, I think I understand what's going on here but wanted to double check in the following.
Your local example with the `preprocessor` which is working would imply that `x` here is actually the `alibi`-compliant data and `preprocessor` transforms it to `alibi`-non-compliant data, but one that works with `clf.predict`, i.e. I'm assuming something like this:
```python
x = np.array([[49.5, 0, 1, 0]], dtype=float)  # already pre-processed and alibi-compliant
x2 = preprocessor.transform(x)                # x2 = np.array([[49.5, 'some_string']], dtype=object) - "inverse" transform, alibi-non-compliant but model-compliant
output = clf.predict(x2)                      # model output as probabilities or class labels
```
Now what I assume you want to see in production is the original model that takes in the non-compliant `x2` input, which would lead to the following deployment (and failure of the explainer):

```
/predict: request (x2) -> model(x2) -> output          # all good
/explain: request (x2) -> explainer(x2) -> model(x2)   # not good as x2 is alibi-non-compliant
```
The general pattern to make your model `alibi`-compliant is to follow these docs and wrap your predict function as follows:

```python
def predictor(x: np.ndarray) -> np.ndarray:
    x2 = transform_input(x)
    output = model(x2)  # or call the model-specific prediction method
    return output

explainer = SomeExplainer(predictor, **kwargs)
```
Now this would work locally as you have already tested with `transform_input=preprocessor.transform`. But I'm guessing the blocker is that you can't deploy this wrapped model and call it a day, because you would still like to deploy the original model that takes in the non-compliant `x2`-type data?

In that case we need a way to make the `explainer(x2)` call work, and this is where the discussion about supporting other data types comes in. One option we've been considering is to extend the `alibi` explainer interface as follows (pseudocode):
```python
class SomeExplainer:
    def __init__(self,
                 model: Callable[[Any], np.ndarray],
                 ...,
                 input_transform: Optional[Callable],
                 input_transform_inverse: Optional[Callable]) -> None:
        ...

    def explain(self, X: Any) -> Explanation:
        # transform input to alibi-compliant
        X = self.input_transform(X)
        # explanation generation...
        explanation = ...
        # map explanation features in the alibi-compliant space back to the model-compliant space for interpretability
        explanation = self.input_transform_inverse(explanation)
        return explanation
```
This would extend `alibi` explainers to handle models taking any type of input data, as a user can specify the conversion from their data to `alibi`-compliant data via `input_transform` and back via `input_transform_inverse`.
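Usage would then look roughly like this (still pseudocode for the proposed interface; `f` and `f_inv` stand for the user-supplied mappings between their data type and alibi-compliant arrays):

```python
explainer = SomeExplainer(
    model=predict_fn,                # prediction function operating on the user's own data type
    input_transform=f,               # user data -> alibi-compliant np.ndarray
    input_transform_inverse=f_inv,   # map explanation artefacts back to the user's data type
)
explanation = explainer.explain(x)   # x in the user's own (non-compliant) data type
```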
Of course the sticking point might be that this would all work locally, but if you want to separate `input_transform` and `input_transform_inverse` into separate components in the Seldon graph then we would need to re-evaluate how to wire this up properly (I think it could be done similarly to how the `reset_predictor` method is used to wire up the explainer to the production model when deployed).
An alternative would be to explore, on the Seldon side, decoupling the explainer and the black-box model so that the deployed explainer could point to a wrapped model with the `input_transform` already built in (so taking compliant `x` data) whilst the original model would still take in the non-compliant `x2` data. This would be akin to how white-box explainers have a separate copy of the model @RafalSkolasinski @axsaucedo .
Would be great to hear your thoughts or if I've misunderstood anything.
Very nice mockup @jklaise, I think you understood it correctly and I like your solution. But I think it would be simpler to decouple the explainer, wrap the model inside, and maybe route the request / event in a similar way to the alibi-detectors. With alibi-detectors we don't have the same problem because they are passed the request from the input-transformer. I like to have several decoupled services with a separate input-transformer, model, detector and explainer for flexibility in the inference graph. In such a deployment, I guess one could make a universal inference graph which would look like this:

```
                                         --> explainer -->
request --> input-transformer --> model  --> response
                                         --> detector -->
```

Locally, I would then just fit the explainer with the compliant / transformed data, like I do with the model and detectors, and then put this transformation in the input-transformer for the Seldon deployment / production.
@thorsteen thanks, I think fundamentally we're trying to address two things here:

* Allowing `alibi` users to work with custom data types. My comment on adding custom `input_transform` and `input_transform_inverse` transforms straight into the `alibi` interface would help with this and even facilitate deployment in simple use cases.

What you're essentially describing is my 3rd diagram from the original post:
Conceptually, both `/predict` and `/explain` calls would take custom/non-compliant data `X`, but for both endpoints this is routed through the input transformer so we end up with standardized/compliant data `Z`.
One thing that is missing from this picture, however, is that the explanation would be in terms of the `Z` data, so not necessarily interpretable with respect to the original input `X`. Consider a simple case where `X` has an "Age" variable set to `49`. The input transform would likely standardize/normalize it, so you would get something like "Age ~ [-1, 1]" in the `Z`-space. Your explanation will also be in this transformed space and therefore hard to interpret. To make it interpretable we may need to add an "inverse transform" step that can undo the input transform (hence why I proposed one for `alibi` called `input_transform_inverse`). Coming back to deployment, this would need to be another step that is called after the explanation has been computed.
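Concretely, for the "Age" example the forward and inverse steps could look like this with a scikit-learn scaler (an illustration of the idea, not the Seldon wiring):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

ages = np.array([[23.0], [49.0], [61.0]])
scaler = StandardScaler().fit(ages)

z = scaler.transform([[49.0]])    # "Age" in the standardized Z-space, hard to interpret directly
x = scaler.inverse_transform(z)   # back to ~49.0, interpretable in the original X-space
```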
Tagging @cliveseldon @axsaucedo @RafalSkolasinski @SachinVarghese
Current status
Currently tabular data in methods such as `AnchorTabular` and `CounterfactualProto` are expected to be in one of a set of restricted formats, e.g.:

* `np.ndarray` of homogeneous number types where numerical columns are `float` and categorical columns are integer-encoded (to be pedantic, `float` that can be cast to `int` without loss of precision, e.g. `0.0` denoting the first category of a categorical feature).
* `np.ndarray` of homogeneous number types where numerical columns are `float` and categorical features have been expanded into one-hot-encoded columns (i.e. each categorical feature now occupies `n_categories` columns which are populated with `0` and `1` entries).

Problem
If a user model is not trained on a data representation that is one of the above then Alibi tabular explainers cannot be used out-of-the-box which is undesirable (as found out by @FarrandTom).
For concreteness, denote by `X` an input data point that is non-compliant with the Alibi API, e.g. it could be an `np.ndarray` but with unsupported column types, for example `array([49.5, 'Male'], dtype=object)` representing a numerical feature and a string-encoded categorical variable.

Further, denote by `Z` an input data point that is compliant with the Alibi API, e.g. `array([49.5, 0.])` representing the same numerical feature and the same but integer-encoded categorical variable.

A client may have a model `M` that's trained on non-compliant data, i.e. it would be of type `Callable[[X], np.ndarray]`, whereas Alibi expects a model `M_hat` (prediction function) of type `Callable[[Z], np.ndarray]`. How can we go from a non-compliant model to a compliant one?

The key is being able to map back and forth between `X` and `Z`. Let `f: X -> Z` be such an invertible mapping; for the example above it would be something like:
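A minimal sketch of such a pair of mappings, hard-coding the category encoding for this two-column example:

```python
import numpy as np

CATEGORIES = ['Male', 'Female']  # categories of the categorical feature in column 1

def f(x: np.ndarray) -> np.ndarray:
    """Map non-compliant X (object array with a string category) to compliant Z (float array)."""
    return np.array([x[0], CATEGORIES.index(x[1])], dtype=float)

def f_inv(z: np.ndarray) -> np.ndarray:
    """Map compliant Z back to the original representation X."""
    return np.array([z[0], CATEGORIES[int(z[1])]], dtype=object)

f(np.array([49.5, 'Male'], dtype=object))  # array([49.5, 0.])
f_inv(np.array([49.5, 0.]))                # array([49.5, 'Male'], dtype=object)
```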
With this extra information we can define an Alibi-compliant model in terms of the client model `M` and the inverse mapping `f_inv` as follows: `M_hat(Z) = M(f_inv(Z))`, in Python:
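A minimal sketch, assuming `M` and `f_inv` as defined above:

```python
def M_hat(z: np.ndarray) -> np.ndarray:
    """Alibi-compliant prediction function defined in terms of the client model M."""
    return M(f_inv(z))
```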
What we can do

* Show, with specific examples, how to use `f`, `f_inv` transformations to make an Alibi non-compliant model into a compliant one. This is in spirit the same as the discussion on white-box vs black-box models. This has short-term gains, demonstrating that in principle it should always be possible to use Alibi explainers if the user is prepared to do a bit more work. It is unclear, however, if this translates well to the deployment setting.
* Extend the data types accepted by Alibi explainers beyond `np.ndarray` data. It may be useful to extend this to also take in `pd.DataFrame` and/or `pd.Series` objects as necessary. This has longer-term gains as the user would no longer need to do extra work in case their model is trained on a different data representation.

What about deployment?
In deployment we may have the following situation where the inference graph consists of a transformer mapping Alibi-non-compliant data `X` to compliant data `Z`, which is then passed into an Alibi-compliant model:

How could we add an explainer to this inference graph?
Point directly to the model component
If we know the model component is Alibi-compliant, we could point the explainer to that instead of the whole inference graph (which is non-compliant):
However, note that in this scenario the explainer expects the compliant data type `Z` whilst the inference graph operates on the original data type `X`. To obtain `Z` from `X` we would need to leverage the existing transformer, so we could extend the inference graph like this (conceptually, implementation details may vary):

The only job of the `Explainer-Transformer` component is to call an existing transformer that is known (by the user) to transform non-compliant data `X` into compliant data `Z`.

Non-compliant models within an inference graph
Not all inference graphs contain a model node that would be Alibi-compliant, so in the general case the above would not work and it would be necessary to either:

* Extend Alibi compliant data types to support a wide variety of use cases / inference graphs, or
* Have the user do a bit more work, not only defining `f` and `f_inv` but also packaging them as inference graph components:

Here the shaded `Compliant-Transformer` and `Compliant-Inverse-Transformer` components correspond exactly to the functions `f` and `f_inv` defined above but explicitly included in an inference graph (implementation details may be different, e.g. these could live inside the `Explainer` as two Python `Callable`s). Effectively we're pointing the explainer to the whole non-compliant inference graph (equivalently, we could also point to the non-compliant model, but there is no point in doing that), but we do the conversions to and from compliant data using the new transformer components on the fly (in particular, the inverse transformer intercepts prediction requests and puts them in a format that the non-compliant inference graph can deal with).

Tagging a few people who may be interested in the discussion: @FarrandTom @cliveseldon @axsaucedo @SachinVarghese @arnaudvl .