adrinjalali opened this issue 5 years ago
Hi @adrinjalali. Apologies for poor communication about this. I’m reopening this issue because, after feedback from many other users, it sounds like stronger scikit-learn support (and probably pandas to a further degree) would be greatly appreciated. There are still a bunch of challenging technical issues that we would need to resolve, which have stalled progress in this direction though. I and the rest of the team would love to have a chat with you and anyone else with ideas about this to see if we can move forward with integrating scikit-learn better with AIF360. This discussion is probably best served by a call of some sort which we can summarize afterward, but that's up to you.
Happy to see this going forward! I've been working on it on the fork I've made from the project, but as you say it's not an easy task, and there are a lot of things which are just done very differently here compared to scikit-learn.
However, for some of the challenges, we talked about them during the scikit-learn sprint in February in Paris, and some of the related issues are going forward.
I'd be happy to have a call, we can coordinate on your slack channel, where I'm always present.
We had a good conversation about this today. Here’s a summary of what we discussed and next steps:
Challenges/workarounds:
Next steps:
Future considerations:
Feel free to add anything I've missed.
@hoffmansc would it make more sense to do this in a branch on sklearn? Or at least create a major feature request there that drives most of the conversation and points back here? That way, more of the sklearn community would be able to see what's happening around fairness and bias, and be able to contribute ideas, code, etc. @krvarshney
@animeshsingh I'm not sure what you mean by a branch on sklearn. I don't think this would get into the main sklearn repo. For projects like these, we use the scikit-learn-contrib collection of projects to give them more visibility, and aif360, or a variation of it, would be a perfect match there.
There are some constraints that we need to satisfy before that. For instance, I don't think we should have a hard dependency on tensorflow, especially since it doesn't even support Python 3.7 yet, and by the time it does, Python 3.8 will probably be out. Also, there seems to be some focus on Python 2.7 support here, which has already been dropped in many parts of the ecosystem (pandas and sklearn included). We're planning a release in the coming week and it supports Python 3.5+.
But in general, I like the idea of getting this closer to the scikit-learn-contrib realm.
Thanks @adrinjalali for the response. The intention was not to move the core AIF360 there, but essentially to open a feature request in the sklearn community around having a default bias checker and mitigator, and then point back to the branch you or @hoffmansc have created from AIF360 here. This way the community there can be made aware of the effort.
I finally started some work on this in the sklearn-compat branch.
In the README for aif360.sklearn I put together a rough to-do list/roadmap. A lot of it is straightforward, and I think almost all of it can be done with tricks/workarounds, but explicit scikit-learn support for some of these things would be better.
Feedback and contributions are needed and we should probably advertise this in some way in case other people want to contribute.
Thanks @hoffmansc. @adrinjalali it would be great to sync up once on this
The `inprocessing` algorithms are basically like an `Estimator`. Ideally, it should be possible to replace a classifier in a scikit-learn pipeline with one from aif360. An example pipeline (taken from this example) looks like:

But when we move to a model such as `PrejudiceRemover`, we have to break the pipeline and keep the model separate, since it doesn't follow the API requirements needed to fit in a pipeline. Once it can fit in a pipeline, we can use all the other mechanisms already available in sklearn, such as `GridSearchCV`, to find the best hyperparameters for the problem at hand.

That also brings us to the scorers. AIF360's scorers don't fit into sklearn's scoring mechanism either. Once they do, we could use functions such as `make_scorer` to create a scoring function and feed it into sklearn's `GridSearchCV`, for instance. This may not be a trivial task, and some useful features, such as recording and reporting multiple scoring functions in a grid search, are still under discussion and not yet available in sklearn. Until then, we can provide an easy way for users to combine anti-bias scoring functions with performance scoring ones and use them to choose their best pipeline.
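The setup described above can be sketched with plain scikit-learn pieces. This is an illustrative sketch only, not aif360 code: the classifier and metric below are ordinary sklearn stand-ins for where an aif360 estimator and fairness scorer would slot in once they follow the API.

```python
# Illustrative sketch: a standard sklearn classifier and metric stand in
# for the aif360 estimator and fairness scorer discussed above.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, random_state=0)

# A scikit-learn-compatible estimator drops straight into a Pipeline...
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),  # <- an aif360 estimator would go here
])

# ...and any metric wrapped with make_scorer plugs into GridSearchCV.
scorer = make_scorer(balanced_accuracy_score)  # <- a fairness metric would go here

grid = GridSearchCV(pipe, {"clf__C": [0.1, 1.0, 10.0]}, scoring=scorer, cv=3)
grid.fit(X, y)
print(grid.best_params_)
```

Nothing here is specific to the stand-ins: any estimator with `fit`/`predict` and any `(y_true, y_pred)` metric would work, which is exactly why matching the API conventions pays off.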
Another point regarding the API conventions: if the `preprocessing` modules also fit in the sklearn `Pipeline` as transformers, then we can include their selection in a hyperparameter search, which is much easier than manually running them one by one and going through the results to find the best solution.

Right now, transformers which change the number of samples or change the output are not supported in sklearn (AFAIK), but that's also under discussion, and this use case may be a good push for it.