Closed Lilly-May closed 3 months ago
I would like to add documentation examples for the regression classifier. But I also think we should keep the current examples using the MLP models. @Zethson is it okay if I simply add a second example in the same docstring?
Attention: Patch coverage is 83.33333% with 8 lines in your changes missing coverage. Please review.
Project coverage is 63.52%. Comparing base (916c837) to head (7fd7bbe). Report is 2 commits behind head on main.
> I would like to add documentation examples for the regression classifier. But I also think we should keep the current examples using the MLP models. @Zethson is it okay if I simply add a second example in the same docstring?
Yes, please! Also thought about that while reviewing.
What was your impression concerning usage and parameter documentation? Was it too annoying to always be like: "This only applies to the MLP" or the other way around? I'm trying to assess whether we should split them into two functions or roll with what you nicely implemented.
If we want to stick with the `DiscriminatorClassifierSpace` class, I would keep the implementation as it is in this PR, with regression as a parameter choice. The downside of this is that the implementations are quite different (in each method, I have an if-else statement that separates MLP from regression), but that isn't really of interest to the user. So, the fact that several parameters are not applicable to the respective model might be the bigger problem.
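
For illustration only, the single-class layout described above could look roughly like the sketch below. The `model` parameter name, the `hidden_dim` argument, and the method bodies are assumptions made for this example and are not taken from the PR.

```python
from typing import Optional


class DiscriminatorClassifierSpace:
    """Illustrative stand-in; not the pertpy implementation."""

    def __init__(self, model: str = "mlp"):
        # Defaulting to the MLP keeps existing calls working unchanged.
        if model not in ("mlp", "regression"):
            raise ValueError(f"Unknown model: {model!r}")
        self.model = model

    def train(self, hidden_dim: Optional[int] = None) -> str:
        # Every public method has to branch on the chosen model.
        if self.model == "mlp":
            return f"train MLP (hidden_dim={hidden_dim or 512})"
        # hidden_dim (hypothetical) has no meaning for the regression model,
        # so the class must either ignore it silently or reject it.
        if hidden_dim is not None:
            raise ValueError("hidden_dim only applies to the MLP model")
        return "fit regression"
```

The extra check in the regression branch is the kind of "this only applies to the MLP" bookkeeping that the parameter documentation would also have to spell out.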
I think the alternative would be to have two different classes, something like `MLPClassifierSpace` and `RegressionClassifierSpace`. For the latter, we would probably only have one method, `compute`, analogous to the other Perturbation Spaces, instead of `load`, `train` and `get_embeddings`. Personally, I think this approach might be a bit more intuitive and easier to understand for users, but it's not backward compatible, as the `DiscriminatorClassifierSpace` would be removed.
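
A rough sketch of how the two-class split might look from the user's side. Only the class and method names come from the discussion; the plain-dict input, the `max_epochs` argument, and the per-label mean used as a placeholder embedding are assumptions for this example and do not reflect how either model actually works.

```python
from statistics import fmean


def _placeholder_embeddings(data):
    # Placeholder only: per-label column means stand in for whatever
    # the trained model would actually produce.
    return {label: [fmean(col) for col in zip(*rows)] for label, rows in data.items()}


class MLPClassifierSpace:
    """Hypothetical class keeping the existing three-step workflow."""

    def __init__(self):
        self._data = None
        self._trained = False

    def load(self, data):
        self._data = data
        return self

    def train(self, max_epochs: int = 10):
        if self._data is None:
            raise RuntimeError("call load() before train()")
        self._trained = True  # real training loop omitted

    def get_embeddings(self):
        if not self._trained:
            raise RuntimeError("call train() before get_embeddings()")
        return _placeholder_embeddings(self._data)


class RegressionClassifierSpace:
    """Hypothetical class exposing a single entry point."""

    def compute(self, data):
        # Fitting and embedding extraction happen in one call.
        return _placeholder_embeddings(data)
```

With this split, each class documents only the parameters that apply to it, at the cost of removing the shared `DiscriminatorClassifierSpace` name.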
> I think the alternative would be to have two different classes, something like `MLPClassifierSpace` and `RegressionClassifierSpace`. For the latter, we would probably only have one method, `compute`, analogous to the other Perturbation Spaces, instead of `load`, `train` and `get_embeddings`. Personally, I think this approach might be a bit more intuitive and easier to understand for users, but it's not backward compatible, as the `DiscriminatorClassifierSpace` would be removed.
Concerning backwards compatibility: We could alias the classes. A simple `DiscriminatorClassifierSpace = MLPClassifierSpace` in an `__init__.py` would probably do the trick.
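
A minimal sketch of the aliasing idea, with a stand-in class body (the real class would be imported in `__init__.py` rather than defined there):

```python
class MLPClassifierSpace:
    """Stand-in for the renamed class; the body is illustrative only."""

    def train(self) -> str:
        return "trained"


# Backward-compatible alias: the old name points at the very same class object,
# so existing imports and isinstance checks keep working.
DiscriminatorClassifierSpace = MLPClassifierSpace
```

One thing to keep in mind is that the alias is not a separate class, so `DiscriminatorClassifierSpace.__name__` reports `MLPClassifierSpace` in reprs and tracebacks.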
I think that if we really wanted to, we could probably provide a somewhat sane `load` and `get_embeddings` also for the `RegressionClassifierSpace`, but they'd be super simple, right? I'll leave this up to you to judge and we'll roll with whatever you think is best.
But yeah, I think splitting this into two is the better approach.
PR Checklist
- `docs` is updated

Description of changes
- `DiscriminatorClassifierSpace` object. By default, it's set to MLP, ensuring backward compatibility with the previous usage
- `DiscriminatorClassifierSpace`: The adata is now provided via a fixture and is subsequently used by both the MLP and the regression classifier testing methods

Technical details
I tested the regression classifier implementation on the Norman dataset using the following code:
Which results in the following UMAP:

![umap](https://github.com/theislab/pertpy/assets/93096564/b356d964-e881-48f1-97b7-73740f0487bf)