From reading the sklearn design decisions paper:
All transformers must be symmetric in their input and output, so the output of one transformer can be passed arbitrarily into another.
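A minimal sketch of what this symmetry buys you, using two hypothetical transformers that follow the sklearn fit/transform contract (the class names here are illustrative, not sklearn's):

```python
import numpy as np

class Center:
    """Subtract the per-feature mean (hypothetical transformer)."""
    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        return self

    def transform(self, X):
        return X - self.mean_

class Scale:
    """Divide by the per-feature standard deviation (hypothetical transformer)."""
    def fit(self, X):
        self.std_ = X.std(axis=0)
        return self

    def transform(self, X):
        return X / self.std_

X = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
# Because every transformer maps an array to an array of the same kind,
# the output of one feeds directly into the next:
Xc = Center().fit(X).transform(X)
Xcs = Scale().fit(Xc).transform(Xc)
```

This is exactly the property sklearn's `Pipeline` exploits: any transformer whose output is array-shaped can be chained in front of any other.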
Predictors
Estimators
Any post-processing must be done with the output of an estimator, NOT by the estimator itself. Following the sklearn design principles, a model/estimator should be a discrete unit: it takes parameters and inputs, and fits a model. That is it. Any curation of the model must be done elsewhere; for example, with CNMF, filtering components based on other parameters would be done by another object, perhaps a predictor or another estimator. This would also make unit testing much easier, and make it easier to build visualization.
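A rough sketch of the split, with hypothetical names (the SVD here is just a stand-in for a real factorization like CNMF, and `snr_` is a stand-in quality metric):

```python
import numpy as np

class ComponentEstimator:
    """Discrete unit: takes parameters, fits components_. No curation."""
    def __init__(self, n_components=3):
        self.n_components = n_components

    def fit(self, X):
        # stand-in for a real factorization (e.g. CNMF)
        u, s, vt = np.linalg.svd(X, full_matrices=False)
        self.components_ = vt[: self.n_components]
        self.snr_ = s[: self.n_components]  # stand-in quality metric
        return self

class ComponentFilter:
    """Separate object: curates a fitted estimator's output."""
    def __init__(self, min_snr=1.0):
        self.min_snr = min_snr

    def transform(self, estimator):
        keep = estimator.snr_ >= self.min_snr
        return estimator.components_[keep]

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
model = ComponentEstimator(n_components=3).fit(X)
good = ComponentFilter(min_snr=0.5).transform(model)
```

Each piece can be unit-tested in isolation: the estimator against known factorizations, the filter against hand-built fitted attributes, with no coupling between the two.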
Any matrices or vectors used for initializing a model must be a parameter to the constructor. For example: https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.SparsePCA.html#sklearn.decomposition.SparsePCA
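A sketch of the convention with a hypothetical estimator (`MatrixFactorizer` and `components_init` are illustrative names): the init matrix goes into `__init__`, is stored unchanged, and is only consumed inside `fit`, never passed as an extra argument to `fit` itself.

```python
import numpy as np

class MatrixFactorizer:
    """Hypothetical estimator taking an optional init matrix as a constructor param."""
    def __init__(self, n_components=2, components_init=None):
        # sklearn convention: store constructor params unchanged
        self.n_components = n_components
        self.components_init = components_init

    def fit(self, X):
        if self.components_init is not None:
            components = np.asarray(self.components_init, dtype=float)
        else:
            rng = np.random.default_rng(0)
            components = rng.normal(size=(self.n_components, X.shape[1]))
        # ... iterative refinement would go here ...
        self.components_ = components
        return self

X = np.ones((4, 3))
init = np.zeros((2, 3))
model = MatrixFactorizer(n_components=2, components_init=init).fit(X)
```

Keeping init arrays in the constructor means `get_params`/`set_params`-style cloning and grid search work without special-casing `fit` signatures.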
Must support multiple backends (CPU, GPU) and lazy compute, especially for visualization; perhaps via dask. Keep WGPU for compute in mind from the beginning, and follow sklearn's ongoing array API work.
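One way backend-agnostic compute could look, in the spirit of the array API standard sklearn is adopting: compute functions ask the array for its namespace instead of hard-coding numpy. This is a sketch under the assumption that GPU arrays (e.g. cupy) or lazy arrays (e.g. dask.array) expose `__array_namespace__` or numpy-compatible functions; only plain numpy is exercised here.

```python
import numpy as np

def get_namespace(X):
    """Return an array-API-like namespace for X; fall back to numpy."""
    if hasattr(X, "__array_namespace__"):
        return X.__array_namespace__()
    return np

def normalize(X):
    """Backend-agnostic compute: every call goes through the array's namespace."""
    xp = get_namespace(X)
    return X / xp.max(xp.abs(X))

X = np.array([-2.0, 1.0, 4.0])
out = normalize(X)
```

Writing compute this way from the start means a GPU or lazy backend is a drop-in: the same `normalize` would return a lazy dask graph or a device array without modification, which is what makes lazy visualization pipelines feasible.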