kushalkolar / scikit-ophys

Trying to alleviate the pain associated with calcium imaging analysis
GNU General Public License v3.0

API brainstorming #1

Open kushalkolar opened 1 week ago

kushalkolar commented 1 week ago

From reading the sklearn design decisions paper:

All transformers must be symmetric in their input and output, so that the output of any transformer can be passed as the input to any other.
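A minimal sketch of what that symmetry could look like, assuming every transformer takes and returns a 2D array of the same shape. `BaselineSubtract`, `ZScore`, and `chain` are hypothetical names for illustration only, not existing scikit-ophys or sklearn API:

```python
import numpy as np


class BaselineSubtract:
    """Hypothetical transformer: subtracts the per-column minimum learned in fit."""

    def fit(self, X, y=None):
        self.baseline_ = X.min(axis=0)
        return self

    def transform(self, X):
        return X - self.baseline_


class ZScore:
    """Hypothetical transformer: centers and scales each column."""

    def fit(self, X, y=None):
        self.mean_ = X.mean(axis=0)
        self.std_ = X.std(axis=0)
        return self

    def transform(self, X):
        return (X - self.mean_) / self.std_


def chain(steps, X):
    """Fit each step on the current data and feed its output to the next step."""
    for step in steps:
        X = step.fit(X).transform(X)
    return X


rng = np.random.default_rng(0)
traces = rng.random((100, 5))  # assumed layout: (n_frames, n_components)

# Because input and output have the same type and shape, the order is free.
out = chain([BaselineSubtract(), ZScore()], traces)
out_swapped = chain([ZScore(), BaselineSubtract()], traces)
```

Since both classes accept and return the same kind of array, either ordering runs without adapter code, which is the property the point above asks for.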

Any post-processing has to be done on the output of an estimator, NOT by the estimator itself. Following the sklearn design principles, a model/estimator should be a discrete unit: it takes parameters and inputs, and fits a model. That is it. Any curation of the model must be done elsewhere. For example, with CNMF, filtering components based on other parameters would be done by another object, perhaps a predictor or another estimator? -> This would also make unit testing much, much easier, and make it easier to build visualizations.
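A rough sketch of that separation, using a truncated SVD as a stand-in because a real CNMF is out of scope here. `SVDComponents` and `VarianceFilter` are hypothetical names, and the "fraction of energy" criterion is just an assumed example of a curation rule:

```python
import numpy as np


class SVDComponents:
    """Stand-in estimator (NOT CNMF): truncated SVD. It only fits, nothing else."""

    def __init__(self, n_components=3):
        self.n_components = n_components

    def fit(self, X, y=None):
        _, s, Vt = np.linalg.svd(X, full_matrices=False)
        k = self.n_components
        self.components_ = Vt[:k]
        self.singular_values_ = s[:k]
        return self


class VarianceFilter:
    """Hypothetical curation object: decides which components to keep.

    It works on plain arrays and never touches the estimator.
    """

    def __init__(self, min_fraction=0.1):
        self.min_fraction = min_fraction

    def select(self, singular_values):
        energy = singular_values ** 2
        return energy / energy.sum() >= self.min_fraction


rng = np.random.default_rng(0)
X = rng.random((50, 20))

model = SVDComponents(n_components=5).fit(X)  # fitting
keep = VarianceFilter(min_fraction=0.1).select(model.singular_values_)  # curation
curated = model.components_[keep]

# The filter can be unit tested without fitting any model at all.
toy_mask = VarianceFilter(min_fraction=0.5).select(np.array([3.0, 1.0]))
```

The last line is the unit-testing benefit in practice: the curation logic is exercised with a two-element array and no estimator.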

Any matrices or vectors used for initializing a model must be parameters to the constructor. For example: https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.SparsePCA.html#sklearn.decomposition.SparsePCA
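In the linked `SparsePCA`, the initial matrices are the `U_init` and `V_init` constructor parameters. A minimal sketch of the same pattern follows; `InitializedFactorizer` and `components_init` are hypothetical names, and the single least-squares step is only a placeholder for a real fitting routine:

```python
import numpy as np


class InitializedFactorizer:
    """Hypothetical estimator that takes its initialization via the constructor."""

    def __init__(self, n_components=2, components_init=None, random_state=0):
        # The constructor only stores parameters: no validation, no copying.
        self.n_components = n_components
        self.components_init = components_init
        self.random_state = random_state

    def fit(self, X, y=None):
        if self.components_init is None:
            rng = np.random.default_rng(self.random_state)
            components = rng.random((self.n_components, X.shape[1]))
        else:
            # Copy so that fitting never mutates the user's array.
            components = np.array(self.components_init, dtype=float)
        # Placeholder fit: one least-squares step for X ≈ traces @ components.
        self.traces_ = X @ np.linalg.pinv(components)
        self.components_ = components
        return self


rng = np.random.default_rng(1)
X = rng.random((30, 8))
init = rng.random((2, 8))

model = InitializedFactorizer(n_components=2, components_init=init).fit(X)
```

Keeping the initialization as a stored parameter means the estimator can be cloned or re-fit with the same starting point, and the fitted result lives in separate trailing-underscore attributes.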

Must support multiple backends (CPU, GPU) and lazy compute, especially for visualization. Perhaps dask. Keep WGPU for compute in mind from the beginning. Follow sklearn's development on array API support.
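One possible way to stay backend-agnostic is to look up the array's own namespace, as the array API standard describes via `__array_namespace__`, and only call functions from it. This is a sketch under assumptions: `get_namespace` and `center_columns` are hypothetical helpers, only NumPy is exercised here, and lazy backends such as dask are not shown:

```python
import numpy as np


def get_namespace(x):
    """Return the array library that x belongs to, falling back to NumPy."""
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    return np  # older NumPy arrays do not expose __array_namespace__


def center_columns(x):
    """Subtract the per-column mean using only functions from x's own namespace."""
    xp = get_namespace(x)
    return x - xp.mean(x, axis=0)


x = np.arange(12.0).reshape(4, 3)
centered = center_columns(x)
```

The function body never names NumPy directly, so in principle any array type that implements the standard could be passed in, but that is untested here.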

kushalkolar commented 1 week ago

OK, I've read through the 2013 design paper; need to go through this next: https://scikit-learn.org/stable/developers/develop.html#rolling-your-own-estimator