Open samarpan-rai opened 3 years ago
Normally, you would consider subclassing because parts can be re-used. Scikit-learn has a ClassifierMixin
so that all classifiers are consistent for example.
The thing with this package though ... it's really just a single class. Splitting it up feels like it would add complexity instead of removing it. There's something to be said to perhaps group methods together. But I don't know of a clean way to split up a single class across multiple files.
Yes, I can see that sub-classing may not be ideal but what if we split the methods based on their functionality. I am not too sure about about Mixins yet.
However, we can split the reading functions in a different file such that we can call them without instantiating the class. I think this also touches on issue #81. This is what pandas does when it loads as seen in their dunder init.py.
import clumper
clump = clumper.read_csv(...) # holds for other read functions
Just moving the read functions can remove 300+ lines from the Clumper
class.
Is your feature request related to a problem? Please describe. Our main class is becoming a monolith. Currently
Clumper
class is over 1500 lines. The major contributor is the documentation but it still makes it difficult to navigate through while developing.Describe the solution you'd like Split the class into multiple methods and/or classes. I think we already have a good structure based on tests by functionality. For example, we can split into following system by functionality