Closed by wptmdoorn 3 years ago
For Regression, it would make sense for the calculation function to return its output, i.e.
slope, intercept = Regression(method1, method2).calculate()
rather than
r = Regression(method1, method2)
slope = r.statistics["slope"]
intercept = r.statistics["intercept"]
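A minimal sketch of that calling convention, assuming a hypothetical Regression class that fits an ordinary least-squares line. It is not the methcomp implementation (whose regression methods differ); it only illustrates a calculate() that both stores and returns its results:

```python
class Regression:
    """Hypothetical stand-in for illustration, not the methcomp class."""

    def __init__(self, method1, method2):
        if len(method1) != len(method2):
            raise ValueError("method1 and method2 must have the same length")
        self.method1 = [float(v) for v in method1]
        self.method2 = [float(v) for v in method2]
        self.statistics = {}  # still available for attribute-style access

    def calculate(self):
        """Fit y = intercept + slope * x by least squares and return both values."""
        n = len(self.method1)
        mean_x = sum(self.method1) / n
        mean_y = sum(self.method2) / n
        sxx = sum((x - mean_x) ** 2 for x in self.method1)
        if sxx == 0:
            raise ValueError("method1 has no variance; slope is undefined")
        sxy = sum(
            (x - mean_x) * (y - mean_y)
            for x, y in zip(self.method1, self.method2)
        )
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        self.statistics = {"slope": slope, "intercept": intercept}
        return slope, intercept


# One-line usage, as proposed in the comment above
slope, intercept = Regression([1, 2, 3, 4], [3, 5, 7, 9]).calculate()
```

Returning the values while still filling the dictionary keeps both access styles working.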
@thomasburgess
Could you look at today's commit and tell me whether you approve of this structure for the API? If we reach a consensus here, I will finalize the BlandAltman class and merge it into master. From then on, we can start modifying the other functionality of the API (e.g. Regression and Surveillance Error Grids) to match the new structure.
The general idea (as we discussed by email before) is:
The .compute() function. I would prefer to store its results in a dictionary (regardless of the class), as this provides the most consistent API. The results are also stored internally in the _output variable (see lines 52-58 in the revised BlandAltman.py).
The .plot() function. Importantly, if the user has already computed the statistics, it will reuse them; otherwise it will first call .compute() internally to generate the statistics (see L82-L83 in the revised BlandAltman.py).
Would you be able to look at this and share your thoughts? I really appreciate it, thank you!
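The compute/plot pattern described in this comment can be sketched roughly as follows. This is an illustration under assumptions, not the code in the revised BlandAltman.py: the dictionary keys, the limit_of_agreement parameter, and the ax and color arguments are all hypothetical.

```python
import statistics


class BlandAltman:
    """Hypothetical sketch of the compute/plot split, not the methcomp class."""

    def __init__(self, method1, method2, limit_of_agreement=1.96):
        # Calculation arguments live in the constructor.
        self.method1 = list(method1)
        self.method2 = list(method2)
        self.limit_of_agreement = limit_of_agreement
        self._output = None  # filled in by compute()

    def compute(self):
        """Compute the statistics, cache them in _output, and return the dict."""
        diffs = [a - b for a, b in zip(self.method1, self.method2)]
        mean = statistics.mean(diffs)
        sd = statistics.stdev(diffs)
        self._output = {
            "mean": mean,
            "sd": sd,
            "loa_lower": mean - self.limit_of_agreement * sd,
            "loa_upper": mean + self.limit_of_agreement * sd,
        }
        return self._output

    def plot(self, ax=None, color="black"):
        """Draw the plot; markup arguments (color, axes) live here."""
        if self._output is None:
            self.compute()  # only computed if the user has not done so yet
        if ax is None:
            import matplotlib.pyplot as plt  # imported lazily, only for plotting

            ax = plt.gca()
        means = [(a + b) / 2 for a, b in zip(self.method1, self.method2)]
        diffs = [a - b for a, b in zip(self.method1, self.method2)]
        ax.scatter(means, diffs, color=color)
        for key in ("mean", "loa_lower", "loa_upper"):
            ax.axhline(self._output[key], color=color, linestyle="--")
        return ax


# Statistics are usable on their own, without producing a figure
stats = BlandAltman([10, 12, 14], [9, 12, 15]).compute()
```

Importing matplotlib only inside plot() means that users who want just the numbers never need a plotting backend.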
The general ideas are sound.
I would propose keeping the main functions that exist in regression.py but getting rid of the classes there. Those functions would then call the Regressor classes from regression.py. This way Regressor is usable on its own, without breaking backwards compatibility.
Make a pull request for BlandAltman, and I can give you a line-by-line review of the code. I will update the Regressor class to be closer to this idea. It could be that we can use the same base class for BlandAltman and Regressor, but that can be done once the updated BlandAltman has been merged.
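One way to read this proposal is a thin function that keeps its existing call style and delegates to a standalone class. The sketch below assumes hypothetical names (LinearRegressor, linear) and a least-squares fit; the actual functions and signatures in regression.py may differ.

```python
class LinearRegressor:
    """Hypothetical regressor class, usable without any plotting."""

    def __init__(self, method1, method2):
        self.method1 = [float(v) for v in method1]
        self.method2 = [float(v) for v in method2]

    def compute(self):
        """Return the least-squares slope and intercept as a dict."""
        n = len(self.method1)
        mean_x = sum(self.method1) / n
        mean_y = sum(self.method2) / n
        sxx = sum((x - mean_x) ** 2 for x in self.method1)
        sxy = sum(
            (x - mean_x) * (y - mean_y)
            for x, y in zip(self.method1, self.method2)
        )
        slope = sxy / sxx
        return {"slope": slope, "intercept": mean_y - slope * mean_x}

    def plot(self, ax, color="blue"):
        """Draw the points and the fitted line on a matplotlib-style axes."""
        result = self.compute()
        xs = [min(self.method1), max(self.method1)]
        ys = [result["intercept"] + result["slope"] * x for x in xs]
        ax.scatter(self.method1, self.method2, color=color)
        ax.plot(xs, ys, color=color)
        return ax


def linear(method1, method2, ax, color="blue"):
    """Hypothetical legacy-style function: same call style, delegates to the class."""
    return LinearRegressor(method1, method2).plot(ax, color=color)
```

Existing callers keep using the function, while new code can instantiate the class directly and call compute() without plotting.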
I will create a pull request for the separation branch now, thanks a lot @thomasburgess
Merged, so issue closed
This issue follows issue #1 to have a broader discussion about how to extend the API to support not only plotting but also calculation for each of our functions (regression, Bland-Altman and surveillance error grids). I just committed a first example in a separate branch in https://github.com/wptmdoorn/methcomp/commit/7ff14e0026d794696c2d22fb8f861df4240c593f. This code is very preliminary and most likely suboptimal; it is just here to provide some food for thought. As exemplified in examples/blandaltman.py, it works like this:
statistics()
plot()
A major benefit is that we also separate the arguments needed for the calculations (e.g. the data, confidence intervals, and so on) from the actual markup of the plot (e.g. point color, graph title). Very happy to hear feedback!