Refactor model API for abstract parallelization

gibsramen commented 3 years ago

Closes #51

Important changes:

Removed RegressionModel subclass
BaseModel is now an abstract class (cannot be directly instantiated)
BaseModel has a create_regression method that behaves equivalently to the previous RegressionModel
Added two subclasses - TableModel and SingleFeatureModel for serial & parallel models respectively
Added a ModelIterator class
Changed behavior of specify_model to save params, coords, etc. as attributes.
- Other parameters such as alr_params can be passed and will be saved in the self.specifications dict
- This was done so we are not limited to ALR models in TableModel. Will likely need to update/decompose full_fit_to_inference in the future.

mortonjt commented 3 years ago

Yes, definitely save ilr for later. I think the more important thing for this PR to try to nail down the naming scheme and avoid implementing road blocks as highlighted above

On Sat, Jul 3, 2021 at 2:25 PM Gibs @.***> wrote:

@.**** commented on this pull request.

In birdman/model_base.py https://github.com/gibsramen/BIRDMAn/pull/52#discussion_r663405394:

@@ -280,103 +208,69 @@ def to_inference_object(self) -> az.InferenceData: args = { k: self.specifications.get(k) for k in ["params", "coords", "dims", "posterior_predictive",

"log_likelihood"]

"log_likelihood", "alr_params"]

Gotcha - yes I can see how this could get hairy fast...

Do you think we could save this for a different PR? There's probably some clever inheritance stuff we could do (multiple-inheritance maybe?) but I'd like to get this refactor implemented first.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/gibsramen/BIRDMAn/pull/52#discussion_r663405394, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA75VXLULNP7M3UEM2WULE3TV5W2VANCNFSM47VQA3UA .

gibsramen commented 3 years ago

Thinking about this some more I'm not sure I fully understand the need to rename/create ALRModel. If we remove alr_params from TableModel.specifications isn't that sufficient? A CLRModel would be able to use the same inference conversion but specifying no fitted parameters are in ALR coordinates. (And ILRModel we can figure out later.)

mortonjt commented 3 years ago

Yes I think that’ll be the solution for now.

On Sat, Jul 3, 2021 at 3:31 PM Gibs @.***> wrote:

Thinking about this some more I'm not sure I fully understand the need to rename/create ALRModel. If we remove alr_params from TableModel.specifications isn't that sufficient? A CLRModel would be able to use the same inference conversion but specifying no fitted parameters are in ALR coordinates. (And ILRModel we can figure out later.)

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/gibsramen/BIRDMAn/pull/52#issuecomment-873473104, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA75VXOO6N4E6RJC23TB2M3TV56R5ANCNFSM47VQA3UA .

mortonjt commented 3 years ago

Right ... I wonder if we even need the alr_params here right? Can we just remove it completely? And would we still be able to obtain inference objects if we do so

On Tue, Jul 6, 2021 at 9:55 AM Gibs @.***> wrote:

@.**** commented on this pull request.

In birdman/model_util.py https://github.com/gibsramen/BIRDMAn/pull/52#discussion_r664683330:

@@ -9,7 +9,7 @@ from .util import convert_beta_coordinates

-def single_fit_to_inference( +def full_fit_to_inference( fit: CmdStanMCMC, params: Sequence[str], coords: dict,

Or actually probably cleaner to just assign to an empty list if alr_params == None

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/gibsramen/BIRDMAn/pull/52#discussion_r664683330, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA75VXOUKJVV2KSZ4R5UXDLTWMRO5ANCNFSM47VQA3UA .

gibsramen commented 3 years ago

Removing alr_params from full_fit_to_inference would require some rethinking. Couple of options below

1) Specifying values in coords to be D-1 rather than D for the default (ALR) models and performing ALR -> CLR in e.g. NegativeBinomial.to_inference_object. In this case full_fit_to_inference would essentially be a light wrapper for az.from_cmdstanpy (similar to how it is currently done in the single feature equivalent). 2) Having full_fit_to_inference take some sort of optional transformation function/callable for post-processing (this could I guess be used to build ILR/cosh/etc. in an included transform.py module). 3) Leave any transformations up to the user

I think (2) is the best approach but open to other options.

mortonjt commented 3 years ago

Let's try to flesh out option 2 a bit more.

If I'm understanding this correctly, the whole problem stems from az.from_cmdstanpy where the coords for the features needs to be passed in correctly?

Wouldn't each model have their own coords defined? ALR / CLR / ILR should have their own feature names specified in these models, so I would think that these models should be passed in directly to az.from_cmdstanpy. So no need for additional transforms, the raw model parameters would be returned.

Regarding the alr to clr conversion, I recommend creating a separate method that does that -- this could take in an inference object in alr coordinates and return a clr transformed object. This should further simplify the interface. I think this would also be a bit easier from a user perspective -- we can provide helper functions that can manipulate the InferenceData objects to do the coordinate conversations afterwards.

gibsramen commented 3 years ago

Ok - that sounds like a plan. We should have full_fit_to_inference do no transformation on its own and handle that in a separate function/module.

I'll address that in a new PR - is there anything else you think should be addressed in this PR before merging?

mortonjt commented 3 years ago

👍 I think this is ready to merge.

biocore / BIRDMAn

Refactor model API for abstract parallelization #52

@.**** commented on this pull request.

@.**** commented on this pull request.