nlmixr2 / nlmixr2est

nlmixr2 estimation routines
https://nlmixr2.github.io/nlmixr2est/
GNU General Public License v2.0
7 stars 3 forks source link

Long-term Feature Request: Separate functionality into separate, focused packages #68

Closed billdenney closed 2 years ago

billdenney commented 2 years ago

For this initial issue, the goal is to spark discussion and not to say that the initial thoughts here are the final way that things should go.

The intent of this issue is to seed discussion of a potential to separate nlmixr2 into smaller, more focused packages to aid in code maintenance, testing, and to long-term help simplify the project overall (with a short-term larger effort to do the separation).

From the user side, the goal would be to make nlmixr2 the single package where we would point users for using the package and for documentation. (Meaning, that we would not typically point someone to rxode2 or another package for how to do a typical user-focused task. Advanced users could go deeper into the structure, as needed.)

From the developer side, the packages would be made smaller and more modular allowing easier code modification with clearer interfaces between parts of the code, and to have a clear delineation of package hierarchy and purpose. Testing could be more focused to a specific package, and therefore faster and easier to diagnose issues as they arise. (The separation should try to minimize developer efforts to jump between packages to fix something, hopefully. But, this could be a problem that comes up.)

The structure that I would propose is below with the goal that nlmixr2 would end up with relatively little code itself and it would mainly gather the other packages together in the right ways to make it simpler for the user. What nlmixr2 would have is a lot of documentation.

The nesting of the list is intended to show dependencies between packages, and there should be no circular dependencies (i.e. where a package ends up indirectly depending on itself). All package names should be considered placeholders for clarity (I'm not great at naming things, the rules on package naming are "The mandatory ‘Package’ field gives the name of the package. This should contain only (ASCII) letters, numbers and dot, have at least two characters and start with a letter and not end in a dot. If it needs explaining, this should be done in the ‘Description’ field (and not the ‘Title’ field)." (https://cran.r-project.org/doc/manuals/r-devel/R-exts.html#The-DESCRIPTION-file)).

mattfidler commented 2 years ago

There is a very technical reason why the . doesn't work easily in Rcpp packages. From my perpective smaller names are better

mattfidler commented 2 years ago

I also think John's reporting package should fit in here as well as the babelmixr. They could be suggested and then imported on load-up with startup messages

billdenney commented 2 years ago

@mattfidler , thanks for the thoughts, and I'm not wed to any of the names. My goal was to propose how I thought that the packages could fit together keeping functionality of a package focused and to hopefully enhance testability.

For the fact that rxode is popular enough on its own that it could be an entry-point for some people who do not need the estimation functions, that makes sense. I suppose that I'd then propose to have rxode2 follow a similar mantra to nlmixr2:

From some of your comments, I appreciate that the UI may not yet have the right home. As I understand the UI right now, it is doing many different things, and it may fit best either in something like rxode2ui or somehow in another place. Having worked with the UI code a bit before, it does a lot of heavy lifting to convert ini() and model() into something usable. And, it also does a lot more. I think that it also has a different interface with different estimation methods that makes it even more complex.

My (limited) understanding of the UI is that it does the following things. And, I think that a better understanding of the UI would improve my understanding of where it fits best. I think that various parts may fit in different places depending on what they are doing. For instance, there may be a core UI object that focuses on conversion of ini() and model() to a parsed object, and then additional functions that operate on increased levels of complexity related to those parsed objects. (I think that's what already happens at least to some degree, but I don't know.)

I agree that there are other packages like babelmixr and reporting packages that may fit in other ways. We'll need to think of how they fit in, too. We will need to think of how broadly we want to make the umbrella of nlmixr2 dependencies (where I would think that babelmixr fits in the umbrella as an optional dependency, but I'm less certain if John's reporting package would fit inside or above the umbrella).

mattfidler commented 2 years ago

Hm, It could be

mattfidler commented 2 years ago

The event creation could also be split of to another package

billdenney commented 2 years ago

The rxode2 split makes sense to me (though I think I'd put rxode2plot under rxode2-- possibly a typo?).

And, would event creation fit into the rxode2ui package? It seems like a part of the user interface.

mattfidler commented 2 years ago

The rxode2 split makes sense to me (though I think I'd put rxode2plot under rxode2-- possibly a typo?).

Yes

And, would event creation fit into the rxode2ui package? It seems like a part of the user interface.

Maybe, it is simply helper functions for nonmem-style data creation. I could see a use case where you could have an independent event table creation that could be used by other packages, but perhaps that is too grand of an idea. They probably wouldn't want to have the rxode2 dependency if this is true...?

mattfidler commented 2 years ago

As for cons, perhaps consider

mattfidler commented 2 years ago

And of course we need to ask why (from Donald Knuth):

Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.

To me it is about being able to more simply test certain features of a package, is that correct? Perhaps make it easier for some dependencies to be added

mattfidler commented 2 years ago

So maybe we should consider what is the most critical pieces of this refactoring. The rxode2 split-off is secondary for me, but the nlmixr2 makes more sense with the large amounts of time needed to test different features of nlmixr2

billdenney commented 2 years ago

I definitely don't want to refactor for refactoring's sake.

My goals with the refactoring are:

john-harrold commented 2 years ago

I agree that there are other packages like babelmixr and reporting packages that may fit in other ways. We'll need to think of how they fit in, too. We will need to think of how broadly we want to make the umbrella of nlmixr2 dependencies (where I would think that babelmixr fits in the umbrella as an optional dependency, but I'm less certain if John's reporting package would fit inside or above the umbrella).

Instead of doing all this now, can we decide on a general naming convention? I'm cool something like nlmixr2rpt. But I'd rather figure that part out now so I can change everything and be done with it :). Would we want to do the same with xpose.nlmixr? LIke call it nlmixr2xpose?

mattfidler commented 2 years ago

I agree that a naming convention for most packages would make sense. To me nlmixr2pkg would work.

This is a long term project, I am just trying to figure out what the value for each piece is overall. I can see the value in splitting off an umbrella nlmixr for sure, but am unsure at what point to start/stop with the other pieces. I think that we have to think through any unintended consequences.

mattfidler commented 2 years ago

Looking at the scope of work, renaming rxode2 is much harder than renaming nlmixr2. I propose to rename that and keep rxode2 as the main engine package. If there is another package, like plotting, split it of from rxode2 piecewise.

I will try to get nlmixr2est working

billdenney commented 2 years ago

Alternate name thought for nlmixr2support: nlmixr2extra? It aligns with some other "extra" packages in other parts of the R ecosystem.

mattfidler commented 2 years ago

I like nlmixr2extra since it is smaller.

billdenney commented 2 years ago

@mattfidler brought up a good point in nlmixr2/nlmixr2#1: How do we want to pull packages up to the level of nlmixr2 loading?

For the below description, I'm going to try to be clear when using Depends, Imports, and Suggests as fields in the DESCRIPTION file, and if they don't have code formatting, they are being used more colloquially.

My general thought was that the top-level package, nlmixr2, should try to be robust against being booted off of CRAN. For that, we would want to minimize dependencies in Depends and Imports and their sub-packages. I don't think that a package is booted off of CRAN if a Suggests package is removed or breaks something (because it shouldn't break core functionality in the package).

So, my thought is that nlmixr2 would only have Depends of nlmixr2est, and nlmixr2est would generally try to minimize Depends and Imports packages beyond rxode2 (within reason-- I don't want to reinvent the wheel to prevent a dependency).

Then, for nlmixr2, all other packages that can help such as nlmixr2plot, nlmixr2extra, babelmixr, and xpose.nlmixr (and any others) would fall into a Suggests lists. Those packages would then be checked for and attached (attachNamespace(), I think) in an .onAttach() function in nlmixr2. If the package is not present, then a packageStartupMessage() in .onAttach() would tell the user if they install that other package, then they will get such and such additional functionality.

Please tell me if you think that this makes things too complex. My hope is that it keeps nlmixr2 on CRAN more robustly to other packages changing around it. But, I don't want to make things too complex for us nor confusing at all for the users.

mattfidler commented 2 years ago

I think all of this is covered right now.

billdenney commented 2 years ago

Definitely. Thanks for the hard work on it!