Open dlemfh opened 6 years ago
I strongly agree that the repetition across the codebase makes the whole package large and slow to compile. But splitting each model from one single file into a combination of separate code files might make the overall process harder for new developers to follow. So, before we start breaking things into separate chunks, I think we should decide which chunks can be organized into one submodule-like group.
In rstanarm, for example, the developers organized the chunks by code block (e.g., functions, data, tdata or transformed data, and so on). Following that kind of approach could be one good way to go.
choiceRT_lba & choiceRT_lba_single
y_pred and initializing all elements to -1.
As Ben Goodrich mentioned a year back, it may be worth refactoring the Stan files into smaller chunks that can be shared across multiple Stan files.
It works like this: https://github.com/stan-dev/rstanarm/blob/master/src/stan_files/count.stan
You use the #include notation to include a .stan file, and while it behaves as an inline replacement, compilation is only done once even if this code chunk is used multiple times, thus making the compilation process more efficient.
That said, for our package, models of different tasks do not share enough overlapping lines of code to warrant a major refactoring.
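To make the #include idea concrete, here is a minimal sketch of how a shared chunk and a model file could fit together. The file names (chunks/data_general.stan, some_model.stan), the outcome variable, and the toy model are hypothetical and not taken from the package; the meanings given for N, T, and Tsubj in the comments are also assumptions. The sketch uses the current array syntax, and the include path has to be resolvable by whatever tool compiles the model.

```stan
// ---- file: chunks/data_general.stan (hypothetical shared chunk) ----
// No block braces here: the chunk is pasted inline wherever it is included.
  int<lower=1> N;                        // assumed: number of subjects
  int<lower=1> T;                        // assumed: maximum number of trials
  array[N] int<lower=1, upper=T> Tsubj;  // assumed: trials per subject

// ---- file: some_model.stan (hypothetical model file) ----
data {
#include chunks/data_general.stan
  array[N, T] real outcome;              // model-specific data (illustrative only)
}
parameters {
  real mu;
}
model {
  mu ~ normal(0, 1);
  for (i in 1:N) {
    for (t in 1:Tsubj[i]) {
      outcome[i, t] ~ normal(mu, 1);     // toy likelihood, not a real task model
    }
  }
}
```

Because the chunk carries no block braces of its own, the same file can be included in the data block of every model that needs these declarations.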
Thus, good places to start would be:
- data block for handling the general_infos (e.g. N, T, Tsubj)
- gng task models (where the next model incorporates the previous one's parameters)
- dd task models
- prl task models (where, for instance, ficticious_rp incorporates ficticious and rp)
- (gng, prl)
- generated quantities block (which seem to contain replicas of the same lines of code that have appeared earlier in the file)
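As one possible illustration of the generated quantities case, the lines that declare y_pred and initialize all of its elements to -1 could live in a single shared chunk. This is only a sketch: the file name chunks/gq_init_y_pred.stan is hypothetical, the shape of y_pred is an assumption, and it relies on N and T being declared in the data block of the including model.

```stan
// ---- file: chunks/gq_init_y_pred.stan (hypothetical shared chunk) ----
  array[N, T] real y_pred;   // assumed shape: one prediction per subject and trial
  for (i in 1:N) {
    for (t in 1:T) {
      y_pred[i, t] = -1;     // initialize every element to -1
    }
  }

// ---- in each model file that needs it ----
generated quantities {
#include chunks/gq_init_y_pred.stan
  // model-specific statements that overwrite y_pred would follow here
}
```

Whether this is worth doing depends on how similar the generated quantities blocks really are across models; parts that differ per model would stay in the model file after the include line.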