Since we will initiate work on R/Python/* interfaces this summer, I wanted to bring up this key issue: maintainability. It has to be at the forefront of our thinking when/if we create a multilingual architecture.
I was reading the following post from a Stan developer and I think it is an important read to prepare for this: https://statmodeling.stat.columbia.edu/2023/02/08/implementing-laplace-approximation-in-stan-whats-happening-under-the-hood/
Some quotes from it: "Before it is “in Stan” in a way that you can use it, we have to repeat this entire process for integration into CmdStan"; "Then when it’s merged into CmdStan, we repeat the whole process again for CmdStanPy."; "Then when CmdStanPy is done, everything we did for CmdStanPy gets done again for CmdStanR, but the interfaces might change slightly to match R coding and interface conventions."; "by my count we need a pull request and code review for: Stan, CmdStan, CmdStan docs, CmdStanPy, and CmdStanR. It’ll require at least me (basic feature), Mitzi (CmdStan and CmdStanPy) and Jonah (CmdStanR) to be involved, plus three additional people to review the pull requests."
If not done carefully in Pigeons, this will stall the project; we don't have the luxury of Stan's large staff.
Concretely, I think we want to avoid a situation where, in order for someone to push an update (e.g., a new algo as in the post), we need synchronized actions (1) in many repos, and (2) in repos written in different languages, which is worse since the set of people who are expert in all the languages involved will necessarily be smaller.
What this means is that there are going to be trade-offs between usability of the interface (UX/feature completeness) and maintainability. I think it is OK to have an interface that covers 80% of Pigeons' features (the most frequently used ones), with extra customization requiring Julia. Maybe there are some programming tricks that could help here, e.g. a macro superseding @kwdef which, on top of @kwdef's behaviour, also generates key components of the Python and R interfaces and documentation. Potentially, could the whole pigeonr and pygeons repos be generated and deployed as part of the current deployment pipeline? Alternatively, the macro could generate a command line interface plus some JSON specifying command line arguments, documentation, etc., which would be consumed by the R and Python interfaces. But really we should keep the number of repos that we have to maintain very low at this stage.
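
To make the JSON idea a bit more concrete, here is a minimal sketch of what the Python side could look like if it were driven entirely by a generated spec. Everything in it is an assumption for illustration: the spec layout, the `pigeons-cli` command name, and the argument names and defaults are made up, not an existing Pigeons schema or CLI. The point is only that the Python wrapper holds no knowledge of individual options, so adding an option on the Julia side would not require a hand-written change here.

```python
import json

# Hypothetical spec, standing in for what a Julia-side macro might emit.
# The layout, command name, arguments and defaults are illustrative only.
SPEC_JSON = """
{
  "command": "pigeons-cli",
  "arguments": [
    {"name": "n_rounds", "type": "int", "default": 10, "doc": "Number of rounds."},
    {"name": "n_chains", "type": "int", "default": 10, "doc": "Number of chains."},
    {"name": "seed", "type": "int", "default": 1, "doc": "Random seed."}
  ]
}
"""

# Map the type names used in the (assumed) spec to Python types.
_TYPES = {"int": int, "float": float, "str": str, "bool": bool}


def make_interface(spec):
    """Build a function that checks keyword arguments against the spec
    and returns the corresponding command line as a list of strings."""
    args = {a["name"]: a for a in spec["arguments"]}

    def build_command(**kwargs):
        unknown = set(kwargs) - set(args)
        if unknown:
            raise TypeError(f"unknown argument(s): {sorted(unknown)}")
        cmd = [spec["command"]]
        for name, arg in args.items():
            # Fall back to the default recorded in the spec.
            value = kwargs.get(name, arg["default"])
            expected = _TYPES[arg["type"]]
            if not isinstance(value, expected):
                raise TypeError(f"{name} must be of type {arg['type']}")
            cmd.append(f"--{name}={value}")
        return cmd

    # The user-facing documentation also comes from the spec.
    build_command.__doc__ = "\n".join(
        f"{name}: {arg['doc']} (default {arg['default']})"
        for name, arg in args.items()
    )
    return build_command


build_command = make_interface(json.loads(SPEC_JSON))
print(build_command(n_rounds=5))
```

The returned list could be handed to `subprocess.run`, and an R wrapper could read the same JSON, so both interfaces would stay in sync with a single generated file rather than with parallel pull requests.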