mortonjt closed this 3 years ago
Another thought on the save / load serialization -- perhaps we don't need to be able to serialize the `CmdStanMCMC` model itself; we just need to serialize the parameters required to initialize a `CmdStanMCMC` model. This can easily be done by saving / loading JSON files.
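To make the JSON idea concrete, here is a minimal sketch using only the standard library. The helper names `save_model_params` / `load_model_params` and the keys in the example dictionary are hypothetical placeholders, not part of birdman or cmdstanpy; the only assumption is that whatever is needed to re-initialize a model can be expressed as JSON-compatible values.

```python
import json


def save_model_params(params: dict, path: str) -> None:
    """Write the parameters needed to re-initialize a model to a JSON file.

    `params` must contain only JSON-compatible values
    (str, int, float, bool, None, list, dict).
    """
    with open(path, "w") as fh:
        json.dump(params, fh, indent=2, sort_keys=True)


def load_model_params(path: str) -> dict:
    """Read the parameters back; the caller rebuilds the model from them."""
    with open(path) as fh:
        return json.load(fh)


# Hypothetical example of what such a parameter set might contain.
example_params = {
    "model_path": "model.stan",   # placeholder path
    "formula": "treatment",       # placeholder design formula
    "chains": 4,
    "num_iter": 500,
    "seed": 42,
}
```

The model object itself never touches disk here; a loader would pass the dictionary returned by `load_model_params` to whatever constructor builds the model.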
As discussed earlier, I think the most reasonable step forward is if we don't try to handle parallelism ourselves, but keep the doors open for users to use their own favorite parallelism tool (e.g. multiprocessing, joblib, dask, snakemake, job arrays, disBatch ...)
I'm thinking that introducing some form of `ModelIterator` abstraction would be key for enabling this when fitting multiple features in an embarrassingly parallel fashion. Off the cuff, I'd imagine that the architecture would look something like this:

From there, if the user really wants parallelism, they can code it up themselves -- we can provide tutorials on how to do this, but birdman won't be providing any support on parallelism. Below are some code skeletons on what these tutorials could look like.
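As one possible reading of the idea, here is a rough sketch of a `ModelIterator` that yields one model per feature, plus a user-side helper that fans the fits out over a worker pool. Everything in it is an assumption made for illustration: the constructor signature, the `ToyModel` stand-in (which just averages counts instead of running Stan), and the `fit_one` / `fit_all` helpers. `ThreadPoolExecutor` is used only because it ships with Python; any of the tools listed above could take its place.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterator, Sequence, Tuple


class ModelIterator:
    """Hypothetical abstraction: lazily yields (feature_id, model) pairs."""

    def __init__(self, table: Dict[str, Sequence[float]],
                 model_factory: Callable):
        self.table = table                  # feature_id -> per-sample counts
        self.model_factory = model_factory  # builds one model per feature

    def __len__(self) -> int:
        return len(self.table)

    def __iter__(self) -> Iterator[Tuple[str, object]]:
        for feature_id, counts in self.table.items():
            yield feature_id, self.model_factory(feature_id, counts)


class ToyModel:
    """Placeholder for a per-feature model; fit() only averages the counts."""

    def __init__(self, feature_id: str, counts: Sequence[float]):
        self.feature_id = feature_id
        self.counts = list(counts)

    def fit(self) -> float:
        return sum(self.counts) / len(self.counts)


def fit_one(item: Tuple[str, ToyModel]) -> Tuple[str, float]:
    """Fit a single (feature_id, model) pair; each call is independent."""
    feature_id, model = item
    return feature_id, model.fit()


def fit_all(models: ModelIterator, max_workers: int = 2) -> Dict[str, float]:
    """User-side parallelism: map fit_one over the iterator with a pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(fit_one, models))


table = {"f1": [1, 2, 3], "f2": [10, 20]}
results = fit_all(ModelIterator(table, ToyModel))
```

The library side only has to expose something iterable; the choice of executor, scheduler, or cluster tool stays entirely in user code, which is the division of responsibility described above.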
I may have left out some details; feel free to follow up.