Bisaloo closed 1 year ago
We have discussed this today and this is where we landed. Below is the current output of fit_seromodel() and how we want to deal with each element:
- n_iters, n_thin, foi_model, delta, m_treed are function inputs and should not be included in the output. We should instead teach users to always keep the script producing the object alongside the object itself.
- n_warmup, exposure_years, exposure_ages, stan_data can be directly computed from the inputs with a single line of code each. If we believe users might want to access this data (e.g., for debugging), we should instead create an intermediate exported function to compute them, but it shouldn't be part of the fit_seromodel() output by default.
- loo_fit, foi_cent_est, foi_post_s can be directly computed from fit with a single line of code each. Users can either compute them themselves, or they can be included in downstream post-processing functions that act on the output of fit_seromodel().
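As a rough illustration of how short such post-processing can be, the sketch below derives per-year central estimates from a matrix of posterior draws using base R only. The simulated matrix is a stand-in for what rstan::extract(fit, "foi")[[1]] might return; the parameter name "foi", the layout (one column per year) and the 90% interval are assumptions made for the example, not the package's actual definitions.

```r
# Stand-in for the posterior draws of a 5-year force of infection:
# rows are iterations, columns are years (assumed layout).
set.seed(42)
foi_post_s <- matrix(rexp(1000 * 5, rate = 20), ncol = 5)

# Central estimate and 90% interval per year, one short expression each.
foi_cent_est <- data.frame(
  lower  = apply(foi_post_s, 2, quantile, probs = 0.05),
  median = apply(foi_post_s, 2, median),
  upper  = apply(foi_post_s, 2, quantile, probs = 0.95)
)

print(foi_cent_est)
```

Because everything here is derived from the draws alone, none of it needs to be stored in the fitted object.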
This leaves us with fit, which is the output of rstan::sampling() and is an object of class stanfit. This seems like a good candidate for output because it means we benefit from stanfit methods out of the box. If we want to provide custom methods, it is also possible to create a subclass inheriting from stanfit.
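A minimal sketch of the subclassing idea, using only the methods package so it runs without Stan installed. Here mock_stanfit is a stand-in for rstan's S4 class stanfit, and seromodel_fit is a hypothetical class name; with rstan loaded, the subclass would presumably be declared with contains = "stanfit" instead.

```r
library(methods)

# Stand-in for rstan's S4 class "stanfit", so the sketch runs without Stan.
setClass("mock_stanfit", representation(draws = "numeric"))

# A method defined once on the parent class.
setMethod("summary", "mock_stanfit", function(object, ...) {
  mean(object@draws)
})

# Hypothetical subclass; with rstan loaded this would use contains = "stanfit".
setClass("seromodel_fit", contains = "mock_stanfit")

x <- new("seromodel_fit", draws = c(0.1, 0.2, 0.3))

# The subclass inherits the parent's method without any extra code.
print(summary(x))
print(is(x, "mock_stanfit"))
```

Custom methods can then be added for the subclass only where the parent's behaviour is not enough, while everything else keeps dispatching to the inherited methods.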
I agree -- returning a stanfit object makes the most sense, since Stan is a widely used and well-developed package that many people have experience working with.
I just merged @ntorresd's PR #109, in which the object returned by fit_seromodel() is simpler. @ntorresd's rationale for not returning a plain stanfit object is to reduce the probability that the user provides the wrong serodata to other functions that require both serodata and the corresponding stanfit object (i.e., a serodata that was not the input used to generate that stanfit object).
@Bisaloo @ben18785 do you agree with the above argument? Or would you delegate to the package's user the responsibility to ensure the correct parameters are passed to other functions that use both serodata and the stanfit object?
I do not lean heavily towards either option, so I would like your feedback.
> would you delegate to the package's user the responsibility to ensure the correct parameters are passed to other functions that use both serodata and the stanfit object?
I think it's the user's responsibility. It's important to teach users to use reproducible scripts, and not to carry their data from one session to the next, losing the object provenance. Code workarounds that try to handle this for the user add complexity to the code, thus reducing maintainability, and paradoxically make the outputs more difficult to manipulate. And even then, you can't completely protect the user from losing the object provenance. This is why I believe that teaching good practices is the only way to solve this issue.
If we had a plain stanfit object, users would be able to directly use already implemented methods (plot(), summary()). Having a more complex object as in #109 will require users to do plot(x$stanfit) or force you to implement your own methods.
Currently, the output of fit_seromodel() is quite large and contains some information that could easily be removed. This would allow us to: