Open Lestropie opened 2 years ago
Deferring comment thread in #90 here.
Response to @arokem's comment is kind of two separate parts:
My main concern is "shoe-horning" the distinction between bootstrap realisations vs. aggregate / non-bootstrapped fit into a sub-optimal location. I'm not a fan of distinguishing between these within "_param-", as the quantitative parameter being encoded is identical between the two files that require disambiguation. I can't say with certainty yet where I think that distinction should happen, but I think there are multiple candidates that would be preferable to that one.
Your point about the mechanism of aggregation leans into the complexity of #61 introducing a "_stat-" entity.
On the surface, this seems an elegant way to faithfully encode the fact that some parameter is being aggregated across some data dimension. Notably, this would not only be applicable to bootstrapping; e.g. the mean intensity of an fMRI time series is often generated, which is most faithfully described as computing the mean statistic along axis 3.
This is however:
So my progress kind of stalled here.
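For concreteness, the fMRI example above (a mean statistic computed along axis 3) can be sketched as below. This is only an illustration under assumptions: the data are a toy nested list indexed as `[x][y][z][t]`, and the function name `mean_along_axis3` is made up for this sketch rather than taken from any tool or from the specification.

```python
from statistics import fmean


def mean_along_axis3(volume4d):
    """Collapse the 4th axis (index 3, time) of an [x][y][z][t] nested list by its mean."""
    return [
        [[fmean(timeseries) for timeseries in row] for row in plane]
        for plane in volume4d
    ]


# Toy 1 x 1 x 2 x 3 "time series": two voxels, three time points each (made-up values).
toy_series = [[[[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]]]]
mean_image = mean_along_axis3(toy_series)
print(mean_image)  # [[[2.0, 6.0]]]
```

The output has one fewer dimension than the input, which is the sense in which the statistic is computed "along" an axis; a mean across bootstrap realisations would be the same operation applied to whichever axis indexes the realisations.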
This statement opens up more complexity than first realised:
(median bs? mean bs? A run on the intact sample?).
Imagine two different model fits. The first performs bootstrapping as per bedpostx, and then computes the mean fibre orientations across the realisations. In the second, there is no bootstrapping whatsoever; it just does a maximum a posteriori fit to the empirical data, yielding one set of fibre orientations. In terms of data content, these two are identical; however, the ways in which they differ from the bootstrap realisations are not the same: the first is a derivative of the model fit, whereas the second is a different fitting procedure. In my prior structure, I'd have described the first as a model-derived parameter, based on a mean statistic computation across the model fit parameters of the bootstrapped model, and the second as a distinct model fit from the first, with the difference between the two model fits being the use of bootstrapping; this would most likely be encoded using different values for the "_model-" entity. I'm not sure how to disambiguate these given the structure of #90.
Handling representation of model bootstrapping (e.g. bedpostx) requires a lot more consideration than what is currently in the specification. I think that we should consider stripping out what's there, implementing support for the mean outputs of bedpostx, and then once that's achieved we should try re-inserting bootstrapping support as an explicit PR.
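To make the earlier comparison of the two model fits concrete, here is a rough sketch of how the three kinds of output might be told apart under the "prior structure" described above. Everything here is hypothetical: the helper `entity_string`, the labels `bootstrap`, `map`, `orientation` and `mean`, and the entity ordering are placeholders chosen for illustration, not names from the specification or from #90.

```python
def entity_string(**entities):
    """Join key-value pairs into a BIDS-style entity string, in the order given."""
    return "_".join(f"{key}-{value}" for key, value in entities.items())


# Hypothetical labels only; none of these values are defined by the specification.
# 1. The bootstrap realisations themselves.
realisations = entity_string(model="bootstrap", param="orientation")
# 2. Mean across realisations: same model, same parameter, plus a statistic.
bootstrap_mean = entity_string(model="bootstrap", param="orientation", stat="mean")
# 3. Non-bootstrapped fit: same parameter, but a different model label.
direct_fit = entity_string(model="map", param="orientation")

for name in (realisations, bootstrap_mean, direct_fit):
    print(name)
```

In this sketch the "_param-" value is the same in all three cases, so the disambiguation falls entirely on "_model-" and "_stat-", which is the property argued for above. Whether those are the right entities to carry it is exactly the open question.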