nimble-dev / nimble

The base NIMBLE package for R
http://R-nimble.org
BSD 3-Clause "New" or "Revised" License
156 stars 23 forks source link

added `getMixedDataNodeNames` method for models #1489

Open danielturek opened 5 days ago

danielturek commented 5 days ago

This adds a new method getMixedDataNodeNames for model objects. Prior to this, we have no functionality to test for, or to retrieve node names which are partially observed: multivariate nodes, which are partially, but not entirely, data.

The core model function isDataFromGraphID was modified, to add an additional argument includeMixed. The former behaviour of model$isDataFromGraphID exclusively returned TRUE or FALSE, giving only 2 possible options. When the new argument includeMixed = TRUE, the return values of model$isDataFromGraphID are now 0 (not data; formerly FALSE), 1 (fully data; formerly TRUE), or 2 (partially data; formerly, this would have been TRUE as well, but that result is somewhat misleading).

This change to isDataFromGraphID is protected behind the new argument includeMixed, so no previous use of this function will be affected.

Then, a new method of the model class is added getMixedDataNodeNames. @perrydv care was taken to do this with efficiency in mind, and the logic takes place at the level of model declarations, until individual nodes need to be inspected (for mixed data cases) - at which point inspection takes place as efficiently as possible, following the existing logic in isDataFromGraphID.

I took this design approach, rather than trying to add arguments to model$isData, or adding arguments to model$getNodeNames, or rather than adding a function such as model$isMixedData, for a few reasons. These could be discussed, but I envision the main function of this inside the MCMC, to assess (and handle) if there are any mixedData nodes in a model, so this function was written to return those node names. Furthermore, adding an argument to model$isData would not actually be useful, because at present any nodes which contain any data are omitted from consideration, so this approach would require applying the modified model$isData in an unnatural manner. Furthermore, the approach used keeps this functionality fully separated from previous functionality, which felt like a good way to go.

danielturek commented 5 days ago

Testing passed; this is not directly user facing, and I'm hoping to merge this in.