Add transform helper - Githubissues

OriolAbril commented 1 month ago

There used to be a transform argument to plotting functions (maybe some stats too) which wasn't too useful as it only allowed a callable. I propose to instead create a transform helper.

def transform(idata, transform_funcs=None, group="posterior", return_dataset=False):
    if transform_funcs is None:
        transform_funcs = {}
    # Loop over the variables in `group` and check, for each one:
    #    1) whether a transform function is mapped to it
    #    2) whether transformed values exist in the unconstrained group, in which case use those

Thus, without arguments it would combine the samples in the posterior and the unconstrained posterior so that all the samples are in the unconstrained space. I'm not 100% sure what the output should look like. I think it would be useful to allow returning that dataset directly, but it should also return a modified InferenceData/DataTree, maybe with the unconstrained_posterior group but no posterior group? Or vice versa, which might play nicer with the defaults in the rest of the functions?
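To make the proposed behavior concrete, here is a minimal sketch, not the actual arviz-base implementation. It models `idata` as a plain dict of groups (each a dict of variable name to list of draws) instead of a real InferenceData/DataTree. It also assumes that variables keep the same name in both groups and that variables with neither a transform function nor stored unconstrained values are passed through unchanged. The precedence between a user-supplied function and stored values is likewise an assumption.

```python
import math


def get_unconstrained_samples(idata, transform_funcs=None, group="posterior"):
    """Hypothetical sketch: gather every variable of `group` in unconstrained space.

    `idata` is a plain dict standing in for InferenceData/DataTree:
    {group name: {variable name: list of draws}}.
    """
    if transform_funcs is None:
        transform_funcs = {}
    constrained = idata[group]
    # Group name follows the `unconstrained_posterior` naming used in this thread.
    unconstrained = idata.get(f"unconstrained_{group}", {})

    out = {}
    for name, values in constrained.items():
        if name in transform_funcs:
            # 1) a transform function is mapped to this variable: apply it
            out[name] = [transform_funcs[name](v) for v in values]
        elif name in unconstrained:
            # 2) transformed values are stored in the unconstrained group: use them
            out[name] = list(unconstrained[name])
        else:
            # Assumption: the variable is already unconstrained, so pass it through
            out[name] = list(values)
    return out


idata = {
    "posterior": {"mu": [0.1, -0.3], "sigma": [1.0, 2.0], "tau": [1.0, 4.0]},
    "unconstrained_posterior": {"sigma": [0.0, 0.6931]},
}
result = get_unconstrained_samples(idata, transform_funcs={"tau": math.log})
```

Called without `transform_funcs`, the sketch only merges the two groups, which is the no-argument behavior described above.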

Also, feel free to propose other names. I don't think transform is a particularly good choice as it is too general; maybe get_unconstrained_samples is clearer?

aloctavodia commented 1 month ago

If there are no arguments how does it transform the variables? Are we adding the metadata when creating the InferenceData?

get_unconstrained_samples is a better name if the purpose is to get unconstrained samples. If instead, we are contemplating a more general case of users wanting to perform arbitrary transformations, then transform sounds better.

OriolAbril commented 1 month ago

> If there are no arguments how does it transform the variables?

It doesn't "transform" them; it takes the transformed values stored in the unconstrained_posterior group (or the relevant unconstrained_group).

aloctavodia commented 1 month ago

Makes sense. For some reason I assumed it was trying to get the function from the idata.

OriolAbril commented 1 month ago

It is possible to use the attributes of the unconstrained_group dataset (or even of the posterior itself) to store info about the transformations, but that was left for PPLs to handle as they want. We could try adding a function with the same name to PyMC that checks the metadata (once we add that info to the converter) and does the transformations if any; otherwise it calls this function, so the part that handles manually provided transformations is not duplicated.
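One possible reading of that layering, as a rough sketch only: a PPL-side wrapper turns metadata into callables and delegates the rest to the generic helper. Everything here is hypothetical: the `"transforms"` attribute layout, the `KNOWN_TRANSFORMS` registry, and the names `generic_transform` and `ppl_transform` are illustrative and not part of PyMC or arviz-base. Groups are again modeled as plain dicts.

```python
import math

# Hypothetical registry mapping metadata labels to callables.
KNOWN_TRANSFORMS = {"log": math.log}


def _identity(value):
    return value


def generic_transform(groups, transform_funcs=None, group="posterior"):
    """Minimal stand-in for the generic helper: apply per-variable callables."""
    transform_funcs = transform_funcs or {}
    return {
        name: [transform_funcs.get(name, _identity)(v) for v in values]
        for name, values in groups[group].items()
    }


def ppl_transform(groups, attrs, transform_funcs=None, group="posterior"):
    """Hypothetical PPL-side wrapper that reads transform info from metadata."""
    # Assumed attrs layout: {"transforms": {variable name: transform label}}
    from_metadata = {
        name: KNOWN_TRANSFORMS[label]
        for name, label in attrs.get("transforms", {}).items()
        if label in KNOWN_TRANSFORMS
    }
    # Manually provided callables override whatever the metadata says.
    merged = {**from_metadata, **(transform_funcs or {})}
    # Delegate, so the logic that applies callables lives in one place only.
    return generic_transform(groups, transform_funcs=merged, group=group)


groups = {"posterior": {"mu": [0.5], "sigma": [1.0, math.e]}}
attrs = {"transforms": {"sigma": "log"}}
out = ppl_transform(groups, attrs)
```

With empty metadata the wrapper reduces to a plain call to the generic helper, which is the fallback described above.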

arviz-devs / arviz-base

Add transform helper #8