stan-dev / posterior

The posterior R package
https://mc-stan.org/posterior/
Other
167 stars 24 forks source link

New helper function repair_variable_names #318

Open jgabry opened 1 year ago

jgabry commented 1 year ago

closes #317

Summary

repair_variable_names converts names using periods (theta.1.1) to square brackets (`theta[1,1])

Copyright and Licensing

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

codecov-commenter commented 1 year ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (c6ab463) 95.60% compared to head (0ebe8e9) 95.61%.

:exclamation: Current head 0ebe8e9 differs from pull request most recent head 00d32c6. Consider uploading reports for the commit 00d32c6 to get more accurate results

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #318 +/- ## ======================================= Coverage 95.60% 95.61% ======================================= Files 47 47 Lines 3686 3694 +8 ======================================= + Hits 3524 3532 +8 Misses 162 162 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

jgabry commented 1 year ago

Maybe repair isn't the best name actually? @avehtari suggested convert in https://github.com/stan-dev/posterior/issues/317#issuecomment-1816897197. I'm open to other suggestions too.

paul-buerkner commented 1 year ago

I like "repair". Having the dots instead of [] is kind of a thing we don't really want. So we are repairing them to look reasonable again. But I also don't have strong opinions. So completely fine by me if we call it differently.

mjskay commented 1 year ago

Re: function name: is the intention to create a function that specifically converts from Stan names, or is the intention that this function might (in the future) support other non-posterior-friendly naming conventions? I think that should influence the naming; e.g. is this "convert_from_stan_names" (though perhaps less verbose) or the more generic "repair_variable_names" or "convert_variable_names"? Or is it "repair_dotted_indices" or "repair_dotted_index_names" or something like that?

jgabry commented 1 year ago

Good point @mjskay. If this will only ever be used to convert Stan names we could include that in the function name. What do others think?

avehtari commented 1 year ago

I think it would be best to make the function specific for Stan naming conventions, and then add other functions if we add support for other packages. The name "convert_from_stan_names" would be then very descriptive, and I don't know how to make it shorter

Furthermore, in #317 @WardBrian wrote

Complex variables use .real and .imag as their CSV names in Stan. So something like a complex_vector[2] foo has names foo.1.real,foo.1.imag,foo.2.real,foo.2.imag

Tuples use : to separate "slots" in a tuple. So a tuple(real, int) a is printed as a:1,a:2. This stacks naturally with the . for arrays of tuples/tuples of arrays.

If we want to support these, too, I assume that we would like to have something like "foo.1.real" -> "foo_real[1]" and "a:1" -> "a_1"

And the natural complex type support is a separate issue #319 and PR #321

paul-buerkner commented 1 year ago

I agree with Jonah and Aki.

WardBrian commented 12 months ago

If you did want to support tuple names, this essentially just needs to split on : and loop over the resulting parts. See the code that does something similar in Stan: https://github.com/stan-dev/stan/blob/667d5b2d7c5ea1072011919a1acd76a9d3061d01/src/stan/io/stan_csv_reader.hpp#L16