Closed miguelh-nvidia closed 7 months ago
SideFX has been using a customized version of punycode to encode/decode between USD attribute names and Houdini parameter names for a very long time, and it has been quite a successful strategy. I also proposed something like this as a possible alternative to adding true UTF-8 support to USD. So I'm very much onboard with this overall proposal, and can vouch for its usefulness.
Just a few specific comments about the suggested APIs... In many cases there seems to be a typo where you are using "Boostring" instead of "Bootstring". But I would actually suggest eliminating the word "Bootstring" (or "Boostring") from the APIs completely. If this is going to be the Sdf standard for encoding/decoding identifiers, I shouldn't have to know or care that it's using an algorithm called "Bootstring".
I'm also not sure I see the value in returning a std::optional. Wouldn't it be just as easy to return std::string() to indicate an error case? When I saw the API returning std::optional, my first thought (and fear, because I think it would be a bad idea) was that the "optional"-ness was going to be used to indicate whether or not the input string had to be modified. Returning an empty std::string would also make these APIs more consistent with the signatures and behavior of TfMakeValidIdentifier.
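To make the two error-reporting styles concrete, here is a minimal sketch. Both function names and the made-up `enc_` prefix are hypothetical and exist only so the functions have something to fail on; this is not the API proposed in the PR, and no real decoding happens.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical toy "decoder": strips a made-up "enc_" prefix.
// Style 1: failure is reported as std::nullopt.
std::optional<std::string> DecodeOptionalSketch(const std::string& in) {
    const std::string prefix = "enc_";
    // Fail if the prefix is missing or nothing follows it.
    if (in.size() <= prefix.size() ||
        in.compare(0, prefix.size(), prefix) != 0) {
        return std::nullopt;
    }
    return in.substr(prefix.size());
}

// Style 2: failure is reported as an empty string. This is unambiguous
// only because an empty string is never a valid identifier.
std::string DecodeOrEmptySketch(const std::string& in) {
    return DecodeOptionalSketch(in).value_or(std::string());
}
```

With the first style the caller checks `has_value()`; with the second it checks `empty()`. In this sketch neither return value says anything about whether the input had to be modified.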
Thanks for the thoughts, Mark! It's helpful to know you've had success using Bootstring encoding!
> But I would actually suggest eliminate the word "Bootstring" from the APIs completely. If this is going to be the Sdf standard for encoding/decoding identifiers, I shouldn't have to know or care that it's using an algorithm called "Bootstring".
I've been an advocate for giving the encoding algorithm a distinguishing name. My thinking is that there are still probably needs for "make valid identifier functions" that aren't bidirectional. For example, replacing symbols and whitespace with `_` may be preferable for some users and contexts, and they're willing to give up bidirectionality. My other thought is that as people start to see `tn__` appear in paths and logs, it might be helpful for users to have some shorthand for describing them: "That's a bootstring encoded identifier."
I don't think we're attached to `SdfBootstring{Encode,Decode}`, though, if there are better suggestions.
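As a rough illustration of what "bidirectional" buys, the sketch below uses a deliberately simple hex encoding with a made-up `hx__` marker. It is not Bootstring and not the encoding in this PR; the function names and the prefix are assumptions chosen for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical marker that flags an identifier as encoded.
const std::string kPrefix = "hx__";

// Encode every byte as two lowercase hex digits after the marker.
// The result contains only [0-9a-z_] and starts with a letter.
std::string EncodeSketch(const std::string& in) {
    static const char* digits = "0123456789abcdef";
    std::string out = kPrefix;
    for (unsigned char c : in) {
        out += digits[c >> 4];
        out += digits[c & 0x0F];
    }
    return out;
}

// Reverse of EncodeSketch; returns an empty string on malformed input.
std::string DecodeSketch(const std::string& in) {
    auto nibble = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        return -1;
    };
    if (in.compare(0, kPrefix.size(), kPrefix) != 0) return {};
    const std::string body = in.substr(kPrefix.size());
    if (body.size() % 2 != 0) return {};
    std::string out;
    for (std::size_t i = 0; i < body.size(); i += 2) {
        const int hi = nibble(body[i]);
        const int lo = nibble(body[i + 1]);
        if (hi < 0 || lo < 0) return {};
        out += static_cast<char>((hi << 4) | lo);
    }
    return out;
}
```

Distinct inputs produce distinct encoded names, and decoding returns the original bytes. The recognizable prefix is also what lets a reader spot an encoded identifier in a path or log.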
Just wanted to note the potential need for something like transcoding in Hydra when mapping primvars to GLSL.
Personally, I am leaning toward not requiring the name of the algorithm; maybe you got my brain off on the wrong footing with the Freudian slip-ish "boostring". We'll never finish de-boosting!!
PS if we are down to bikeshedding the name, shall we merge this PR in Draft state?
@miguelh-nvidia Are you able to sign the commits and force push? If no, I'll merge as is. Otherwise I'd prefer to attempt to satisfy the automation bot.
@meshula ready!, thanks.
Description of Proposal
`TfMakeValidIdentifier` has been used in OpenUSD to convert any identifier into a valid identifier. However, the conversion is not bidirectional: for example, something like カーテンウォール would be transformed into ________________, and the original name cannot be recovered from the result.

The objective of this proposal is to provide an alternative to `TfMakeValidIdentifier` that can take any identifier (potentially with invalid characters) and transform it into a valid OpenUSD identifier.
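The loss of information can be seen in a small stand-alone sketch. `MakeValidIdentifierSketch` is a hypothetical stand-in that mimics the replace-with-underscore behavior described above, written for illustration only; it is not the OpenUSD implementation.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in: replace every byte that is not allowed in an
// identifier with '_'. The first byte must be a letter or '_'; later
// bytes may also be digits. Empty input becomes "_".
std::string MakeValidIdentifierSketch(const std::string& in) {
    if (in.empty()) return "_";
    std::string out = in;
    for (std::size_t i = 0; i < out.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(out[i]);
        const bool ok = (c == '_') ||
                        (i == 0 ? std::isalpha(c) != 0 : std::isalnum(c) != 0);
        if (!ok) out[i] = '_';
    }
    return out;
}
```

Under this sketch, `"wall 1"` and `"wall-1"` both become `"wall_1"`, so there is no way to tell from the result which name was the original.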