We need to solve collation. Because collation can differ between fields and literals (e.g. a table with two columns of different collations, or two tables with different collations), I propose that we add collation as a property of the string data types. There are three main string data types:
- STRING
- VARCHAR
- FIXEDCHAR
It seems like we have two options for how to introduce collation:
1. Introduce 3 new types that include a collation property.
2. Enhance the existing types so they have collation properties but remain backwards compatible, to avoid migration pain.
I suggest we do the second option. I think this would lead to having a new way to express compound types with default options. For example, maybe we say the following would both be legal:
- `string` => a string type with default collation
- `string<af_na>` => a string type with [af_na collation](https://www.localeplanet.com/icu/af-NA/index.html)
I propose we use the ICU locale names to reference collations, with the addition of a pseudo-collation called `binary`. `binary` would be the default collation if a parameter is not given.
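As a rough sketch of how the two spellings could resolve to the same kind of type, with `binary` filled in when no parameter is given: the names `StringType` and `parse_string_type` are hypothetical and not part of any existing implementation, and only the plain `string` type is handled here.

```python
import re
from dataclasses import dataclass

# Hypothetical grammar: `string` optionally followed by `<collation>`.
# The collation name is assumed to look like an ICU locale name (e.g. af_na).
_STRING_TYPE = re.compile(r"string(?:<([A-Za-z][A-Za-z0-9_]*)>)?")


@dataclass(frozen=True)
class StringType:
    # "binary" is the proposed pseudo-collation used when none is given.
    collation: str = "binary"


def parse_string_type(text: str) -> StringType:
    """Parse `string` or `string<collation>` into a StringType."""
    match = _STRING_TYPE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a recognized string type: {text!r}")
    collation = match.group(1)
    return StringType(collation) if collation else StringType()


print(parse_string_type("string"))
print(parse_string_type("string<af_na>"))
```

Existing plans that only ever say `string` keep parsing under this scheme, which is the backwards-compatibility property the second option is after.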
In function definitions, I would be inclined to say that if an argument is specified without a collation, the function applies to all collations (as opposed to what might be interpreted as only the `binary` collation). This means that `string` in a plan would mean something slightly different than `string` in an extension, but I think this compromise gives the best mix of backwards compatibility and likely expected behavior.
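To make the asymmetry concrete, here is a minimal sketch of the two interpretations. The function names are hypothetical, and `None` stands for "no collation was written".

```python
from typing import Optional


def plan_collation(written: Optional[str]) -> str:
    """In a plan, a missing collation parameter means the binary collation."""
    return written if written is not None else "binary"


def argument_matches(declared: Optional[str], actual: str) -> bool:
    """In an extension signature, a missing collation acts as a wildcard."""
    return declared is None or declared == actual


# A function declared over plain `string` accepts any collation...
print(argument_matches(None, "af_na"))
# ...while one declared over `string<binary>` accepts only binary strings.
print(argument_matches("binary", "af_na"))
# A plain `string` column in a plan resolves to binary.
print(plan_collation(None))
```

Under this reading, existing function definitions written against plain `string` would keep applying to every string column, whatever collation it later gains.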
Thoughts?