Open czaefferer opened 2 years ago
Upon further investigation: According to Protocol::HandshakeResponse the client may send one byte for the charset ID. It also notes:
This “character set” value is really a collation ID but implies the character set; see the Protocol::CharacterSet description.
UTF8MB4_DE_PB_0900_AI_CI
has the collation ID 256
, which is two bytes. Using that as the charset causes the RangeError. But since only really the charset implied by the collation is used, I am now using: charset: 'UTF8MB4'
in the connection parameters. This implies UTF8MB4_GENERAL_CI
in the connector, which then implies UTF8MB4
on the server, which is really the thing I need.
With this I have a workaround, but I guess for the handshake in the connector there should be a check in some way. The easiest thing to do might be to map every collation with ID > 255 to 255.
However this might be confusing in some situations: I believe if there is (for another reason) an "Illegal mix of collations" error, the collation given in the handshake will be shown in the error message. If that was changed by the handshake, a user might be confused since the collation shown in the error is different from the one configured, even though only the charset implied by the collation(s) is relevant. At least this was the behaviour I've seen, but I'm not sure about this part, and I do not have a solution.
Using any of the following list items as a charset will cause the error reported:
exports.UTF8MB4_DE_PB_0900_AI_CI = 256; exports.UTF8MB4_IS_0900_AI_CI = 257; exports.UTF8MB4_LV_0900_AI_CI = 258; exports.UTF8MB4_RO_0900_AI_CI = 259; exports.UTF8MB4_SL_0900_AI_CI = 260; exports.UTF8MB4_PL_0900_AI_CI = 261; exports.UTF8MB4_ET_0900_AI_CI = 262; exports.UTF8MB4_ES_0900_AI_CI = 263; exports.UTF8MB4_SV_0900_AI_CI = 264; exports.UTF8MB4_TR_0900_AI_CI = 265; exports.UTF8MB4_CS_0900_AI_CI = 266; exports.UTF8MB4_DA_0900_AI_CI = 267; exports.UTF8MB4_LT_0900_AI_CI = 268; exports.UTF8MB4_SK_0900_AI_CI = 269; exports.UTF8MB4_ES_TRAD_0900_AI_CI = 270; exports.UTF8MB4_LA_0900_AI_CI = 271; exports.UTF8MB4_EO_0900_AI_CI = 273; exports.UTF8MB4_HU_0900_AI_CI = 274; exports.UTF8MB4_HR_0900_AI_CI = 275; exports.UTF8MB4_VI_0900_AI_CI = 277; exports.UTF8MB4_0900_AS_CS = 278; exports.UTF8MB4_DE_PB_0900_AS_CS = 279; exports.UTF8MB4_IS_0900_AS_CS = 280; exports.UTF8MB4_LV_0900_AS_CS = 281; exports.UTF8MB4_RO_0900_AS_CS = 282; exports.UTF8MB4_SL_0900_AS_CS = 283; exports.UTF8MB4_PL_0900_AS_CS = 284; exports.UTF8MB4_ET_0900_AS_CS = 285; exports.UTF8MB4_ES_0900_AS_CS = 286; exports.UTF8MB4_SV_0900_AS_CS = 287; exports.UTF8MB4_TR_0900_AS_CS = 288; exports.UTF8MB4_CS_0900_AS_CS = 289; exports.UTF8MB4_DA_0900_AS_CS = 290; exports.UTF8MB4_LT_0900_AS_CS = 291; exports.UTF8MB4_SK_0900_AS_CS = 292; exports.UTF8MB4_ES_TRAD_0900_AS_CS = 293; exports.UTF8MB4_LA_0900_AS_CS = 294; exports.UTF8MB4_EO_0900_AS_CS = 296; exports.UTF8MB4_HU_0900_AS_CS = 297; exports.UTF8MB4_HR_0900_AS_CS = 298; exports.UTF8MB4_VI_0900_AS_CS = 300; exports.UTF8MB4_JA_0900_AS_CS = 303; exports.UTF8MB4_JA_0900_AS_CS_KS = 304; exports.UTF8MB4_0900_AS_CI = 305; exports.UTF8MB4_RU_0900_AI_CI = 306; exports.UTF8MB4_RU_0900_AS_CS = 307; exports.UTF8MB4_ZH_0900_AS_CS = 308;
I need to connect to a MySQL 8 database with the charset UTF8MB4 and collation UTF8MB4_DE_PB_0900_AI_CI, but instead of a connection I get a RangeError.
The source seems to be the charset/collation: UTF8MB4_DE_PB_0900_AI_CI has the internal number "256", and the RangeError says the value must be "<= 255" but received "256". Further tests show, collations with the internal number up to 255 work fine, 256 and above do not.
My MySQL has the version 8.0.23 and is hosted on AWS RDS with their default option- and parameter-groups, accessed through a tunnel through a bastion host. For "mysql2" I'm using version "2.3.3". The simplest example to reproduce this is:
Executing this will result in the error below. Changing the charset for example to "UTF8MB4_0900_AI_CI" (internal number 255) will produce a connection and output the content of the query.