kaihendry commented 5 years ago

ERROR 1267 (HY000): Illegal mix of collations (utf8mb4_unicode_520_ci,IMPLICIT) and (utf8mb4_general_ci,IMPLICIT) for operation '='

On dev there is a utf8_general_ci on mefe_failedUnitCreations which I think is wrong.

[hendry@t480s tmp]$ grep -B1 utf8_general_ci prod.txt
           Name: lmb_units_10july18
      Collation: utf8_general_ci
--
           Name: map_ipi_hmlet_uneet_details
      Collation: utf8_general_ci
--
           Name: ut_data_to_create_units_legacy_before_3_3
      Collation: utf8_general_ci

kaihendry commented 5 years ago

I think it's actually more to do with the procedure setup than the tables now.

SHOW CREATE PROCEDURE

franck-boullier commented 5 years ago

I'm assuming that this is for the unee_t_enterprise database

This should be fixed now. @kaihendry can you check and confirm before we can close this issue.

franck-boullier commented 5 years ago

mefe_failedUnitCreations does not belong to the BZ database, it was a f... up I made in the DEV staging (thanks for pointing that out). I've fixed the issue by deleting the table mefe_failedUnitCreations from the BZ database.

kaihendry commented 5 years ago

This is bugzilla, not enterprise.

Database Collation seems to be mixed between utf8_general_ci & utf8mb4_unicode_520_ci

franck-boullier commented 5 years ago

@kaihendry AFAICT this is a RDS configuration issue ---> not related to the database schema ---> there is nothing I can do here

franck-boullier commented 5 years ago

@kaihendry see these links for more context:

make sure to use the correct values when you correct the RDS configuration though: we use

Character set: utf8mb4
Collation: utf8mb4_unicode_520_ci

kaihendry commented 5 years ago

Dev and prod are configured the same way. You can confirm this yourself with the dbcheck describe API in Postman. I don't want to try force things with RDS parameters and much prefer to leave them by defaults.

Dev appears correct. Production not. Again see the Unicode API in dbcheck in Postman.

https://media.dev.unee-t.com/2019-04-25/aws-support.mp4 - screencast I made showing they have the same config
https://media.dev.unee-t.com/2019-04-25/latin1.mp4 - i'm starting to thing the stuff in the mysql database is cause

Furthermore I am bit concerned about going with utf8mb4_unicode_520_ci since best practice appears to be utf8mb4_general_ci or is it utf8mb4_unicode_ci. I am utterly confused.

I would be very careful to change this though, since I have been struggling for hours to work out howto switch my collation_connection from the default utf8mb4_generalci to _utf8mb4_unicode_520ci in cli. And if it's not aligned, mysql just gives up.

franck-boullier commented 5 years ago

Dev and prod are configured the same way. You can confirm this yourself with the dbcheck describe API in Postman

This is obviously not true since PROD is working fine and DEV/Staging is not. There must some obscure configuration parameter that you're missing somewhere...

franck-boullier commented 5 years ago

Dev appears correct. Production not.

This depends on your defintion of "correct". For me:

PROD is working so it is "correct" (to some extend - putting aside the collation/character set reported by the Postman)
DEV/Staging is not working so it is NOT "correct"

Is it possible the that Postman check is misleading us?

franck-boullier commented 5 years ago

best practice appears to be utf8mb4_general_ci or is it utf8mb4_unicode_ci.

Just to make sure all information is collated in this issue: We use

Character set: utf8mb4
Collation: utf8mb4_unicode_520_ci because dylan recommended that we use those to make sure we have emoji compatibility See https://github.com/unee-t/bz-database/issues/110#issuecomment-461679326

Unicode_520 stands for v5.20 of the norm (the most recent one). https://stackoverflow.com/questions/18904883/what-is-the-difference-between-utf8-unicode-ci-and-utf8-unicode-520-ci

utf8mb4 should be the collation used everywhere: see https://medium.com/@adamhooper/in-mysql-never-use-utf8-use-utf8mb4-11761243e434

franck-boullier commented 5 years ago

Also @kaihendry see https://github.com/unee-t/bz-database/blob/101677db8b7cc86e7c17a7515e478fc3f6cce6a0/db%20upgrade/upgrade_unee-t_v3.30_to_v4.31.sql for detailed explanation of the upgrade we made to the BZ db (all environments) back in Feb 2019

kaihendry commented 5 years ago

I think it's the top level database configuration issue now.

[hendry@t480s db]$ echo 'SELECT @@character_set_database,@@collation_database;' | ./connect.sh
Connecting to DEV auroradb.dev.unee-t.com
mysql: [Warning] Using a password on the command line interface can be insecure.
utf8mb4 utf8mb4_unicode_ci
[hendry@t480s db]$ echo 'SELECT @@character_set_database,@@collation_database;' | ./connect.sh -p
PRODUCTION
Connecting to PROD auroradb.unee-t.com
mysql: [Warning] Using a password on the command line interface can be insecure.
utf8    utf8_general_ci

I think both values IIUC #110 correctly, should be utf8mb4_unicode_520_ci

When this is set, this becomes the default for everything unless the collation is specified otherwise in the table/schemas etc, which is should not be.

franck-boullier commented 5 years ago

@kaihendry can you alter you golang script to make sure that all the variables sent to the BZ database are using the correct collation and character set

In mySQL the syntax to make sure that the string 'foobar' is using the correct character set and collation is

SELECT (_utf8mb4'foobar' COLLATE utf8mb4_unicode_520_ci);

---> this should make sure that when you sent the string 'foobar' down to the BZ database it will be understood as a string using Character Set utf8_mb4 and collation utf8mb4_unicode_520_ci

See https://dev.mysql.com/doc/refman/8.0/en/charset-literal.html

This might be the root cause why create unit is slow with lambdas: See https://lowleveldesign.org/2013/07/19/diagnosing-collation-issue-mysql-stored-procedure/ for more context

When index key values have different collations than parameters in the where condition, MySql is not able to use index and needs to perform a full scan.

franck-boullier commented 5 years ago

More information

The prod

is relatively fast when creating units
has consistent Collation as per @kaihendry 's tests

The Dev:

Is significantly slower than the PROD
Has inconsistent Collation as per @kaihendry s' tests

franck-boullier commented 5 years ago

@nbiton and @hendry I just had a brainwave...

The performance issue we are witnessing in the link between MEFE and the BZ database looks more an more like related to collation issues.

One reason why collation matters is when you do search on strings / varchar, the collation must be the same everywhere in order for MySQL to be able to use indexes.

Current situation:

Today the MEFE is passing the following information to the BZ database:

For unit creation: the mefe_unit_id. This is a VARCHAR ---> collation matters
For adding a user to a role in a unit: the mefe_invitation_id. This is a VARCHAR ---> collation matters

The idea:

IF instead of a VARCHAR/String the MEFE was sending an INT instead THEN collation would NOT matter at all

@nbiton, @kaihendry how complex will it be to do this change: 1- send a unique INT instead of a VARCHAR for the mefe_unit_id
2- send a unique INT instead of a VARCHAR for the mefe_invitation_id ?

We probably need to double check a few things but that might solve all our collation/performance issues...

franck-boullier commented 5 years ago

I explored the above idea further. Here are my findings so far (for Unit creation)

How it works today

Upstream from the MySQL:

Step 1- MEFE sends a JSON call out with the payload needed to create the unit.
Step 2- The procedure maintained here
- 2.1- Inserts the payload on a specific MySQL table: ut_data_to_create_units. See the code here.
- 2.2- Once the record is successfully inserted in the table ut_data_to_create_units, calls the procedure unit_create_with_dummy_users to start the actual unit creation process in the SQL database.

In the BZ database:

Once the procedure unit_create_with_dummy_users is called with the correct variables

@mefe_unit_id
@environment

then the unit creation process starts

The crux of the issue:

As of today mefe_unit_id is a String/VARCHAR.

Depending on many parameters that are apparently beyond our control this String/VARCHAR is passed with a charset/collation that might be different from the default charset/collation that MySQL is expecting (MySQL is expecting utf8_mb4 and utf8_mb4_unicode_520_ci).

When this happens (the charset/collation are different) then MySQL is not able to use its indexes to optimize the search and needs to perform a full scan of the table. ---> this will get worse and worse as the number of units we have created in Unee-T grows

Why do we need that variable?

Reason 1: to get the data we need to create the unit:

The first step in the unit creation process is to find the actual details of the unit that we need to create. These details are stored in the table ut_data_to_create_units as explained above.

The main source of our headaches IMO is the following line in the procedure unit_create_with_dummy_users:

    # What is the record that we need to use to create the objects in BZ?
        SET @unit_reference_for_import = (SELECT `id_unit_to_create` FROM `ut_data_to_create_units` WHERE `mefe_unit_id` = @mefe_unit_id);

Reason 2: Send the BZ unit id back to the MEFE:

After the unit is created in the BZ db, we make another call to the BZ database to get the newly created BZ product ID for the unit. MEFE needs that id to be able to find bugs and other stuff.

Here again we use the MEFE Unit Id, a string/VARCHAR. The query is here

SELECT product_id FROM ut_data_to_create_units WHERE mefe_unit_id=?", MefeUnitID

Moving forward:

We have 2 options at this stage:

Option 1: use an INT to do the BZ SELECT:

In that scenario we alter the Payload from the MEFE so that MEFE sends BOTH:

the MEFE unit ID as a sting/varchar same as before (for easy debug/logging/tracking) AND
a unique INT that MySQL will use to do the various SELECT described above.

The queries will become

# What is the record that we need to use to create the objects in BZ?
        SET @unit_reference_for_import = (SELECT `id_unit_to_create` FROM `ut_data_to_create_units` WHERE `mefe_unit_id_int_value` = @mefe_unit_id_int_value);

and

SELECT product_id FROM ut_data_to_create_units WHERE mefe_unit_id_int_value=?", MefeUnitIdIntValue

and we can forget all our worries about charset/collation issues 🎉

Option 2: We find an alternative solution (unknown/unclear at this stage)

@kaihendry and @nbiton if anyone has an idea I'm listening...

Conclusion:

@kaihendry and @nbiton any reason why we can NOT implement Option 1 described above?

franck-boullier commented 5 years ago

What needs to happen:

Invite user to a unit:

BZ DB:

Must run DB schema v4.34.0 or above. See https://github.com/unee-t/bz-database/pull/129

Routine to pass the info from MEFE to BZ Db

We need to run a version of the invite scripts which

can handle the updated payload from the MEFE.
Runs the SELECT queries based on the INT version of the MEFE invitation id

MEFE

MEFE needs to ba able to send a value mefeInvitationIdIntValue (an INT - must be unique) as part of the payload

Create a new unit:

BZ DB:

Must run DB schema v4.34.0 or above. See https://github.com/unee-t/bz-database/pull/129

Routine to pass the info from MEFE to BZ Db

We need to run a version of the unit creation scripts which

can handle the updated payload from the MEFE.
Runs the SELECT queries based on the INT version of the MEFE unit id

MEFE

MEFE needs to be able to send a value mefeUnitIdIntValue (an INT - must be unique) as part of the payload

Status Update:

We need to make sure that all the tests are passed before deploying in the PROD.

Deployment:

BZ database:

Latest version of DB schema v4.34 has been implemented in the DEV/Staging and in the DEMO environment.

Unit creation ---> Deployed in
- DEV/Staging
- DEMO
Invite user to unit ---> Deployed in
- DEV/Staging
- DEMO

Routine to pass the information from the MEFE to the BZ DB

Unit creation ---> Deployed
- DEV/Staging
Invite user to unit ---> Pending

MEFE

Unit Creations ---> Pending
Invite a user to a unit ---> Pending

franck-boullier commented 5 years ago

Implemented in all envos, no errors, speed for:

unit creation: 0.067sec
assign user to a role in a unit (with permissions): 0.087sec

kaihendry commented 5 years ago

I needed to add &collation=utf8mb4_unicode_520_ci to Go code's collation SQL connection string to avoid:

Error 1267: Illegal mix of collations (utf8mb4_unicode_520_ci,IMPLICIT) and (utf8mb4_general_ci,IMPLICIT) for operation '='"}

I am a bit surprised I need to do this. In my tests I didn't ExecSQL so perhaps this is where collation needs to be explicit. It's a bit annoying that I have not found a way to test the collation connections from the database are correct.

https://stackoverflow.com/questions/55998671/connecting-with-wrong-collation-to-a-mysql-server

kaihendry commented 5 years ago

I didn't realize were using the newly introduced int as a counter. :scream_cat: As mentioned in slack, a counter for co-coordinating on a micro service architecture such as ours is far too error prone. And if you didn't use the int as a counter, it's still far too easy to clash compared to a string. Sorry, I should have spotted what was happening here.

A string as id should be straightforward to index and lookup. We should investigate why that wasn't fast and fix that instead of using this integer in my opinion. There are other issues that muddied the waters on this bug too, like collation settings when exec-ing SQL being problematic. They appear fixed.

Perhaps we should create a new issue for making unique string IDs fast to look up, assuming that is the problem RE https://github.com/unee-t/bz-database/issues/128#issuecomment-488227233

unee-t / bz-database

Inconsistent collations #128

More information

The prod

The Dev:

Current situation:

The idea:

How it works today

Upstream from the MySQL:

In the BZ database:

The crux of the issue:

Why do we need that variable?

Reason 1: to get the data we need to create the unit:

Reason 2: Send the BZ unit id back to the MEFE:

Moving forward:

Option 1: use an INT to do the BZ SELECT:

Option 2: We find an alternative solution (unknown/unclear at this stage)

Conclusion:

What needs to happen:

Invite user to a unit:

BZ DB:

Routine to pass the info from MEFE to BZ Db

MEFE

Create a new unit:

BZ DB:

Routine to pass the info from MEFE to BZ Db

MEFE

Status Update:

Deployment:

BZ database:

Routine to pass the information from the MEFE to the BZ DB

MEFE