mattermost / migration-assist

A helper tool to automate MySQL -> Postgres migration for Mattermost
https://docs.mattermost.com/deploy/postgres-migration.html
2 stars 2 forks source link

Warnings During pgloader Migration: Type Casting and Constraint Mismatches in Mattermost Database Migration to PostgreSQL #31

Open ylluminate opened 1 month ago

ylluminate commented 1 month ago

During the pgloader migration.load process, a series of warnings were logged related to type casting mismatches and missing constraints between the Mattermost source database and the target PostgreSQL database.

These warnings indicate differences in data types, such as varchar to text or smallint to boolean, as well as missing foreign key constraints. Should these discrepancies be resolved or addressed prior to fully trusting the migration to PostgreSQL?

Below is the log output in question:

2024-10-25T07:36:06.072040-04:00 WARNING Source column "mattermost"."ChannelBookmarks"."displayname" is casted to type "varchar" which is not the same as "text", the type of current target database column "mattermost"."channelbookmarks"."displayname".
2024-10-25T07:36:06.073040-04:00 WARNING Source column "mattermost"."ChannelBookmarks"."sortorder" is casted to type "bigint" which is not the same as "integer", the type of current target database column "mattermost"."channelbookmarks"."sortorder".
2024-10-25T07:36:06.073040-04:00 WARNING Source column "mattermost"."ChannelBookmarks"."linkurl" is casted to type "varchar" which is not the same as "text", the type of current target database column "mattermost"."channelbookmarks"."linkurl".
2024-10-25T07:36:06.073040-04:00 WARNING Source column "mattermost"."ChannelBookmarks"."imageurl" is casted to type "varchar" which is not the same as "text", the type of current target database column "mattermost"."channelbookmarks"."imageurl".
2024-10-25T07:36:06.073040-04:00 WARNING Source column "mattermost"."ChannelBookmarks"."type" is casted to type "channel_bookmark_type" which is not the same as "mattermost.channel_bookmark_type", the type of current target database column "mattermost"."channelbookmarks"."type".
2024-10-25T07:36:06.073040-04:00 WARNING Source column "mattermost"."ChannelMembers"."notifyprops" is casted to type "text" which is not the same as "jsonb", the type of current target database column "mattermost"."channelmembers"."notifyprops".
2024-10-25T07:36:06.073040-04:00 WARNING Source column "mattermost"."ChannelMembers"."schemeuser" is casted to type "smallint" which is not the same as "boolean", the type of current target database column "mattermost"."channelmembers"."schemeuser".
2024-10-25T07:36:06.073040-04:00 WARNING Source column "mattermost"."ChannelMembers"."schemeadmin" is casted to type "smallint" which is not the same as "boolean", the type of current target database column "mattermost"."channelmembers"."schemeadmin".
2024-10-25T07:36:06.073040-04:00 WARNING Source column "mattermost"."Channels"."type" is casted to type "channel_type" which is not the same as "mattermost.channel_type", the type of current target database column "mattermost"."channels"."type".
2024-10-25T07:36:06.073040-04:00 WARNING Source column "mattermost"."Jobs"."data" is casted to type "text" which is not the same as "jsonb", the type of current target database column "mattermost"."jobs"."data".
2024-10-25T07:36:06.073040-04:00 WARNING Source column "mattermost"."LinkMetadata"."data" is casted to type "text" which is not the same as "jsonb", the type of current target database column "mattermost"."linkmetadata"."data".
2024-10-25T07:36:06.073040-04:00 WARNING Source column "mattermost"."OutgoingOAuthConnections"."oauthtokenurl" is casted to type "varchar" which is not the same as "text", the type of current target database column "mattermost"."outgoingoauthconnections"."oauthtokenurl".
2024-10-25T07:36:06.074041-04:00 WARNING Source column "mattermost"."OutgoingOAuthConnections"."granttype" is casted to type "varchar" which is not the same as "mattermost.outgoingoauthconnections_granttype", the type of current target database column "mattermost"."outgoingoauthconnections"."granttype".
2024-10-25T07:36:06.074041-04:00 WARNING Source column "mattermost"."Posts"."props" is casted to type "text" which is not the same as "jsonb", the type of current target database column "mattermost"."posts"."props".
2024-10-25T07:36:06.074041-04:00 WARNING Source column "mattermost"."Preferences"."value" is casted to type "varchar" which is not the same as "text", the type of current target database column "mattermost"."preferences"."value".
2024-10-25T07:36:06.074041-04:00 WARNING Source column "mattermost"."RetentionIdsForDeletion"."ids" is casted to type "jsonb" which is not the same as "character varying[]", the type of current target database column "mattermost"."retentionidsfordeletion"."ids".
2024-10-25T07:36:06.074041-04:00 WARNING Source column "mattermost"."Sessions"."props" is casted to type "text" which is not the same as "jsonb", the type of current target database column "mattermost"."sessions"."props".
2024-10-25T07:36:06.074041-04:00 WARNING Source column "mattermost"."TeamMembers"."schemeuser" is casted to type "smallint" which is not the same as "boolean", the type of current target database column "mattermost"."teammembers"."schemeuser".
2024-10-25T07:36:06.074041-04:00 WARNING Source column "mattermost"."TeamMembers"."schemeadmin" is casted to type "smallint" which is not the same as "boolean", the type of current target database column "mattermost"."teammembers"."schemeadmin".
2024-10-25T07:36:06.074041-04:00 WARNING Source column "mattermost"."TeamMembers"."schemeguest" is casted to type "smallint" which is not the same as "boolean", the type of current target database column "mattermost"."teammembers"."schemeguest".
2024-10-25T07:36:06.074041-04:00 WARNING Source column "mattermost"."Teams"."type" is casted to type "team_type" which is not the same as "mattermost.team_type", the type of current target database column "mattermost"."teams"."type".
2024-10-25T07:36:06.074041-04:00 WARNING Source column "mattermost"."Threads"."participants" is casted to type "text" which is not the same as "jsonb", the type of current target database column "mattermost"."threads"."participants".
2024-10-25T07:36:06.074041-04:00 WARNING Source column "mattermost"."UploadSessions"."type" is casted to type "upload_session_type" which is not the same as "mattermost.upload_session_type", the type of current target database column "mattermost"."uploadsessions"."type".
2024-10-25T07:36:06.075041-04:00 WARNING Source column "mattermost"."Users"."props" is casted to type "text" which is not the same as "jsonb", the type of current target database column "mattermost"."users"."props".
2024-10-25T07:36:06.075041-04:00 WARNING Source column "mattermost"."Users"."notifyprops" is casted to type "text" which is not the same as "jsonb", the type of current target database column "mattermost"."users"."notifyprops".
2024-10-25T07:36:06.075041-04:00 WARNING Source column "mattermost"."Users"."timezone" is casted to type "text" which is not the same as "jsonb", the type of current target database column "mattermost"."users"."timezone".
2024-10-25T07:36:06.088041-04:00 WARNING PostgreSQL warning: constraint "fk_retentionpolicieschannels_retentionpolicies" of relation "retentionpolicieschannels" does not exist, skipping
2024-10-25T07:36:06.089041-04:00 WARNING PostgreSQL warning: constraint "fk_retentionpoliciesteams_retentionpolicies" of relation "retentionpoliciesteams" does not exist, skipping
isacikgoz commented 4 weeks ago

Should these discrepancies be resolved or addressed prior to fully trusting the migration to PostgreSQL

Hey @ylluminate you don't need to worry about this, I haven't came across any issues regarding to type casting. They are defined in the pgloader load file. After the migration, you can consider using https://github.com/mattermost/dbcmp for 1-1 comparison. Though, you may need to exclude a few tables to avoid confusion as some tables may differ. See: https://github.com/mattermost/dbcmp

ylluminate commented 3 weeks ago

Thanks @isacikgoz - I appreciate your response and the helpful tool recommendation. The dbcmp tool looks particularly useful for verification.

Could you share which tables you typically exclude when using dbcmp? Since you mentioned some tables may differ, it would be helpful to know which ones commonly need exclusion to avoid false positives during comparison.

Also, I notice several JSON-related casts in the warnings (text->jsonb). Have you found these specific conversions to be consistently reliable in practice?

isacikgoz commented 3 weeks ago

I think excluding db_migrations,ir_,focalboard,calls,systems would make sense for setting the bare minimums. Also I'd exclude configurations and related tables configuration_files and db_config_migrations if you are using config in DB.

Also, I notice several JSON-related casts in the warnings (text->jsonb). Have you found these specific conversions to be consistently reliable in practice?

Yes they should be pretty reliable from my experience, they are both text based but jsonb data type adds more utility over JSON types built-in.