AnatolyUss / nmig

NMIG is a database migration tool, written in Node.js and highly inspired by FromMySqlToPostgreSql.
GNU General Public License v3.0
451 stars 83 forks source link

Invalid hexadecimal digit #128

Open sualko opened 3 months ago

sualko commented 3 months ago

Thanks for this nice tool. It's is doing an amazing job for large parts. We encountered only issues with binary data.

[114] STATEMENT:  UPDATE public."asset" SET "labelIcon" = DECODE(ENCODE("labelIcon", 'escape'), 'hex');
[99] ERROR:  invalid hexadecimal digit: "%"

The column is used to store small png icons.

`labelIcon` longblob NOT NULL,

A truncated example output is:

�PNG
�
IHDR  ,      �   �PLTE���
...

Or as hex:

0x89504E470D0A1A0A0000000D494844520000012C00000
...

I found https://github.com/AnatolyUss/nmig/issues/52, but I did not find any text in that column. Everything looks like binary data.

MySQL server is version 8 and PostgreSQL was 13 (because first I tried pgloader).

Do you have any idea?

sualko commented 3 months ago

It happens also on other columns with e.g. fido keys.

[103] ERROR:  invalid hexadecimal digit: "\"
[103] STATEMENT:  UPDATE public."fido_authenticator" SET "credentialPublicKey" = DECODE(ENCODE("credentialPublicKey", 'escape'), 'hex');
AnatolyUss commented 2 months ago

Hello Klaus, Thanks for the feedback!

Honestly speaking, blob migration was, and is, an issue. It works properly only with blobs containing text data.

sualko commented 2 months ago

Do you have any idea how we can mitigate this issue?

sualko commented 2 months ago

@AnatolyUss Can you help me a bit to debug this issue? In particular what the binary decoder does. What's the purpose of it? It seams that it will try to decode binary strings with hex. How should this work? https://github.com/AnatolyUss/nmig/blob/3b31651cae0355d1be3de0567816425715705734/src/BinaryDataDecoder.ts#L64-L65

AnatolyUss commented 2 months ago

Binary decoder, decodes binary data from textual representation after completing COPY streams. It is relevant for any sort of blob column. Actually, if the "slow" table contains blob column/s - then it is perfectly explicable why the migration takes so much time. For now, nmig is very slow when there're blobs to migrate.

More importantly, if those blobs contain a non-textual data - the errors/inconsistencies in resulting PG BYTEA columns are likely. If this is the case - I suggest to use another tool. Otherwise, nmig should be fine.

In general, tables without blob columns usually get migrated quite fast.

On Mon, Aug 5, 2024 at 11:14 AM Klaus @.***> wrote:

@AnatolyUss https://github.com/AnatolyUss Can you help me a bit to debug this issue? In particular what the binary decoder does. What's the purpose of it? It seams that it will try to decode binary strings with hex. How should this work? https://github.com/AnatolyUss/nmig/blob/3b31651cae0355d1be3de0567816425715705734/src/BinaryDataDecoder.ts#L64-L65

— Reply to this email directly, view it on GitHub https://github.com/AnatolyUss/nmig/issues/128#issuecomment-2268449465, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADHR7NZJB4XMGZ5ASGPVWO3ZP4X7VAVCNFSM6AAAAABK36ZDEOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENRYGQ2DSNBWGU . You are receiving this because you were mentioned.Message ID: @.***>