rafaelcalixto closed this 1 year ago
While correct, it's also a very crude solution. I would rather decode the message correctly whenever possible (i.e. have an explicit encoding parameter for decode) and use 'replace' only as a fallback if that fails. So I had to reject this patch "as is".
A 'system_charset' parameter on fdb.Connection that would be passed to exception_from_status() seems appropriate. It would also need support in connect()/create_database() (although these already have a lot of parameters), which would set it on the created Connection instance and also use it directly, to avoid falling back on errors raised in connect/create_database themselves.
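To make the proposal concrete, here is a minimal sketch of "decode with an explicit charset first, fall back to 'replace' if that fails". The function name `decode_status_message` and its signature are hypothetical and are not part of fdb; only the `system_charset` idea comes from the comment above.

```python
from typing import Optional


def decode_status_message(raw: bytes, system_charset: Optional[str] = None) -> str:
    """Decode a server message, preferring an explicit charset.

    Hypothetical helper for illustration only; not fdb's actual code.
    """
    if system_charset is not None:
        try:
            # Preferred path: the caller told us how the server encodes messages.
            return raw.decode(system_charset)
        except UnicodeDecodeError:
            # The explicit charset did not fit; fall through to the lenient path.
            pass
    # Fallback: never raise, substitute U+FFFD for undecodable bytes.
    return raw.decode("utf-8", errors="replace")
```

With a charset given, a LATIN-1 message decodes to readable text; without one, the same bytes still decode, but the undecodable byte is shown as the replacement character.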
There is example code for connecting to the database with a charset different from your system charset. It's very simple.
But! I once worked with a corrupted database, and some rows raised a UTF-8 decode exception. I solved this by setting charset='WIN1252' instead of 'WIN1251'. So instead of readable strings I got gibberish, and then I decoded the strings manually.
That's why I think the core idea of the patch is very good.
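The manual decoding step could look roughly like the sketch below. It assumes the text was really stored as WIN1251 but was decoded by the driver as WIN1252, and that Python's `cp1252` and `cp1251` codecs correspond to those Firebird charsets. The helper name `redecode_win1251` is made up for this example.

```python
def redecode_win1251(garbled: str) -> str:
    """Recover WIN1251 text that was mistakenly decoded as WIN1252.

    Hypothetical helper: re-encoding with cp1252 restores the original
    bytes, which are then decoded with the charset they were written in.
    """
    original_bytes = garbled.encode("cp1252")
    # 'replace' keeps the row readable even if a few bytes are corrupted.
    return original_bytes.decode("cp1251", errors="replace")


# Simulate what the driver hands back when the wrong charset is configured.
garbled = "Привет".encode("cp1251").decode("cp1252")
restored = redecode_win1251(garbled)
```

One caveat: Python's cp1252 codec leaves a handful of byte values undefined, so this round trip only works for rows whose bytes all map to cp1252 characters.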
As far as I can see, enough of the log comes through before the error ('utf-8' codec can't decode byte 0xc4 in position 0: unexpected end of data), and the only problem is a single byte arriving at the end of the error text being read... so this fix is OK if you ask me. The alternative is to change the error-reading code and struggle to find out why it returns a single byte that cannot be decoded as UTF-8.
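The failure described here can be reproduced with plain Python, independent of fdb. In this sketch the byte string is invented for illustration. 0xC4 is the lead byte of a two-byte UTF-8 sequence, so when it is the last byte of the buffer the strict decoder reports "unexpected end of data", while 'replace' keeps everything before it.

```python
# Invented sample: readable ASCII text followed by one stray lead byte.
raw = b"error text\xc4"

reason = None
try:
    raw.decode("utf-8")  # strict decoding fails on the dangling byte
except UnicodeDecodeError as exc:
    reason = exc.reason  # human-readable cause reported by the codec

# Lenient decoding keeps the message and marks the bad byte with U+FFFD.
lenient = raw.decode("utf-8", errors="replace")
```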
Thank you for this pull request, it saved me a lot of time; 'replace' can also be used on line 489 of the same file.
My system works in UTF-8, but my Firebird sends me messages in LATIN-1. So I added an error handler to the encode function so that the application doesn't break on those messages.