Embarcadero / IB.NETDataProvider

InterBase database .NET Data Provider

Arithmetic Exception Cannot transliterate character between character sets #5

Closed tramesh2009 closed 2 years ago

tramesh2009 commented 3 years ago

Hi,

I installed the ADO.NET provider and tried to save German umlaut characters. They were saved successfully, but they could not be retrieved from the DB. Reading them fails with an arithmetic exception... cannot transliterate character....

How can I resolve this issue when reading these characters, or any other characters?

Thank You

WDI-Json commented 3 years ago

We have the same issue. However, it doesn't seem to matter which encoding we use... :(

tramesh2009 commented 3 years ago

> We have the same issue. However, it doesn't seem to matter which encoding we use... :(

Thank you for your reply. If possible, please share which encoding you are using.

One question: if an encoding is used, the characters might be changed when they are retrieved from the DB. Have you seen this kind of issue?

Thank you.

bhorn commented 2 years ago

What encoding did you create the database with (in IBConsole)? I have had similar problems where the existing database uses charset=None but already contains data (written by a Win32 application) stored as 8 bits per char, and this driver on .NET Core tries to read it as multibyte UTF8. I got either transliteration errors or, when using 'charset=None' in the connection string, '?' characters. To fix it, I had to use reflection to override the private _encoding field the driver uses.

Here is a snippet if it helps (use with charset=None in connection string):

        // Requires the NuGet package System.Text.Encoding.CodePages.
        using System.Reflection;
        using System.Text;
        using InterBaseSql.Data.InterBaseClient;
        ...
        // Make legacy code pages such as Windows-1252 available on .NET Core.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Locate the provider's Charset type and its default Charset instance.
        var asm = Assembly.GetAssembly(typeof(IBConnection));
        var charsetType = asm.GetType("InterBaseSql.Data.Common.Charset");
        var defaultCharsetProperty = charsetType.GetProperty("DefaultCharset");
        var defaultCharset = defaultCharsetProperty.GetGetMethod().Invoke(null, null);

        // Overwrite the private _encoding field so strings are decoded as Windows-1252.
        var encodingField = charsetType.GetField("_encoding", BindingFlags.NonPublic | BindingFlags.Instance);
        encodingField.SetValue(defaultCharset, Encoding.GetEncoding("Windows-1252"));
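
For anyone wondering why 8-bit data trips up a UTF8 reader, here is a minimal sketch that does not touch the provider at all. It decodes the same bytes three ways. The class and method names are made up for illustration, and ISO-8859-1 stands in for Windows-1252 (the two agree on ü and ß) so that no extra code page package is needed.

```csharp
using System.Text;

// Hypothetical demo class; not part of the InterBase provider.
public static class CharsetMismatchDemo
{
    // Decode as a single-byte code page (ISO-8859-1 as a stand-in for Windows-1252).
    public static string DecodeSingleByte(byte[] raw)
    {
        return Encoding.GetEncoding("iso-8859-1").GetString(raw);
    }

    // Decode as UTF-8 with the default lenient behaviour:
    // invalid bytes are replaced with U+FFFD instead of raising an error.
    public static string DecodeUtf8Lenient(byte[] raw)
    {
        return Encoding.UTF8.GetString(raw);
    }

    // Decode as UTF-8 in strict mode: invalid bytes raise DecoderFallbackException.
    public static bool TryDecodeUtf8Strict(byte[] raw, out string text)
    {
        var strictUtf8 = new UTF8Encoding(false, true); // no BOM, throw on invalid bytes
        try
        {
            text = strictUtf8.GetString(raw);
            return true;
        }
        catch (DecoderFallbackException)
        {
            text = string.Empty;
            return false;
        }
    }
}
```

With the bytes `47 72 FC DF 65` ("Grüße" as a Win32 application would store it), the single-byte decode gives the original text back, the lenient UTF-8 decode produces replacement characters, and the strict UTF-8 decode fails. That is the same kind of mismatch as the transliteration error, just reproduced on the client side.
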
jeffovercash commented 2 years ago

Also, are you setting the character set in the connection string? Try setting UTF8 and see if that works. The server transliterates from the character set the data is stored in to UTF8 before putting it on the wire, and the driver then converts the UTF8 to Unicode. There is code to do all of that, but no unit tests were written originally in the Fb driver (the basis of this one), and I have not yet written tests for setting the lc_ctype in a connection string (there are connection string tests that check the character set is parsed properly, but none for its use in a connection).
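
A rough sketch of what setting the character set in the connection string could look like. Only the `charset` key comes from this thread. The other keys, the path, and the credentials are placeholders that should be checked against the provider's documentation. The sketch uses the generic `DbConnectionStringBuilder` from `System.Data.Common`, so it runs without the provider installed.

```csharp
using System.Data.Common;

// Hypothetical helper; not part of the InterBase provider.
public static class ConnectionStringSketch
{
    // Builds a connection string with an explicit charset, e.g. "UTF8" or "None".
    public static string Build(string charset)
    {
        var builder = new DbConnectionStringBuilder();
        builder["database"] = @"localhost:C:\data\example.ib"; // placeholder path (assumption)
        builder["user"] = "sysdba";                            // placeholder (assumption)
        builder["password"] = "masterkey";                     // placeholder (assumption)
        builder["charset"] = charset;                          // key as used in this thread
        return builder.ConnectionString;
    }
}
```

The result of `ConnectionStringSketch.Build("UTF8")` would then be handed to the connection, presumably via something like `new IBConnection(connectionString)`, assuming the provider follows the usual ADO.NET constructor pattern.
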

jeffovercash commented 2 years ago

In the next release, the default will be None, not UTF8. None makes more sense, in that characters should not normally be translated unless the dev explicitly sets a character set (the lc_ctype) to have all data translated to that type.

What bhorn suggested is closer to how I wrote it in IBX over in Delphi. IBX stores the encoding at the IBDatabase level; when reading a char or varchar, the data is converted from the lc_ctype encoding to Unicode, and when writing, the reverse. The original Fb authors took the approach of interpreting the character set from the SQLVar information and picking the right encoding from the Charset class. In the case bhorn gave his workaround for, neither really works: at the connection level it needs to be None or you get the "cannot transliterate" error, but at the DB field level the driver does not know that the data is not plain ASCII (or ANSI; it is one of the two).

Normally you would just set the lc_ctype, but sometimes that can result in "not able to transliterate" errors.
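
To make the difference between the two approaches concrete, here is a simplified sketch. It is not the actual IBX or driver code. The class name, the charset-to-encoding table, and the ASCII fallback for NONE are all assumptions chosen for illustration.

```csharp
using System.Text;

// Hypothetical illustration of the two decoding strategies discussed above.
public static class CharsetStrategies
{
    // IBX-style: one encoding is held per connection and applied to every
    // char/varchar value that is read.
    public static string DecodeWithConnectionEncoding(byte[] raw, Encoding connectionEncoding)
    {
        return connectionEncoding.GetString(raw);
    }

    // Fb-style: the encoding is picked per column from the charset name
    // reported in the column metadata.
    public static string DecodeWithColumnCharset(byte[] raw, string columnCharset)
    {
        return EncodingFor(columnCharset).GetString(raw);
    }

    // Assumed mapping from charset names to .NET encodings.
    private static Encoding EncodingFor(string columnCharset)
    {
        switch (columnCharset.ToUpperInvariant())
        {
            case "UTF8":
                return Encoding.UTF8;
            case "ISO8859_1":
                return Encoding.GetEncoding("iso-8859-1");
            default:
                // Assumption: a NONE column says nothing about its bytes,
                // so this sketch falls back to ASCII.
                return Encoding.ASCII;
        }
    }
}
```

With the bytes `47 72 FC DF 65` in a NONE column, the per-column lookup in this sketch yields `Gr??e`, because ASCII cannot represent the two high bytes, while a connection-level ISO-8859-1 encoding recovers "Grüße". This mirrors the '?' symptom bhorn described: the column metadata alone does not say how the 8-bit data was written.
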

jeffovercash commented 2 years ago

7.12.1 now uses None if no charset is set in the connection string.