Encoding problem with CommandText

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1. string query = "SELECT field FROM table WHERE filed = \"čevapčič\"";
2. MessageBox.Show("UTF string: " + query);
3. cmd.CommandText = query;
4. MessageBox.Show("CommandText: " + cmd.CommandText);

What is the expected output? What do you see instead?

I expected to get two MessageBoxes with this text:

MessageBox1:

UTF string: SELECT field FROM table WHERE filed = "čevapčič"

MessageBox2:

CommandText: SELECT field FROM table WHERE filed = "čevapčič"

Instead the second MessageBox was:

CommandText: SELECT field FROM table WHERE filed = "??evap??i??" 

And I get zero rezults while he should find one.

What version of the product are you using? On what operating system?

Latest downloadable version 3.6.23 on Windows XP.

Please provide any additional information below.

I tried all possible solution but can't get it to return any record with 
Slovenian characters. Please provide a solution.

Thank you in advanced

Original issue reported on code.google.com by thehijac...@gmail.com on 1 Sep 2010 at 7:53

GoogleCodeExporter commented 9 years ago


Hi, I also have the same problem as thehijcacker describes above, my case is 
using JPANAESE though.

I'm using the library like the following:

var query = "select count(*) from table where (search_text like '%あ%')";
var rc = Sqlite3.sqlite3_exec(_db, query, (pArg, nArg, azArgs, azCols) =>
    {
        // Do something to retrieve values
        return Sqlite3.SQLITE_OK;
    }, null, ref errorMessage);

It's supposed to return 278, but I get 867 which is a number of all of the row 
count in the table. Seems LIKE keyword doesn't work at all.

My environment, latest version 3.6.23, and on Windows 7 64bit.

Any help would be appreciated.

Original comment by yooonthe...@gmail.com on 3 Sep 2010 at 8:31

GoogleCodeExporter commented 9 years ago

I've tried this instead:

http://sqlite.phxsoftware.com/

Hardly had to change any of my source code. And its working perfeclty. No 
problems with slovenian characters. I was also able to easily add database 
encryption and added extra character collation for slovenian characters so all 
the table sorting works.

Sorry to say but this never worked for me with csharp-sqlite.

Original comment by thehijac...@gmail.com on 3 Sep 2010 at 8:37

GoogleCodeExporter commented 9 years ago

Original comment by noah.hart@gmail.com on 3 Sep 2010 at 8:49

Changed state: Accepted
Added labels: Milestone-3.7.2

GoogleCodeExporter commented 9 years ago

Hi Noah,

I really need to fix this problem asap instead of using the System.Data.SQLite, 
since my app is using Silverlight. So it would be appreciated if you give me 
some tips where to look at in the source code to fix this problem. I've already 
dived into the library though, it's pretty huge and didn't get where the 
problem is.

Thanks in advance,
Yoo

Original comment by yooonthe...@gmail.com on 5 Sep 2010 at 4:24

GoogleCodeExporter commented 9 years ago

Start with patternCompare in func_c.cs, notice that it calls sqlite3Utf8Read;

I suspect that the problem is in sqlite3Utf8Read or sqlite3Utf8Trans1 in 
utf_c.cs

This routine was an inline macro under C, but rewritten as a function in C#

I hope this helps, and I'll try to look at it next week as well.

Original comment by noah.hart@gmail.com on 5 Sep 2010 at 1:37

GoogleCodeExporter commented 9 years ago

Original comment by noah.hart@gmail.com on 5 Sep 2010 at 1:38

Added labels: Priority-High
Removed labels: Priority-Medium

GoogleCodeExporter commented 9 years ago

Thanks Noah! That must be helpful. I'll take a look at them.

Original comment by yooonthe...@gmail.com on 6 Sep 2010 at 1:49

GoogleCodeExporter commented 9 years ago

Okay, I think I found where the problem is. Yes, you are right. The problem 
lies in sqlite3Utf8Read. In native SQLite, it retrieves a character byte by 
byte from an array as follows: 
c = *(zIn++);

So that you can always get a byte. In contrast, in csharp sqlite, you do the 
same thing, though, against string, if it contains 2 bytes characters such as 
Japanese, it returns two bytes, not a byte. So, I cannot get an expected 
result. 

int c = zIn[zIndex++];

I'll try to modify the source code right now, and keep you posted.

Yoo

Original comment by yooonthe...@gmail.com on 6 Sep 2010 at 6:37

GoogleCodeExporter commented 9 years ago

Hi,

I solved this problem. As I said, the problem was in sqlite3Utf8Read as follows.
int c = zIn[zIndex++];

I changed zIn to List<byte>, and modified all the code calling sqlite3Utf8Read, 
then everything goes pretty well. 

But unfortunately my manager doesn't like SQLite for some reason, and I have to 
use Xml as an alternative of SQLite.

Anyway, thank you Noah, your help was really big!

Yoo

Original comment by yooonthe...@gmail.com on 7 Sep 2010 at 12:03

GoogleCodeExporter commented 9 years ago

Analysis: c was already unicode because parameter zIn was string, not byte array

Proposed Solution, will be part of 3.7.2;  

utf_c.cs line 209
  -if ( c > 0xff ) c = 0;
  -else
  -  {
  -c = sqlite3Utf8Trans1[c - 0xc0];

  +//if ( c > 0xff ) c = 0;
  +//else
  +  {
  +//c = sqlite3Utf8Trans1[c - 0xc0];

Original comment by noah.hart@gmail.com on 7 Sep 2010 at 4:38

Changed state: Fixed

GoogleCodeExporter commented 9 years ago

Issue 73 has been merged into this issue.

Original comment by noah.hart@gmail.com on 7 Sep 2010 at 4:39

mainjo / csharp-sqlite

Encoding problem with CommandText #78