Index of many unicode characters results in "internal gds software consistency check" [CORE1502]

firebird-automations commented 17 years ago

Submitted by: Richard Wesley (hawkfish)

This looks a lot like CORE1049 et alia, but it is showing up in 2.0.3:

CREATE TABLE "Sheet1$" ( "Appearance" VARCHAR(6) CHARACTER SET UTF8 COLLATE UNICODE, "Decimal" FLOAT(53), "Name" VARCHAR(83) CHARACTER SET UTF8 COLLATE UNICODE, "Position" VARCHAR(6) CHARACTER SET UTF8 COLLATE UNICODE );

INSERT INTO "Sheet1$" ("Appearance", "Decimal", "Name", "Position") VALUES(?, ?, ?, ?);

// At this point, I inserted a table of about 10600 rows with distinct
// single character ucs2 code points in the "Appearance" column using
// the API. Many thanks to my sadistic test cr?e...
// We use UTF8 for the communication character set and convert the ucs2
// to utf8 using the standard Windows character conversion routines.

// If I now say:

CREATE INDEX "_tidx_128_1a" ON "Sheet1$" ("Appearance");

// we log the following error in our application:

--- FB Error ---------------------------------------------------------------
File: db\firebirdprotocol.cpp, Line: 1921
Status: 335544333
internal gds software consistency check (index key too big (174), file: idx.cpp line: 448)
----------------------------------------------------------------------------

firebird-automations commented 17 years ago

Commented by: Richard Wesley (hawkfish)

This is a zip of a Firebird database. I think it has a .tde extension but you can just change that to fdb.

firebird-automations commented 17 years ago

Modified by: Richard Wesley (hawkfish)

Attachment: CREATETABLETEST.FDB.zip [ 10610 ]

firebird-automations commented 16 years ago

Commented by: Richard Wesley (hawkfish)

The issue does not appear in 2.1b1.

firebird-automations commented 16 years ago

Modified by: @pcisar

Workflow: jira [ 13260 ] => Firebird [ 13990 ]

firebird-automations commented 15 years ago

Commented by: @asfernandes

There is a record with the character U+FDFA. You can see it here: http://www.fileformat.info/info/unicode/char/fdfa/index.htm.

Note that it decomposes into many other characters. When we request its sort key, ICU returns 55 bytes. How can we deal with that? With our current fixed-size buffers for keys, there is no way...
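
To make the decomposition concrete, here is a minimal sketch using Python's standard `unicodedata` module. It only shows the Unicode compatibility decomposition (NFKD) of U+FDFA; it does not reproduce the ICU sort key or Firebird's key buffers. The helper name `nfkd_expansion` is made up for this illustration.

```python
import unicodedata

def nfkd_expansion(ch: str) -> int:
    """Number of code points a single character expands to under NFKD.

    Hypothetical helper for illustration; this is not ICU's sort key length.
    """
    return len(unicodedata.normalize("NFKD", ch))

ligature = "\uFDFA"
decomposed = unicodedata.normalize("NFKD", ligature)

# One code point in the column...
print(unicodedata.name(ligature))
print("stored UTF-8 bytes:", len(ligature.encode("utf-8")))

# ...expands to a whole phrase once decomposed.
print("code points after NFKD:", nfkd_expansion(ligature))
print("UTF-8 bytes after NFKD:", len(decomposed.encode("utf-8")))
```

A single 3-byte UTF-8 character therefore turns into 18 code points before any collation weights are generated, which fits with the sort key being far larger than the declared column width would suggest.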

The attached test case uses synthetic data, so this problem does not seem to be affecting our users. I believe it's low priority.

Index of many unicode characters results in "internal gds software consistency check" [CORE1502] #1916