Loading some char(64) data causes a backend crash

amutu commented 10 years ago

CREATE TABLE error4 ( ts timestamp without time zone, data character(64) );

postgres=# select data::bytea from test.error4;

data

\x0b2a2a2a212a2a2ac2a12a2a2ae480a12a2a2ae480a12a2a2ad0a12a2a2a202a2a2ac2a12a2a2a212a2a2ad0a12a2a2ae492a02a2a2ad0a12a2a2ae480a02a2a2ae482a02a2a2ae482a12a2a2ac2a02a2a2a

(1 row)

postgres=# select length(data::bytea) from test.error4;

length

(1 row)

postgres=# select length(data) from test.error4;

length

(1 row)

postgres=# select data from test.error4;

data

\x0B***!***¡***䀡***䀡***С*** ***¡***!***С***䒠***С***䀠***䂠***䂡*** ***

(1 row)

I found that in columnar_store_load() len comes out as 82 instead of 64, and this makes imcs_append_char() segfault:

    str = (char*)VARDATA(t);
    len = VARSIZE(t) - VARHDRSZ;
    if (attr_type_oid[i] == BPCHAROID) {
        /* strip the trailing blanks that pad a char(n) value */
        while (len != 0 && str[len-1] == ' ') {
            len -= 1;
        }
    }
    imcs_append_char(ts, str, len);  /* len is 82 here, which makes memset segfault */

amutu commented 10 years ago

I think it is caused by wide (multibyte) characters: char_length(data) returns 64, but octet_length(data) returns 82.
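To make the 64 versus 82 difference concrete, here is a small standalone C sketch that decodes the hex dump from the data::bytea output above and counts both bytes and UTF-8 characters. It assumes the database encoding is UTF-8; the helper names (hex_decode, utf8_char_count, row_octet_length, row_char_length) are made up for this illustration and are not part of IMCS or PostgreSQL.

```c
#include <stddef.h>

/* Hex dump of the failing value, copied from the data::bytea output above. */
static const char ROW_HEX[] =
    "0b2a2a2a212a2a2ac2a12a2a2ae480a12a2a2ae480a12a2a2ad0a12a2a2a20"
    "2a2a2ac2a12a2a2a212a2a2ad0a12a2a2ae492a02a2a2ad0a12a2a2ae480a0"
    "2a2a2ae482a02a2a2ae482a12a2a2ac2a02a2a2a";

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_nibble(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode a hex string into out; returns the byte count, or 0 on bad input
 * or when out (capacity cap) is too small. A trailing odd digit is ignored. */
size_t hex_decode(const char *hex, unsigned char *out, size_t cap) {
    size_t n = 0;
    while (hex[0] != '\0' && hex[1] != '\0') {
        int hi = hex_nibble(hex[0]);
        int lo = hex_nibble(hex[1]);
        if (hi < 0 || lo < 0 || n == cap) return 0;
        out[n++] = (unsigned char)(hi * 16 + lo);
        hex += 2;
    }
    return n;
}

/* Count UTF-8 characters: every byte that is not a continuation byte
 * (bit pattern 10xxxxxx) starts a new character. */
size_t utf8_char_count(const unsigned char *s, size_t nbytes) {
    size_t chars = 0;
    for (size_t i = 0; i < nbytes; i++) {
        if ((s[i] & 0xC0) != 0x80) chars++;
    }
    return chars;
}

/* Rough analogue of octet_length(data) for the row above. */
size_t row_octet_length(void) {
    unsigned char buf[128];
    return hex_decode(ROW_HEX, buf, sizeof buf);
}

/* Rough analogue of char_length(data) for the row above. */
size_t row_char_length(void) {
    unsigned char buf[128];
    size_t n = hex_decode(ROW_HEX, buf, sizeof buf);
    return utf8_char_count(buf, n);
}
```

Calling row_octet_length() and row_char_length() from a test main gives 82 and 64, the same numbers that octet_length and char_length report.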

knizhnik commented 10 years ago

I have added a check for overly long strings to avoid a server crash in such cases. But you are right: the source of the problem is that the length of a CHARACTER type in PostgreSQL is counted in characters, which may be multibyte (for example in UTF-8), while IMCS stores bytes. One possible workaround is to increase the size of the field:

CREATE TABLE error4 ( ts timestamp without time zone, data character(100) );

This has no effect on how the data is stored in PostgreSQL (since all strings are stored as variable-length data in any case), but IMCS will use a larger element size, so your string will fit in it.
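For readers wondering what a check for overly long strings might look like, here is a minimal sketch. It is not the actual IMCS change: fit_utf8 and store_fixed are hypothetical helpers, and it assumes UTF-8 data and a fixed-size element that is zero-padded. The idea is to clamp the byte length to the element size before copying, and to back up to a character boundary so that a multibyte character is never cut in half.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the actual IMCS code): return how many bytes of a
 * UTF-8 string fit into an element of elem_size bytes without splitting a
 * multibyte character.
 */
size_t fit_utf8(const char *str, size_t len, size_t elem_size) {
    if (len <= elem_size) {
        return len;               /* the whole string fits */
    }
    len = elem_size;
    /* str[len] is the first dropped byte; if it is a continuation byte
     * (10xxxxxx), the last kept character is incomplete, so back up. */
    while (len > 0 && ((unsigned char)str[len] & 0xC0) == 0x80) {
        len--;
    }
    return len;
}

/* Copy at most elem_size bytes into a fixed-size element and zero-pad the
 * remainder. Returns the number of string bytes actually stored. */
size_t store_fixed(char *elem, size_t elem_size, const char *str, size_t len) {
    size_t n = fit_utf8(str, len, elem_size);
    memcpy(elem, str, n);
    memset(elem + n, 0, elem_size - n);   /* n <= elem_size, so never negative */
    return n;
}
```

With this kind of guard the 82-byte value would be truncated to fit a 64-byte element instead of crashing, which is why increasing the declared size is still the better fix if the data must be kept intact.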

Another solution would be to automatically multiply the size of the type by the maximum number of bytes needed to represent a wide character. But this multiplier can be quite large: a single character can take up to 4 bytes in UTF-8. So I do not like this idea.
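As a rough illustration of the cost of that approach, the sketch below computes the worst-case element size for a char(n) column. The function name and the constant are made up for this example; the value 4 assumes UTF-8 (inside the server, I believe pg_database_encoding_max_length() reports the real maximum for the current database encoding).

```c
#include <stddef.h>

/* Assumption: UTF-8, where one character takes at most 4 bytes. */
enum { UTF8_MAX_BYTES_PER_CHAR = 4 };

/* Hypothetical sizing rule: reserve the worst case for every character. */
size_t worst_case_elem_size(size_t n_chars, size_t max_bytes_per_char) {
    return n_chars * max_bytes_per_char;
}
```

For char(64) this reserves 256 bytes per element, while the value from this ticket needs only 82 and the character(100) workaround reserves 100.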

amutu commented 10 years ago

Thanks for your explanation. I will increase the size of the char type. I think this ticket can be closed.

knizhnik / imcs