pgjdbc / pgjdbc

Postgresql JDBC Driver
http://jdbc.postgresql.org
BSD 2-Clause "Simplified" License
1.45k stars 823 forks source link

Whitespace dropped from array string element #1265

Open rodneykinney opened 5 years ago

rodneykinney commented 5 years ago

Using postgresql-42.2.4

The driver will drop whitespace when parsing elements from an array of VARCHAR, if the server chooses to format the array without using quotes to enclose the elements.

corpus=> select version();
                                   version                                    
------------------------------------------------------------------------------
 PostgreSQL 9.6.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.9.3, 64-bit

corpus=> select middle_names, aliases from author where ai2_id = '2869674';
 middle_names |       aliases       
--------------+---------------------
 {P T}        | {"Jeremy P T Ward"}

In the above case, both columns contain a single element, but the server uses quotes to enclose the elements only for one of the columns. Consequently, the driver extracts the middle_names column as ["PT"] instead of ["P T"]. When the server uses binary encoding, the correct value is extracted.

The following unit test illustrates the failure. Reproduced using mocks since it's hard to control how the server decides to encode the result.

TypeInfo mockTypeInfo = Mockito.mock(TypeInfo.class);

Mockito.when(mockTypeInfo.getArrayDelimiter(Matchers.eq(0))).thenReturn(',');
Mockito.when(mockTypeInfo.getPGArrayElement(Matchers.eq(0))).thenReturn(1);
Mockito.when(mockTypeInfo.getSQLType(Matchers.eq(1))).thenReturn(Types.VARCHAR);

BaseConnection mockConnection = Mockito.mock(BaseConnection.class);
Mockito.when(mockConnection.getTypeInfo()).thenReturn(mockTypeInfo);

String[] arr = (String[]) new PgArray(mockConnection, 0, "{A B}").getArray();

assertEquals(arr[0], "A B");
vlsi commented 5 years ago

Is this a duplicate of #1257 ?

rodneykinney commented 5 years ago

Yes, this looks like a duplicate. In my example, the array elements that are unquoted contain non-ASCII whitespace.