OSGeo / gdal

GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
https://gdal.org
Other
4.65k stars 2.46k forks source link

SWIG Layer.SetAttributeFilter method fails with MSSQL for non utf-8 strings #10225

Open ttruefix14 opened 3 weeks ago

ttruefix14 commented 3 weeks ago

What is the bug?

SWIG SetAttributeFilter method converts strings to utf-8 byte array. As i understand MSSQL trying to decode string as cp1251 encoded (or other depending on MSSQL collation) and filter is not working properly

Tried this with C# bindings and python bindings with the same result.

Linked issue: MaxRev-Dev/gdal.netcore#135

Steps to reproduce the issue

Use SetAttributeFilter with any cyrillic symbols on MSSQL dataset with cyrillic collation.

Versions and provenance

OS: Windows 11 GDAL version: 3.9.0 C# library: https://github.com/MaxRev-Dev/gdal.netcore Python library: GDAL 3.3.3 MSSQL: docker image mcr.microsoft.com/mssql/server:2019-latest with collation: Cyrillic_General_CI_AS

Additional context

Temporary Solution Invoke gdal function directly and pass string with proper encoding (in c#) or properly encoded bytearray (C# and python)

In C# with cyrillic encoding (string)

OGR_L_SetAttributeFilter(Layer.getCPtr(inputLayer), filter);

[DllImport("gdal")]
static extern int OGR_L_SetAttributeFilter(HandleRef layer, [MarshalAs(UnmanagedType.LPStr)] string filter);

In C# with cyrillic encoding (bytearray)

OGR_L_SetAttributeFilter(Layer.getCPtr(inputLayer), Encoding.GetEncoding(1251).GetBytes(filter));

[DllImport("gdal")]
static extern int OGR_L_SetAttributeFilter(HandleRef layer, byte[] filter);

In Python with cyrillic encoding

import ctypes as C
gdal_dll = C.CDLL("gdal\dll\path")
gdal_dll.OGR_L_SetAttributeFilter(C.c_void_p(int(input_layer.this)), filter_string.encode("cp1251"))

Sugestion I think best solution would be modify OGRSQL converting to transact sql - passing all strings like unicode (N'string') (if i understand the problem correctly) or add overload of SetAttributeFilter in SWIG with bytearray filter argument so we can pass propetly encoded byte arrays to method if needed

rouault commented 3 weeks ago

CC @szekerest