lurcher / unixODBC

The unixODBC Project goals are to develop and promote unixODBC to be the definitive standard for ODBC on non MS Windows platforms.
GNU Lesser General Public License v2.1
99 stars 52 forks source link

Nullable field in SQLSpecialColumns broken on macOS ARM64 #60

Open kadler opened 3 years ago

kadler commented 3 years ago

Running some tests while bringing up or driver on an M1 Mac I found some tests failing when passing SQL_NULLABLE to SQLSpecialColumns and getting no rows back, when there should have been. This test passes across multiple platforms and architectures (Mac, Windows, Linux, IBM i, x86, PPC) but is only failing on M1 Mac.

Diving in to the assembly, I think I understand the problem (ARM64 assembly is pretty new to me, though).

SQLSpecialColumns has 10 arguments. The calling convention on ARM64 has the first 8 in r0-r7 so the remaining two must be passed on the stack. They are SQLUSMALLING scope and SQLUSMALLINT nullable. Here's the assembly from the application:

0x10023c898 <+112>: ldur   x10, [x29, #-0x8]         // re-load parameters 1-8
0x10023c89c <+116>: ldurh  w1, [x29, #-0xa]             |
0x10023c8a0 <+120>: ldur   x2, [x29, #-0x18]            |
0x10023c8a4 <+124>: ldursh w3, [x29, #-0x1a]            |
0x10023c8a8 <+128>: ldr    x4, [sp, #0x28]              |
0x10023c8ac <+132>: ldrsh  w5, [sp, #0x26]              |
0x10023c8b0 <+136>: ldr    x6, [sp, #0x18]              |
0x10023c8b4 <+140>: ldrsh  w7, [sp, #0x16]              |
0x10023c8b8 <+144>: ldrh   w8, [sp, #0x14]              |
0x10023c8bc <+148>: ldrh   w9, [sp, #0x12]              |
0x10023c8c0 <+152>: mov    x0, x10                   
0x10023c8c4 <+156>: mov    x10, sp                   // set up args 9+
0x10023c8c8 <+160>: strh   w8, [x10]                 // push scope on stack
0x10023c8cc <+164>: strh   w9, [x10, #0x2]           // push nullable on stack
0x10023c8d0 <+168>: bl     0x10024ffd8               ; symbol stub for: SQLSpecialColumns

The two unsigned shorts are pushed to the first 4 bytes of the stack, then SQLSpecialColumns is called. Now, the driver manager does a bunch of stuff, but here's the relevant part right before the driver is called:

0x100176e38 <+1128>: ldr    x8, [x8, #0x1220]         // Load SQLSpecialColumns function pointer
0x100176e3c <+1132>: cbz    x8, 0x100176f48           // Check if null pointer
0x100176e40 <+1136>: ldr    x0, [x19, #0x420]         // Load driver's handle from application's handle
0x100176e44 <+1140>: stp    w24, w22, [sp]            // Push scope, nullable to the stack
0x100176e48 <+1144>: mov    x1, x20                   // set up args remaining args
0x100176e4c <+1148>: mov    x2, x11
0x100176e50 <+1152>: mov    x3, x25
0x100176e54 <+1156>: ldp    x4, x6, [sp, #0x50]
0x100176e58 <+1160>: mov    x5, x23
0x100176e5c <+1164>: mov    x7, x21
0x100176e60 <+1168>: blr    x8                        // Call driver

All this seems fine at first glance, but the problem is the stp w24, w22, [sp] instruction. This pushes two 32-bit values to the stack, but scope and nullable are 16-bit values!

The stack looks like this:

(lldb) memory read -c 8 0x000000016fdfe770
0x16fdfe770: 00 00 00 00 01 00 00 00                          ........

When we get to the driver code, the assembly looks like this

0x10051375c <+8>:  add    x29, sp, #0x10            ; =0x10 
0x100513760 <+12>: sub    sp, sp, #0x270            ; =0x270 
0x100513764 <+16>: ldrh   w8, [x29, #0x10]          // load scope from saved sp
0x100513768 <+20>: ldrh   w9, [x29, #0x12]          // load nullable from saved sp+2

(lldb) re read w8 w9
      w8 = 0x00000000 // SQL_SCOPE_CURROW
      w9 = 0x00000000 // SQL_NO_NULLS, should be SQL_NULLABLE

The C code for this is here: https://github.com/lurcher/unixODBC/blob/master/DriverManager/SQLSpecialColumns.c#L436-L446

        ret = SQLSPECIALCOLUMNS( statement -> connection ,
                statement -> driver_stmt,
                identifier_type,
                catalog_name,
                name_length1,
                schema_name,
                name_length2,
                table_name,
                name_length3,
                scope,
                nullable );

It calls through this macro, which finds the correct function pointer from the function pointer array and then calls it: https://github.com/lurcher/unixODBC/blob/master/DriverManager/drivermanager.h#L1453-L1455

The problem here is that all the functions are all generic function pointers: https://github.com/lurcher/unixODBC/blob/master/DriverManager/drivermanager.h#L151 and that means that C's integer promotion rules will kick in and promote the shorts to ints.

    SQLRETURN   (*func)();
    SQLRETURN   (*funcW)();             /* function with a unicode W */
    SQLRETURN   (*funcA)();             /* function with a unicode A */

If the function is cast explicitly to the actual function prototype for SQLSpecialColumns, ie. SQLRETURN (*)(SQLHSTMT, SQLUSMALLINT, SQLCHAR*, SQLSMALLINT, SQLCHAR*, SQLSMALLINT, SQLCHAR*, SQLSMALLINT, SQLUSMALLINT, SQLUSMALLINT) before calling, I suspect that would fix it.

lurcher commented 3 years ago

On 26/02/2021 03:59, Kevin Adler wrote:

Running some tests while bringing up or driver on an M1 Mac I found some tests failing when passing SQL_NULLABLE to SQLSpecialColumns and getting no rows back, when there should have been. This test passes across multiple platforms and architectures (Mac, Windows, Linux, IBM i, x86, PPC) but is only failing on M1 Mac.

Diving in to the assembly, I think I understand the problem (ARM64 assembly is pretty new to me, though).

SQLSpecialColumns has 10 arguments. The calling convention on ARM64 has the first 8 in r0-r7 so the remaining two must be passed on the stack. They are |SQLUSMALLING scope| and |SQLUSMALLINT nullable|. Here's the assembly from the application:

|0x10023c898 <+112>: ldur x10, [x29, #-0x8] // re-load all the parameters 0x10023c89c <+116>: ldurh w1, [x29, #-0xa] | 0x10023c8a0 <+120>: ldur x2, [x29, #-0x18] | 0x10023c8a4 <+124>: ldursh w3, [x29,

-0x1a] | 0x10023c8a8 <+128>: ldr x4, [sp, #0x28] | 0x10023c8ac

<+132>: ldrsh w5, [sp, #0x26] | 0x10023c8b0 <+136>: ldr x6, [sp,

0x18] | 0x10023c8b4 <+140>: ldrsh w7, [sp, #0x16] | 0x10023c8b8

<+144>: ldrh w8, [sp, #0x14] | 0x10023c8bc <+148>: ldrh w9, [sp,

0x12] | 0x10023c8c0 <+152>: mov x0, x10 0x10023c8c4 <+156>: mov x10,

sp // set up args 9+ 0x10023c8c8 <+160>: strh w8, [x10] // push scope on stack 0x10023c8cc <+164>: strh w9, [x10, #0x2] // push nullable on stack 0x10023c8d0 <+168>: bl 0x10024ffd8 ; symbol stub for: SQLSpecialColumns |

The two unsigned shorts are pushed to the first 4 bytes of the stack, then SQLSpecialColumns is called. Now, the driver manager does a bunch of stuff, but here's the relevant part right before the driver is called:

|0x100176e38 <+1128>: ldr x8, [x8, #0x1220] // Load SQLSpecialColumns function pointer 0x100176e3c <+1132>: cbz x8, 0x100176f48 // Check if null pointer 0x100176e40 <+1136>: ldr x0, [x19, #0x420] // Load driver's handle from application's handle 0x100176e44 <+1140>: stp w24, w22, [sp] // Push scope, nullable to the stack 0x100176e48 <+1144>: mov x1, x20 // set up args remaining args 0x100176e4c <+1148>: mov x2, x11 0x100176e50 <+1152>: mov x3, x25 0x100176e54 <+1156>: ldp x4, x6, [sp, #0x50] 0x100176e58 <+1160>: mov x5, x23 0x100176e5c <+1164>: mov x7, x21 0x100176e60 <+1168>: blr x8 // Call driver |

All this seems fine at first glance, but the problem is the |stp w24, w22, [sp]| instruction. This pushes two /32-bit/ values to the stack, but scope and nullable are 16-bit values!

The stack looks like this:

|(lldb) memory read -c 8 0x000000016fdfe770 0x16fdfe770: 00 00 00 00 01 00 00 00 ........ |

When we get to the driver code, the assembly looks like this

|0x10051375c <+8>: add x29, sp, #0x10 ; =0x10 0x100513760 <+12>: sub sp, sp, #0x270 ; =0x270 0x100513764 <+16>: ldrh w8, [x29, #0x10] // load scope from saved sp 0x100513768 <+20>: ldrh w9, [x29, #0x12] // load nullable from saved sp+2 (lldb) re read w8 w9 w8 = 0x00000000 // SQL_SCOPE_CURROW w9 = 0x00000000 // SQL_NO_NULLS, should be SQL_NULLABLE |

The C code for this is here: https://github.com/lurcher/unixODBC/blob/master/DriverManager/SQLSpecialColumns.c#L436-L446

     ret = SQLSPECIALCOLUMNS( statement -> connection ,
             statement -> driver_stmt,
             identifier_type,
             catalog_name,
             name_length1,
             schema_name,
             name_length2,
             table_name,
             name_length3,
             scope,
             nullable );

It calls through this macro, which finds the correct function pointer from the function pointer array and then calls it: https://github.com/lurcher/unixODBC/blob/master/DriverManager/drivermanager.h#L1453-L1455

The problem here is that all the functions are all generic function pointers: https://github.com/lurcher/unixODBC/blob/master/DriverManager/drivermanager.h#L151 and that means that C's integer promotion rules will kick in and promote the shorts to ints.

If the function is cast explicitly to the actual function prototype for SQLSpecialColumns, ie. |SQLRETURN ()(SQLHSTMT, SQLUSMALLINT, SQLCHAR, SQLSMALLINT, SQLCHAR, SQLSMALLINT, SQLCHAR, SQLSMALLINT, SQLUSMALLINT, SQLUSMALLINT)| before calling, I suspect that would fix it.

Well, my gut feeling is to first comment that the compiler seems broken to me and is assuming too much, but then someone will find a reference in C(somedate or other) that says that its allowed to do it.

However pragmatically, the place to add the fix would be drivermanager.h, though I guess for consistency, it needs the cast adding for all the functions.

I will put this one in the "why prototypes are actually a bad thing" box.

--

Nick

v-chojas commented 3 years ago

I'm not really familiar with ARM64 Asm either, but on other platforms like x86 and MIPS all parameters are normally widened to be at least the native bitness (32 for 32-bit, 64 for 64-bit) so that behaviour of passing 16-bit arguments without widening looks abnormal.

Looking at the ARM64 ABI doc at https://c9x.me/compile/bib/abi-arm64.pdf I see on page 19 "C.14 If the size of the argument is less than 8 bytes then the size of the argument is set to 8 bytes." along with other references to pushing at least 8 bytes at a time.

...which leads me to believe the bug is in the compiler too.

kadler commented 3 years ago

Seems that Apple's ABI diverges from the ARM64 ABI: https://developer.apple.com/documentation/xcode/writing_arm64_code_for_apple_platforms#//apple_ref/doc/uid/TP40013702-SW1

Pass Arguments to Functions Correctly

The stack pointer on Apple platforms follows the ARM64 standard ABI and requires 16-byte alignment. When passing arguments to functions, Apple platforms diverge from the ARM64 standard ABI in the following ways:

  • Function arguments may consume slots on the stack that are not multiples of 8 bytes. If the total number of bytes for stack-based arguments is not a multiple of 8 bytes, insert padding on the stack to maintain the 8-byte alignment requirements.
polluks commented 3 years ago

@kadler I assume you are using Apple clang version 12.0.5 (clang-1205.0.21.3)?

kadler commented 3 years ago

I currently have no updates available and this is the version I have:

Apple clang version 12.0.0 (clang-1200.0.32.29)
Target: arm64-apple-darwin20.3.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

I'm not sure how you have 12.0.5.

kadler commented 3 years ago

FYI, I do plan to make a PR for this once I get a chance to build and test it locally. Tied up with other unrelated issues at the moment.

polluks commented 3 years ago

I'm not sure how you have 12.0.5.

12.5 beta 3

kadler commented 3 years ago

Created PR #69 to fix the issue. Turned out relatively big with all the warnings, but I did find some legit issues passing pointers of integers of the wrong size. That's why prototypes are actually good :wink:

lurcher commented 3 years ago

On 16/04/2021 21:05, Kevin Adler wrote:

Created PR #69 https://github.com/lurcher/unixODBC/pull/69 to fix the issue. Turned out relatively big with all the warnings, but I did find some legit issues passing pointers of integers of the wrong size. That's why prototypes are actually good 😉

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/lurcher/unixODBC/issues/60#issuecomment-821532555, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABYK62JQ63Z4F2XFTEDARVLTJCKCDANCNFSM4YHW67OA.

Thanks for that.

Only problem I can see is the platforms that don't have intptr_t

kadler commented 3 years ago

Only problem I can see is the platforms that don't have intptr_t

I thought about that, but it looked like it was already a hard requirement: https://github.com/lurcher/unixODBC/blob/master/DriverManager/drivermanager.h#L238

v-chojas commented 3 years ago

I looked through all the functions in the ODBC API, and it turns out that SQLSpecialColumns is the only one that gets broken by Apple's special ABI! The conditions to trigger it are that the function must have more than 9 parameters, and two consecutive parameters past the 8th are smaller than pointer-width. For example, SQLForeignKeys has 13 parameters but fortunately it interleaves SQLSMALLINT and SQLCHAR*'s for the last 5, and the last 3 parameters of SQLGetDescRec's 11 are all pointers.