Open chadfrommeyer opened 6 months ago
Hi @chadfrommeyer Thanks for submitting the request. We will investigate this internally and get back to you with feedbacks and questions.
Hi @vahid110 , On a similar note, with the previous driver (1.x) I was able to configure how the query results were handled on the client side in the 'Additional Configuration' tab (see below image). This helped prevent out of memory issues when querying significantly large result sets, as I could set the 'Cache Size' field to 100,000 (for example) to handle it piece-meal.
I noticed in the 2.x driver these options have been removed. Was there any reason for this, or they just simply haven't been implemented yet? I am hoping it's because the driver has been designed in such a way to prevent OOM errors automatically, but wanted to check first.
Apologies if this is covered in some documentation - I tried to find reference to it but couldn't. Feel free to link it to me if it is. Cheers
@imalobster Thanks for your comment, A similar option is available in 'Cursor' Tab under 'Streaming Cursor' section.
Hi @chadfrommeyer Apologies for delay in response. I tried to reproduce the issue, first using a simple synchronous python script. The memory usage was normal and the program ran to completion with no issue. I can run your same c# program too to see if async nature of commands is the cause of the crash. But it is worth noting that 500K rows with very small column number/size should not basically be the cause of crash. I remember running 800M rows select statements with no issues.
# connection_string = "DSN=testsystem2"
import pyodbc
import sys
import time
def do_odbc_test():
connection_string = "DSN=testsystem2"
connection = pyodbc.connect(connection_string)
cursor = connection.cursor()
cursor.execute("SELECT generate_series as myfield FROM GENERATE_SERIES(1, 500000, 1);")
count = 0
for row in cursor:
word = row[0]
print(f"({word}), ({count})")
count += 1
if (count % 5000) == 0:
print(f"({word}), ({count})")
cursor.close()
connection.close()
return 1
def main():
start_time = time.time()
result = do_odbc_test()
end_time = time.time()
print(f"Total time: {end_time - start_time:.2f} seconds")
return result
if __name__ == "__main__":
sys.exit(main())
Hi @chadfrommeyer again Here is your consolidated code and I can confirm it crashes.
using System.Collections.Generic;
using System.Data.Odbc;
using System.Diagnostics;
using System.Threading.Tasks;
using System.Data.Common;
using System;
using System.Threading;
namespace TestOdbcCommand
{
public class OdbcTest
{
public async static Task<int> DoOdbcTest()
{
string connectionString = "DSN=testsystem2";
using (OdbcConnection connection = new OdbcConnection(connectionString))
{
//connection.InfoMessage += InfoMessage_Event;
await connection.OpenAsync();
int count = 0;
using (OdbcCommand command = new OdbcCommand("SELECT generate_series as myfield FROM GENERATE_SERIES(1, 500000, 1);", connection))
{
using (DbDataReader reader = await command.ExecuteReaderAsync())
{
while (await reader.ReadAsync())
{
string word = reader.GetString(0);
Debug.WriteLine($"({word}), ({count++})");
if ((count % 50000) == 0)
{
Console.WriteLine($"({word}), ({count})");
// Thread.Sleep(10);
}
count++;
}
}
}
Console.WriteLine("Count for this run :" + count);
connection.Close();
}
return 1;
}
private static void InfoMessage_Event(object sender, OdbcInfoMessageEventArgs e)
{
Debug.WriteLine(e.Message);
}
}
}
namespace TestOdbcCommand
{
internal class Program
{
static void Main(string[] args)
{
Console.WriteLine("Hello World!");
Task<int> task = null;
int i = 0, runs = 30;
for (i = 0; i < runs; i++)
{
task = Task.Run(async () => await OdbcTest.DoOdbcTest());
task.Wait((30 + 1) * 1000 * 60);
}
if (runs == i)
Console.WriteLine("Done");
else
Console.WriteLine("Failed in run " + i);
}
}
}
Driver version
2.00.00.11
Redshift version
PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.61631
Client Operating System
Windows 10 x64
Table schema
Single integer field. No table needed
Problem description
When selecting a very large number of rows our application crashes with the following error.
[Redshift][ODBC Driver]An I/O error occurred while reading streaming cursor.
The attached code reproduces the problem. I don't have a thread.sleep there, so it will hang while running.
ODBC trace logs
Trace logs attached (zip file). Here is the tail.
TestOdbcCommand 1edc-5dec EXIT SQLFetch with return code -1 (SQL_ERROR) HSTMT 0x0000020FF8D3D1C0
TestOdbcCommand 1edc-5dec ENTER SQLGetDiagRecW SQLSMALLINT 3
SQLHANDLE 0x0000020FF8D3D1C0
SQLSMALLINT 1
SQLWCHAR 0x000000A2B19FEA50
SQLINTEGER 0x000000A2B19FEC78
SQLWCHAR 0x0000020FFB144BD0
SQLSMALLINT 1024
SQLSMALLINT 0x000000A2B19FEC80
TestOdbcCommand 1edc-5dec EXIT SQLGetDiagRecW with return code 0 (SQL_SUCCESS) SQLSMALLINT 3
SQLHANDLE 0x0000020FF8D3D1C0
SQLSMALLINT 1
SQLWCHAR 0x000000A2B19FEA50 [ 5] "24000"
SQLINTEGER 0x000000A2B19FEC78 (0)
SQLWCHAR 0x0000020FFB144BD0 [ 76] "[Redshift][ODBC Driver]An I/O error occurred while reading streaming cursor."
SQLSMALLINT 1024
SQLSMALLINT 0x000000A2B19FEC80 (76)
TestOdbcCommand 1edc-5dec ENTER SQLGetDiagRecW SQLSMALLINT 3
SQLHANDLE 0x0000020FF8D3D1C0
SQLSMALLINT 2
SQLWCHAR 0x000000A2B19FEA50
SQLINTEGER 0x000000A2B19FEC78
SQLWCHAR 0x0000020FFB144BD0
SQLSMALLINT 1024
SQLSMALLINT 0x000000A2B19FEC80
TestOdbcCommand 1edc-5dec EXIT SQLGetDiagRecW with return code 100 (SQL_NO_DATA_FOUND) SQLSMALLINT 3
SQLHANDLE 0x0000020FF8D3D1C0
SQLSMALLINT 2
SQLWCHAR 0x000000A2B19FEA50
SQLINTEGER 0x000000A2B19FEC78
SQLWCHAR 0x0000020FFB144BD0
SQLSMALLINT 1024
SQLSMALLINT 0x000000A2B19FEC80
TestOdbcCommand 1edc-5dec ENTER SQLGetInfoW HDBC 0x0000020FF8D3A820 UWORD 6
PTR 0x0000020F801EBEB0
SWORD 100
SWORD * 0x000000A2B19FEC48
TestOdbcCommand 1edc-5dec EXIT SQLGetInfoW with return code 0 (SQL_SUCCESS) HDBC 0x0000020FF8D3A820 UWORD 6
PTR 0x0000020F801EBEB0 [ 20] "rsodbc.dll"
SWORD 100
SWORD * 0x000000A2B19FEC48 (20)
TestOdbcCommand 1edc-5dec ENTER SQLGetConnectAttrW SQLHDBC 0x0000020FF8D3A820 SQLINTEGER 1209
SQLPOINTER [Unknown attribute 1209]
SQLINTEGER 4
SQLINTEGER * 0x000000A2B19FEC48
TestOdbcCommand 1edc-5dec EXIT SQLGetConnectAttrW with return code 0 (SQL_SUCCESS) SQLHDBC 0x0000020FF8D3A820 SQLINTEGER 1209
SQLPOINTER [Unknown attribute 1209]
SQLINTEGER 4
SQLINTEGER * 0x000000A2B19FEC48 (0)
TestOdbcCommand 1edc-5dec ENTER SQLMoreResults HSTMT 0x0000020FF8D3D1C0
Reproduction code
Reproduction code attached.
TraceLog.zip TestOdbcCommand.zip