influxdata / flightsql-dbapi

DB API 2 interface for Flight SQL with SQLAlchemy extras.
Apache License 2.0

Slow queries returning "RST_STREAM with error code 2" #19

Open aidanalphafund opened 1 year ago

aidanalphafund commented 1 year ago

When running a slow query (30+ seconds), I'm getting the following error:

pyarrow._flight.FlightInternalError: Flight returned internal error, with message: Received RST_STREAM with error code 2. gRPC client debug context: UNKNOWN:Error received from peer ipv4:54.174.236.48:443 {created_time:"2023-05-26T17:31:53.650218-06:00", grpc_status:13, grpc_message:"Received RST_STREAM with error code 2"}. Client context: OK
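
For anyone reading the message above, the two numeric codes can be decoded separately: RST_STREAM error code 2 is the HTTP/2 code INTERNAL_ERROR, and grpc_status 13 is the gRPC status INTERNAL. The helper below is only a rough sketch for pulling those codes out of the error text; `decode_flight_error` is a hypothetical name, not part of pyarrow or flightsql-dbapi, and the lookup tables are deliberately partial.

```python
import re

# Subset of HTTP/2 error codes (RFC 7540, section 7).
HTTP2_ERROR_CODES = {
    0: "NO_ERROR",
    1: "PROTOCOL_ERROR",
    2: "INTERNAL_ERROR",
    7: "REFUSED_STREAM",
    8: "CANCEL",
    11: "ENHANCE_YOUR_CALM",
}

# Subset of gRPC status codes.
GRPC_STATUS_CODES = {
    4: "DEADLINE_EXCEEDED",
    8: "RESOURCE_EXHAUSTED",
    13: "INTERNAL",
    14: "UNAVAILABLE",
}


def _lookup(match, table):
    """Map a regex match holding a numeric code to its name, if any."""
    if match is None:
        return None
    code = int(match.group(1))
    return table.get(code, f"UNKNOWN({code})")


def decode_flight_error(message):
    """Hypothetical helper: name the HTTP/2 and gRPC codes found in an error string."""
    rst = re.search(r"RST_STREAM with error code (\d+)", message)
    status = re.search(r"grpc_status:(\d+)", message)
    return {
        "http2_error": _lookup(rst, HTTP2_ERROR_CODES),
        "grpc_status": _lookup(status, GRPC_STATUS_CODES),
    }
```

Both codes are generic "internal" failures, so by themselves they do not say whether a timeout, a proxy, or the server reset the stream.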

I'm storing stock market information in my database, and running this query:

        SELECT
            DATE_BIN(INTERVAL '5 minutes', time) AS date,
            symbol,
            selector_first(open, time)['value'] AS open,
            selector_last(close, time)['value'] AS close,
            selector_max(high, time)['value'] AS high,
            selector_min(low, time)['value'] AS low,
            SUM(volume) AS volume
        FROM minute_bar
        GROUP BY symbol, date
        ORDER BY symbol, date

When run in the InfluxDB Data Explorer, this query takes around 2 minutes. I consistently get the same error when I run it through the flightsql-dbapi Python library.

aidanalphafund commented 1 year ago

I just realized this error might be related to the size of the returned dataset, not the time it takes to stream the response body.

aidanalphafund commented 1 year ago

Just following up: I have verified that the error occurs even on slow queries that return only a single row, so it doesn't appear to be related to the number of rows returned.
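
In case it helps anyone reproduce this, here is a minimal sketch of how the "slow versus large" comparison could be recorded. It assumes `run_query` is a hypothetical zero-argument callable that wraps whatever client call you use and returns the rows as a sized collection; nothing here is specific to flightsql-dbapi.

```python
import time


def timed_run(run_query):
    """Run a query callable and report elapsed seconds plus row count or error type.

    `run_query` is a hypothetical zero-argument callable supplied by the caller.
    """
    start = time.monotonic()
    try:
        rows = run_query()
        outcome = ("ok", len(rows))
    except Exception as exc:  # record the failure instead of propagating it
        outcome = ("error", type(exc).__name__)
    elapsed = time.monotonic() - start
    return elapsed, outcome
```

Running this against a fast query with many rows and a slow query with one row gives two (elapsed, outcome) pairs, which is enough to see whether the failures follow elapsed time or result size.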