sfu-db / connector-x

Fastest library to load data from DB to DataFrames in Rust and Python
https://sfu-db.github.io/connector-x
MIT License

PanicException: called `Result::unwrap()` on an `Err` value: Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" } #421

Open bitnlp opened 1 year ago

bitnlp commented 1 year ago

What language are you using?

Python

What version are you using?

connectorx 0.3.1

What database are you using?

sqlite3

What dataframe are you using?

Pandas

Can you describe your bug?

The exception occurs when running a SELECT in a loop, after approximately 1500 iterations. If time.sleep(0.01) is added right after the query, the problem goes away.

What are the steps to reproduce the behavior?

Database setup if the error only happens on specific data or data type

Table schema and example data

Example query / code

result_df = cx.read_sql("sqlite://" + xdb_path, "SELECT * FROM '" + table_name + "'")
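For reference, the time.sleep(0.01) workaround mentioned in the description could be sketched roughly as below. The helper name read_tables_with_pause, its parameters, the example connection string, and the injected read_fn are all hypothetical and not part of connectorx; in real use read_fn would presumably be cx.read_sql. The pause only hides the symptom and does not address the underlying cause.

```python
import time


def read_tables_with_pause(read_fn, conn, table_names, pause=0.01):
    """Read each table in turn, sleeping briefly after every query.

    read_fn is injected so this sketch runs without a database;
    in real use it would be connectorx.read_sql (assumption).
    """
    results = {}
    for table_name in table_names:
        query = "SELECT * FROM '" + table_name + "'"
        results[table_name] = read_fn(conn, query)
        # Workaround from the report: a short pause right after the query.
        time.sleep(pause)
    return results


# Stand-in reader so the example is self-contained (hypothetical).
def fake_read(conn, query):
    return (conn, query)


frames = read_tables_with_pause(
    fake_read, "sqlite:///tmp/example.db", ["a", "b"], pause=0.0
)
print(list(frames))
```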

What is the error?

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" }', /Users/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/scheduled-thread-pool-0.2.6/src/lib.rs:320:44
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.9/threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/mac/tensorflow_macos_venv/monitor_cutoff.py", line 663, in get_dataframes
    ResultDFArray[mode_index], FactorArray[mode_index], MinDaysArray[mode_index] = get_combined_data(mode_in_list)
  File "/Users/mac/tensorflow_macos_venv/monitor_cutoff.py", line 382, in get_combined_data
    result_df = cx.read_sql("sqlite://" + xdb_path, "SELECT * FROM '" + table_name + "'")
  File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.9/site-packages/connectorx/__init__.py", line 224, in read_sql
    result = _read_sql(
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" }

UESTCRoy commented 1 year ago

Hi, I ran into the same problem. How did you solve it?

magnusuMET commented 1 year ago

We are experiencing the same problem: reading from SQLite in a loop leads to

pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }

where errno 11 is EAGAIN ("Resource temporarily unavailable").

The error seems to occur after a certain number of iterations, although that number differs between machines: one crashes after about 3000 iterations, another after only 60.

We have a minimal reproducer if someone wants to dig into this.
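The two codes reported in this thread (35 in the macOS traceback, 11 here) are the same error, EAGAIN, whose numeric value depends on the platform. A small sketch using only the Python standard library shows the value on the current machine and how Python classifies it:

```python
import errno
import os

# EAGAIN is platform-specific: 11 on Linux, 35 on macOS.
print(errno.EAGAIN, os.strerror(errno.EAGAIN))

# Python maps an OSError carrying EAGAIN to BlockingIOError,
# the counterpart of Rust's ErrorKind::WouldBlock.
err = OSError(errno.EAGAIN, os.strerror(errno.EAGAIN))
is_blocking = isinstance(err, BlockingIOError)
print(type(err).__name__, is_blocking)
```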

capellini commented 1 month ago

pyo3_runtime.PanicException inherits from BaseException, so catching only Exception won't work. You have to catch BaseException, check whether it is a PanicException, and handle that case. Unfortunately, pyo3 does not expose PanicException.
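To illustrate why an ordinary except Exception clause misses it, here is a minimal sketch that uses a locally defined stand-in class. The class below is hypothetical and only mimics the inheritance described above, so the example runs without connectorx installed:

```python
# Stand-in for pyo3_runtime.PanicException (hypothetical local class,
# defined here only so the example runs without connectorx).
class PanicException(BaseException):
    pass


def raise_panic():
    raise PanicException("Resource temporarily unavailable")


caught_by = None
try:
    try:
        raise_panic()
    except Exception:
        # Never reached: PanicException is not an Exception subclass.
        caught_by = "Exception"
except BaseException as exc:
    # Identify the exception by class name, then re-raise anything else.
    if type(exc).__name__ == "PanicException":
        caught_by = "BaseException"
    else:
        raise

print(caught_by)  # prints: BaseException
```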

Though not ideal, here is an example workaround:

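import time
import connectorx as cx

# Number of retries allowed after the first failed call.
MAX_ATTEMPTS = 3

def _exception_is_retryable(exc):
    # Match on the class name and message, since PanicException
    # is not exposed for import.
    return 'PanicException' in type(exc).__name__ and \
        'Resource temporarily unavailable' in str(exc)

def read_sql(connect_string, sql):
    attempts = 0
    delay = 0.5

    while True:
        try:
            return cx.read_sql(connect_string, sql)
        except BaseException as exc:
            attempts += 1

            # Re-raise everything else (including KeyboardInterrupt) unchanged.
            if not _exception_is_retryable(exc) or attempts > MAX_ATTEMPTS:
                raise

            # Exponential backoff: sleeps 1 s, 2 s, then 4 s.
            delay *= 2
            time.sleep(delay)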

You could do this for any PanicException that you'd like to retry.
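As a rough sketch of that generalization, the retry logic could take the retryable messages as a parameter. The function name retry_on_panic, its parameters, and the stand-in PanicException class used in the demo are all hypothetical; nothing here is part of connectorx or pyo3.

```python
import time


def retry_on_panic(func, retryable_messages, max_attempts=3,
                   base_delay=0.5, sleep=time.sleep):
    """Call func(), retrying when a PanicException with a listed message is raised."""
    attempt = 0
    delay = base_delay
    while True:
        attempt += 1
        try:
            return func()
        except BaseException as exc:
            is_panic = "PanicException" in type(exc).__name__
            is_listed = any(msg in str(exc) for msg in retryable_messages)
            # Give up on unrelated errors or once max_attempts calls were made.
            if not (is_panic and is_listed) or attempt >= max_attempts:
                raise
            sleep(delay)
            delay *= 2  # exponential backoff


# Demo with a stand-in exception class (hypothetical, for illustration only).
class PanicException(BaseException):
    pass


calls = {"count": 0}


def flaky():
    # Fails twice with the retryable message, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise PanicException("Resource temporarily unavailable")
    return "ok"


result = retry_on_panic(
    flaky, ["Resource temporarily unavailable"], sleep=lambda seconds: None
)
print(result, calls["count"])
```

In real use, func would presumably be something like lambda: cx.read_sql(connect_string, sql), with the default sleep left in place.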