ethereum / web3.py

A python interface for interacting with the Ethereum blockchain and ecosystem.
http://web3py.readthedocs.io
MIT License

Segmentation Fault when multi-threading with different RPCs. #2599

Closed ArshanKhanifar closed 2 years ago

ArshanKhanifar commented 2 years ago

I can see multiple similar issues (#1847, #2409), but I guess the fixes haven't been merged yet. The issue persists even with the non-async Web3 module and no middlewares. Here's a reproducible example where the segfault happens every single time.

aiohttp==3.8.1
aiosignal==1.2.0
aniso8601==9.0.1
async-timeout==4.0.2
attrs==22.1.0
base58==2.1.1
bitarray==1.2.2
cachetools==5.2.0
certifi @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_884c889c-96af-444f-bd6d-daddb5e9a462ykj3l5n_/croots/recipe/certifi_1655968814730/work/certifi
charset-normalizer==2.1.0
click==8.1.3
cytoolz==0.12.0
db-dtypes==1.0.3
eth-abi==2.2.0
eth-account==0.5.9
eth-hash==0.3.3
eth-keyfile==0.5.1
eth-keys==0.3.4
eth-rlp==0.2.1
eth-typing==2.3.0
eth-utils==1.10.0
Flask==2.1.3
flask-restx==0.5.1
frozenlist==1.3.1
google-api-core==2.8.2
google-auth==2.10.0
google-auth-oauthlib==0.5.2
google-cloud-bigquery==3.3.0
google-cloud-bigquery-storage==2.14.1
google-cloud-core==2.3.2
google-crc32c==1.3.0
google-resumable-media==2.3.3
googleapis-common-protos==1.56.4
grpcio==1.47.0
grpcio-status==1.47.0
hexbytes==0.2.2
idna==3.3
importlib-metadata @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_croot-5pqd2z6f/importlib-metadata_1648710902288/work
ipfshttpclient==0.8.0a2
itsdangerous==2.1.2
Jinja2==3.1.2
jsonschema==4.9.1
keyring @ file:///Users/builder/miniconda3/envs/prefect/conda-bld/keyring_1638777422651/work
lru-dict==1.1.8
MarkupSafe==2.1.1
multiaddr==0.0.9
multidict==6.0.2
netaddr==0.8.0
numpy==1.23.1
oauthlib==3.2.0
packaging==21.3
pandas==1.4.3
pandas-gbq==0.17.7
parsimonious==0.8.1
proto-plus==1.20.6
protobuf==3.20.1
pyarrow==8.0.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycryptodome==3.15.0
pydata-google-auth==1.4.0
pyparsing==3.0.9
pyrsistent==0.18.1
python-dateutil==2.8.2
pytz==2022.1
requests==2.28.1
requests-oauthlib==1.3.1
rlp==2.0.1
rsa==4.9
six==1.16.0
toolz==0.12.0
urllib3==1.26.11
varint==1.0.2
web3==5.30.0
websockets==9.1
Werkzeug==2.1.2
yarl==1.8.1
zipp @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_66c2c5f2-5dd5-4946-a16a-72af650ebd6cnmz4ou0f/croots/recipe/zipp_1652343960956/work

### What was wrong?

Segfault when multi-threading with multiple RPCs:

from concurrent.futures import ThreadPoolExecutor

from web3 import Web3


def get_block_number(rpc: str):
    w3 = Web3(Web3.HTTPProvider(rpc))
    block_number = w3.eth.get_block_number()
    return block_number


def get_bn_chains():
    # One worker thread per RPC endpoint.
    with ThreadPoolExecutor(max_workers=len(rpcs)) as exc:
        tasks = [exc.submit(get_block_number, rpc) for rpc in rpcs]
        results = [r.result() for r in tasks]
        print(f"results: {results}")
        return results


if __name__ == '__main__':
    rpcs = [
        "https://api.mycryptoapi.com/eth",
        "https://api.avax.network/ext/bc/C/rpc",
        "https://rpc.ftm.tools",
        "https://polygon-rpc.com",
        "https://arb1.arbitrum.io/rpc",
        "https://mainnet.optimism.io",
        "https://bsc-dataseed1.binance.org",
        "https://mainnet.aurora.dev",
        "https://eth.bd.evmos.org:8545",
        "https://http-mainnet.hecochain.com",
        "https://evm.cronos.org",
        "https://mainnet.boba.network",
        "https://rpc.gnosischain.com",
        "https://rpc.gnosischain.com",
        "https://evm.kava.io",
        "https://rpc.callisto.network"
    ]

    print("first round")
    get_bn_chains()
    print("second round")
    get_bn_chains()

* The full output of the error

/Users/arshankhanifar/miniconda3/envs/bt_defi_tx/bin/python /Users/arshankhanifar/tpc_aux_services/etl_services/bt_defi_tx/samples/find_segfault.py
first round
results: [15303145, 18385655, 44492119, 31667322, 19742334, 18360836, 20268554, 71583519, 2655226, 17696348, 4068760, 764415, 23595692, 23595687, 1025761, 10469308]
second round

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)


* What type of node you were connecting to.
HTTP RPC nodes. I found all of them on [chainlist](https://chainlist.org/).

### How can it be fixed?

Monkey-patching as shown [here](https://github.com/ethereum/web3.py/issues/1847#issuecomment-1086109749) fixes it. 

Here's my new code:

import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

import web3._utils.request
from web3 import Web3

# One global lock shared by both patched functions.
lock = threading.Lock()


def locked(fn: Callable) -> Callable:
    # Wrap fn so that only one thread at a time can run it.
    def inner(*args, **kwargs):
        with lock:
            return fn(*args, **kwargs)

    return inner


# Monkey-patch web3's session helpers so calls to them are serialized.
web3._utils.request.cache_session = locked(web3._utils.request.cache_session)
web3._utils.request._get_session = locked(web3._utils.request._get_session)


def get_block_number(rpc: str):
    w3 = Web3(Web3.HTTPProvider(rpc))
    block_number = w3.eth.get_block_number()
    return block_number


def get_bn_chains():
    # One worker thread per RPC endpoint.
    with ThreadPoolExecutor(max_workers=len(rpcs)) as exc:
        tasks = [exc.submit(get_block_number, rpc) for rpc in rpcs]
        results = [r.result() for r in tasks]
        print(f"results: {results}")
        return results


if __name__ == '__main__':
    rpcs = [
        "https://api.mycryptoapi.com/eth",
        "https://api.avax.network/ext/bc/C/rpc",
        "https://rpc.ftm.tools",
        "https://polygon-rpc.com",
        "https://arb1.arbitrum.io/rpc",
        "https://mainnet.optimism.io",
        "https://bsc-dataseed1.binance.org",
        "https://mainnet.aurora.dev",
        "https://eth.bd.evmos.org:8545",
        "https://http-mainnet.hecochain.com",
        "https://evm.cronos.org",
        "https://mainnet.boba.network",
        "https://rpc.gnosischain.com",
        "https://rpc.gnosischain.com",
        "https://evm.kava.io",
        "https://rpc.callisto.network"
    ]

    print("first round")
    get_bn_chains()
    print("second round")
    get_bn_chains()

And here's the output for it: 

/Users/arshankhanifar/miniconda3/envs/bt_defi_tx/bin/python /Users/arshankhanifar/tpc_aux_services/etl_services/bt_defi_tx/samples/find_segfault.py
first round
results: [15303155, 18385721, 44492225, 31667386, 19742473, 18360979, 20268599, 71583620, 2655299, 17696393, 4068829, 764419, 23595719, 23595719, 1025782, 10469321]
second round
results: [15303155, 18385721, 44492225, 31667386, 19742473, 18360980, 20268599, 71583621, 2655300, 17696394, 4068829, 764419, 23595719, 23595719, 1025783, 10469321]

Process finished with exit code 0

fselmo commented 2 years ago

Hey @ArshanKhanifar. It looks like the PR at #2409 addresses this as well. Linking here to track this there as well. I'm in the middle of writing tests for sync and tweaking some things on the sync side but that PR should be good to go soon. Please feel free to join the conversation there.

ArshanKhanifar commented 2 years ago

sorry which PR, I think you accidentally linked this issue haha.

fselmo commented 2 years ago

hah, sorry... I fixed it

fselmo commented 2 years ago

For clarification, the sync code will probably change to resemble the async code a bit more except it should be more straightforward / no need for the asynccontextmanager
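
To make the idea of a sync-side fix concrete, here is a rough sketch of a session cache that takes a plain `threading.Lock` around every read and write, with no `asynccontextmanager` involved. This is only an illustration under assumptions and is not the code from #2409: the class name `LockedSessionCache`, the `factory` parameter, and the `max_size` default are all hypothetical. In real use the factory would presumably be something like `requests.Session`.

```python
import threading
from collections import OrderedDict
from typing import Callable, Generic, TypeVar

S = TypeVar("S")


class LockedSessionCache(Generic[S]):
    """Hypothetical LRU-style cache keyed by endpoint URI, guarded by a lock."""

    def __init__(self, factory: Callable[[], S], max_size: int = 20) -> None:
        self._factory = factory      # builds a new session on a cache miss
        self._max_size = max_size    # assumed limit, chosen for illustration
        self._lock = threading.Lock()
        self._sessions: "OrderedDict[str, S]" = OrderedDict()

    def get(self, endpoint_uri: str) -> S:
        """Return the cached session for endpoint_uri, creating it if needed."""
        with self._lock:
            if endpoint_uri in self._sessions:
                # Mark as most recently used.
                self._sessions.move_to_end(endpoint_uri)
                return self._sessions[endpoint_uri]
            session = self._factory()
            self._sessions[endpoint_uri] = session
            if len(self._sessions) > self._max_size:
                # Evict the least recently used entry. A real implementation
                # would likely also close the evicted session.
                self._sessions.popitem(last=False)
            return session

    def __len__(self) -> int:
        with self._lock:
            return len(self._sessions)


if __name__ == "__main__":
    cache = LockedSessionCache(object, max_size=2)
    first = cache.get("https://rpc.gnosischain.com")
    print(cache.get("https://rpc.gnosischain.com") is first)
```

Keeping the lookup, insert, and eviction inside one `with self._lock:` block is the point of the sketch: no thread can observe or modify the cache halfway through another thread's update.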

ArshanKhanifar commented 2 years ago

Understood, thanks.

fselmo commented 2 years ago

Resolved in #2409 - still needs to be ported to v5 but was merged into web3.py v6 beta and should be out in that next release.