Open zemahran opened 1 year ago
Could you provide a bit more context, like logs or example code to reproduce? The type Compact<u32> is so common that this is probably the result of another problem.
Here's the error when trying to retrieve a block from Polkadot's relay chain mainnet:
Any ideas on how to get past it, or on where to find an implementation of this decoder class elsewhere? Thanks for trying to help @arjanz 🙏
This is odd… Retrieving a block from Polkadot is such a basic task; this shouldn't go wrong under normal circumstances. Can you tell me a bit more about your setup (OS, Python version, etc.)?
Are you using threads or some other kind of async configuration? If so, this sounds like the issue described in #246.
I wasn't able to reproduce that scenario yet, but I'll run some more tests.
I'm also seeing the same error, along with this one:
'NoneType' object has no attribute 'portable_registry'
My use case is querying blocks from a Substrate-based chain in parallel using threads. I use a concurrent.futures.ThreadPoolExecutor
to spin up 25 threads. Each thread creates its own instance of SubstrateInterface and loops through a range of block numbers, calling get_block
for each one. Only a few threads show errors in each run; the others work normally.
I can try slimming my code down to a minimal example if that would be helpful.
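In the meantime, here is a rough sketch of the parallel-fetch pattern I'm describing. The `Client`, `fetch_block`, and `fetch_range` names are hypothetical stand-ins for `SubstrateInterface` and `get_block` so the pattern runs without a network connection; the real code connects to a node instead:

```python
import concurrent.futures
import threading

class Client:
    """Hypothetical stub standing in for SubstrateInterface."""
    def __init__(self):
        # Record which thread constructed this client instance.
        self.owner = threading.get_ident()

    def fetch_block(self, block_number):
        # Stand-in for substrate.get_block(block_number=...)
        return {"number": block_number, "fetched_by": self.owner}

def fetch_range(start, stop):
    # Each worker creates its own client, as in my real code.
    client = Client()
    return [client.fetch_block(n) for n in range(start, stop)]

# 25 workers, each handed a contiguous chunk of block numbers.
with concurrent.futures.ThreadPoolExecutor(max_workers=25) as pool:
    chunks = [(n, n + 100) for n in range(0, 1000, 100)]
    futures = [pool.submit(fetch_range, a, b) for a, b in chunks]
    blocks = [blk for f in futures for blk in f.result()]

print(len(blocks))  # 1000
```

In the failing runs, it is the real equivalent of `fetch_block` that raises inside a few of the worker threads.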
What I noticed is that this only occurs in a fresh environment. Previously I did not have the issue running the same code using older package versions. I tested on both Python 3.10 and 3.11.
Here's the working package list:
base58==2.1.1
certifi==2022.9.24
cffi==1.15.1
charset-normalizer==2.1.1
cytoolz==0.12.0
ecdsa==0.18.0
eth-hash==0.3.3
eth-keys==0.4.0
eth-typing==3.2.0
eth-utils==2.0.0
idna==3.4
more-itertools==9.0.0
py-bip39-bindings==0.1.10
py-ed25519-zebra-bindings==1.0.1
py-sr25519-bindings==0.1.5
pycparser==2.21
pycryptodome==3.15.0
PyNaCl==1.5.0
requests==2.28.1
scalecodec==1.0.45
six==1.16.0
substrate-interface==1.3.2
toolz==0.12.0
urllib3==1.26.12
websocket-client==1.4.1
xxhash==3.1.0
Versus the fresh install that produces the error:
ansible==2.9.12
argcomplete==3.0.8
base58==2.1.1
certifi==2023.5.7
cffi==1.15.1
charset-normalizer==3.2.0
cytoolz==0.12.1
dotty-dict==1.3.1
ecdsa==0.18.0
eth-hash==0.5.2
eth-keys==0.4.0
eth-typing==3.4.0
eth-utils==2.2.0
halo==0.0.31
hid==1.0.5
hjson==3.1.0
idna==3.4
Jinja2==3.0.3
log-symbols==0.0.14
milc==1.6.6
more-itertools==9.1.0
multidict==6.0.4
netaddr==0.8.0
passlib==1.7.2
py-bip39-bindings==0.1.11
py-ed25519-zebra-bindings==1.0.1
py-sr25519-bindings==0.2.0
pycparser==2.21
pycryptodome==3.18.0
PyNaCl==1.5.0
qmk==1.1.2
requests==2.31.0
scalecodec==1.2.6
six==1.16.0
spinners==0.0.24
substrate-interface==1.7.3
toolz==0.12.0
urllib3==2.0.4
websocket-client==1.6.1
xxhash==3.2.0
yarl==1.8.2
The versions in my working set are more recent than those reported in #246, which suggests there is no common root cause introduced by the updates between my two package sets: the issue reported there was already occurring in versions earlier than the ones that worked for me.
We are getting the same error, only while multi-threading.
It seems that downgrading substrate-interface to 1.3.2 and scalecodec to 1.0.45 didn't solve the problem.
I'm still seeing this issue with scalecodec==1.2.8 and substrate-interface==1.7.7. Seems my hunch about a certain version combination working was just due to confusion or luck.
Here is a minimal script that reproduces the issue on the majority of runs. This is on Python 3.11.8 on Linux.
import threading, queue, time
from substrateinterface import SubstrateInterface

WORKERS = 25
BLOCKS = 10_000

def worker(block_queue):
    # Each worker gets its own connection, as in my real code.
    substrate = SubstrateInterface(url="wss://rpc.polkadot.io")
    while True:
        block_number = block_queue.get()
        substrate.get_block(block_number=block_number)
        # Do something fun with the block
        block_queue.task_done()

substrate = SubstrateInterface(url="wss://rpc.polkadot.io")
head_number = substrate.get_block_header()['header']['number']

block_queue = queue.Queue()
for i in range(head_number - BLOCKS, head_number + 1):
    block_queue.put(i)

threads = []
for i in range(WORKERS):
    # daemon=True so the script can exit once the queue is drained
    thread = threading.Thread(target=worker, args=[block_queue], daemon=True)
    thread.start()
    threads.append(thread)

while block_queue.qsize() > 10:
    time.sleep(30)
    print(block_queue.qsize(), 'blocks remaining,', len([t for t in threads if t.is_alive()]), 'threads alive')

block_queue.join()
In my experience, it's often only a single thread that hits the error, and always toward the beginning of the run. My workaround is to just let the failed thread(s) die and then spawn new ones to take their place. The replacement threads seem to never fail. So the failure condition appears to be something about launching many threads simultaneously in a fresh environment.
Hope this can help you to reproduce and get to the bottom of this one, @arjanz. Thanks!
Are there any updates regarding this error? Is there a way to overcome or work around it for the time being?