graphsense / graphsense-blocksci

A dockerized component to synchronize BlockSci data to Apache Cassandra
MIT License

Crash in ingest script #12

Closed dshatz closed 3 years ago

dshatz commented 3 years ago

The crash below happens almost every time on our machine when running the ltc-ingest script through the graphsense-setup project. The exact point of failure differs on each run. After a brief look at the code, it seems that counter, a field of the QueryManager class, contains a lock, which the multiprocessing library attempts to pickle in order to send it between processes. I will continue investigating.

...
ingest-ltc_1                 | #tx 37,940,000
ingest-ltc_1                 | #tx 37,950,000
ingest-ltc_1                 | #tx 37,960,000
ingest-ltc_1                 | Traceback (most recent call last):
ingest-ltc_1                 |   File "/usr/local/bin/blocksci_export.py", line 531, in <module>
ingest-ltc_1                 |     main()
ingest-ltc_1                 |   File "/usr/local/bin/blocksci_export.py", line 497, in main
ingest-ltc_1                 |     qm.execute(TxQueryManager.insert, tx_index_range)
ingest-ltc_1                 |   File "/usr/local/bin/blocksci_export.py", line 40, in wrap
ingest-ltc_1                 |     result = f(*args, **kw)
ingest-ltc_1                 |   File "/usr/local/bin/blocksci_export.py", line 94, in execute
ingest-ltc_1                 |     self.pool.map(fun, chunk(params, self.num_chunks))
ingest-ltc_1                 |   File "/usr/lib/python3.8/multiprocessing/pool.py", line 364, in map
ingest-ltc_1                 |     return self._map_async(func, iterable, mapstar, chunksize).get()
ingest-ltc_1                 |   File "/usr/lib/python3.8/multiprocessing/pool.py", line 771, in get
ingest-ltc_1                 |     raise self._value
ingest-ltc_1                 | multiprocessing.pool.MaybeEncodingError: Error sending result: '<multiprocessing.pool.ExceptionWithTraceback object at 0x7f538426d790>'. Reason: 'TypeError("cannot pickle '_thread.RLock' object")'

Has this worked fine before?
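To illustrate the suspected failure mode, here is a minimal sketch using only the standard library. `LockedCounter` and `pickle_error` are hypothetical names made up for this example; they are not taken from blocksci_export.py. The sketch only shows that an object holding a `threading.RLock` cannot be pickled, which is what multiprocessing has to do to move objects between processes.

```python
import pickle
import threading


class LockedCounter:
    """Hypothetical counter guarded by a re-entrant lock (not the real QueryManager)."""

    def __init__(self):
        self.value = 0
        self.lock = threading.RLock()

    def increment(self):
        with self.lock:
            self.value += 1


def pickle_error(obj):
    """Return the exception raised while pickling obj, or None if pickling works."""
    try:
        pickle.dumps(obj)
    except Exception as exc:
        return exc
    return None


if __name__ == "__main__":
    # The lock attribute makes the whole object unpicklable.
    print(repr(pickle_error(LockedCounter())))
    # A plain dict with the same data pickles without problems.
    print(repr(pickle_error({"value": 0})))
```

Anything that a worker returns or raises goes through the same pickling step, so a lock hidden inside a returned object or inside an exception would trigger the same kind of error.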

defconst commented 3 years ago

I created a new LTC keyspace and tested the docker-compose ingest; it works fine (PROCESSES=26 on a machine with 28 physical cores).

behas commented 3 years ago

Not reproducible, therefore closing for now.

dshatz commented 3 years ago

I have confirmed that the problem described in https://stackoverflow.com/a/56753695/4730000 is the cause of the error. cassandra-driver 3.18.0 and later return a different result set than before, and the new ResultSet is apparently not picklable because it holds a lock somewhere inside. On our machines the ingest script crashed every time. After downgrading to cassandra-driver 3.17.1, it succeeded every time.

I will see what can be done to fix this without needing to downgrade.
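One possible direction, sketched here under assumptions and not tested against the real script: make sure the function passed to `pool.map` only returns or raises plain picklable values, so the driver's result object never crosses the process boundary. `FakeResultSet`, `insert_chunk` and `insert_chunk_safe` are hypothetical stand-ins made up for this example; the real ResultSet and the real insert code are not reproduced here. The traceback reports the failing result as an `ExceptionWithTraceback` object, which suggests that a worker exception was being sent back, so the sketch also replaces exceptions with a plain `RuntimeError` carrying only a string.

```python
import pickle
import threading


class FakeResultSet:
    """Hypothetical stand-in for a driver result object that holds a lock."""

    def __init__(self, rows):
        self.rows = rows
        self._lock = threading.RLock()  # this attribute makes the object unpicklable


def insert_chunk(chunk):
    """Hypothetical worker: pretends to insert rows and returns the result object."""
    return FakeResultSet(list(chunk))


def insert_chunk_safe(chunk):
    """Worker variant that only lets picklable values leave the process."""
    try:
        result = insert_chunk(chunk)
    except Exception as exc:
        # Re-raise with a plain string, because the original exception
        # might reference unpicklable objects.
        raise RuntimeError(f"{type(exc).__name__}: {exc}") from None
    # Return a plain summary (row count) instead of the result object.
    return len(result.rows)


if __name__ == "__main__":
    # pickle.dumps mimics what the pool does with each worker result.
    print(len(pickle.dumps(insert_chunk_safe(range(3)))) > 0)
    try:
        pickle.dumps(insert_chunk(range(3)))
    except TypeError as exc:
        print("unsafe worker result cannot be pickled:", exc)
```

Both workers are defined at module level on purpose: `pool.map` also pickles the function it is given, and nested functions or lambdas would fail at that step.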

yaz1 commented 2 years ago

I am experiencing the same error with cassandra-driver > 3.17.1. Have you found a solution without needing to downgrade?