blockchain-etl / ethereum-etl

Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ
https://t.me/BlockchainETL
MIT License

Exporting receipts and logs fails for transaction hashes that have no transaction receipt #362

Open ancil-t opened 2 years ago

ancil-t commented 2 years ago

Thank you for this excellent package btw! I was wondering whether anyone can help me with this issue

Description

The export_receipts_and_logs command fails when it encounters a tx hash that has no tx receipt or event logs.

Verification

I've verified this by running the command against two files:

  1. a file (tx_hash_with_logs.txt) with just 0x37c5c1f28fca1638aac3ec7c0c1b5c1ebf8df18cd979e949d38d1627b0316610. This tx does have a tx receipt and logs.
  2. a file (tx_hash_no_logs.txt) with just 0x5c504ed432cb51138bcf09aa5e8a410dd4a1e204ef84bfed1be16dfba1b22060. This tx hash contains no tx receipt and no logs.

1 works as expected, but 2 emits errors over several retries before the process stops.

I've also called eth.getTransactionReceipt from the geth console with both tx hashes. One returns null and the other returns the tx receipt with logs, so we know this isn't a case of geth not being synced.
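The same check can be scripted outside the geth console. This is a minimal sketch using only the Python standard library, assuming the node exposes JSON-RPC over HTTP; `receipt_request`, `has_receipt` and `post_json` are hypothetical helper names and are not part of ethereum-etl.

```python
import json
import urllib.request


def receipt_request(tx_hash, request_id=1):
    """Build the JSON-RPC payload for eth_getTransactionReceipt."""
    return {
        "jsonrpc": "2.0",
        "method": "eth_getTransactionReceipt",
        "params": [tx_hash],
        "id": request_id,
    }


def has_receipt(response):
    """Return True if the node returned a receipt, False if result is null."""
    return response.get("result") is not None


def post_json(url, payload):
    """POST a JSON payload to url and decode the JSON reply (hypothetical helper)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With an HTTP endpoint available (the URL is a placeholder), `has_receipt(post_json("http://localhost:8545", receipt_request(tx_hash)))` would report whether the node has a receipt for that hash. A node reachable only over IPC would need a different transport.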

Expected

I expect export_receipts_and_logs to ignore tx hashes that return null and move on to the next tx hash.
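To make the requested behaviour concrete, here is a rough sketch of skipping null results in a batch of JSON-RPC responses. It is written independently of ethereum-etl's code: `split_results` is a hypothetical function, and it assumes each response `id` is the index of the corresponding hash in the request batch.

```python
import logging

logger = logging.getLogger(__name__)


def split_results(tx_hashes, responses):
    """Separate receipts from hashes for which the node returned null.

    Assumes response["id"] indexes into tx_hashes (an assumption of this
    sketch, not a documented ethereum-etl contract).
    """
    receipts, missing = [], []
    for response in responses:
        tx_hash = tx_hashes[response["id"]]
        result = response.get("result")
        if result is None:
            # Log and move on instead of raising a retriable error.
            logger.warning("No receipt for %s, skipping", tx_hash)
            missing.append(tx_hash)
        else:
            receipts.append(result)
    return receipts, missing
```

Returning the skipped hashes alongside the receipts keeps the gap visible, so a missing receipt is not silently dropped from the export.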

Actual

What happens currently is it retries 4 times before the process stops.

2022-06-29 03:41:13,505 - root [ERROR] - An exception occurred while executing execute_with_retries. Retry #4
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/executors/batch_work_executor.py", line 63, in _fail_safe_execute
    work_handler(batch)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/jobs/export_receipts_job.py", line 69, in _export_receipts
    receipts = [self.receipt_mapper.json_dict_to_receipt(result) for result in results]
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/jobs/export_receipts_job.py", line 69, in <listcomp>
    receipts = [self.receipt_mapper.json_dict_to_receipt(result) for result in results]
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/utils.py", line 71, in rpc_response_batch_to_results
    yield rpc_response_to_result(response_item)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/utils.py", line 82, in rpc_response_to_result
    raise RetriableValueError(error_message)
ethereumetl.misc.retriable_value_error.RetriableValueError: result is None in response {'jsonrpc': '2.0', 'id': 0, 'result': None}. Make sure Ethereum node is synced.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/executors/batch_work_executor.py", line 104, in execute_with_retries
    return func(*args)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/jobs/export_receipts_job.py", line 69, in _export_receipts
    receipts = [self.receipt_mapper.json_dict_to_receipt(result) for result in results]
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/jobs/export_receipts_job.py", line 69, in <listcomp>
    receipts = [self.receipt_mapper.json_dict_to_receipt(result) for result in results]
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/utils.py", line 71, in rpc_response_batch_to_results
    yield rpc_response_to_result(response_item)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/utils.py", line 82, in rpc_response_to_result
    raise RetriableValueError(error_message)
ethereumetl.misc.retriable_value_error.RetriableValueError: result is None in response {'jsonrpc': '2.0', 'id': 0, 'result': None}. Make sure Ethereum node is synced.
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/executors/batch_work_executor.py", line 63, in _fail_safe_execute
    work_handler(batch)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/jobs/export_receipts_job.py", line 69, in _export_receipts
    receipts = [self.receipt_mapper.json_dict_to_receipt(result) for result in results]
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/jobs/export_receipts_job.py", line 69, in <listcomp>
    receipts = [self.receipt_mapper.json_dict_to_receipt(result) for result in results]
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/utils.py", line 71, in rpc_response_batch_to_results
    yield rpc_response_to_result(response_item)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/utils.py", line 82, in rpc_response_to_result
    raise RetriableValueError(error_message)
ethereumetl.misc.retriable_value_error.RetriableValueError: result is None in response {'jsonrpc': '2.0', 'id': 0, 'result': None}. Make sure Ethereum node is synced.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/__main__.py", line 26, in <module>
    cli()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ubuntu/.local/lib/python3.10/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/cli/export_receipts_and_logs.py", line 65, in export_receipts_and_logs
    job.run()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/blockchainetl/jobs/base_job.py", line 30, in run
    self._end()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/jobs/export_receipts_job.py", line 81, in _end
    self.batch_work_executor.shutdown()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/executors/batch_work_executor.py", line 97, in shutdown
    self.executor.shutdown()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/executors/fail_safe_executor.py", line 39, in shutdown
    self._check_completed_futures()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/executors/fail_safe_executor.py", line 47, in _check_completed_futures
    future.result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/executors/batch_work_executor.py", line 70, in _fail_safe_execute
    execute_with_retries(work_handler, [item],
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/executors/batch_work_executor.py", line 104, in execute_with_retries
    return func(*args)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/jobs/export_receipts_job.py", line 69, in _export_receipts
    receipts = [self.receipt_mapper.json_dict_to_receipt(result) for result in results]
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/jobs/export_receipts_job.py", line 69, in <listcomp>
    receipts = [self.receipt_mapper.json_dict_to_receipt(result) for result in results]
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/utils.py", line 71, in rpc_response_batch_to_results
    yield rpc_response_to_result(response_item)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ethereumetl/utils.py", line 82, in rpc_response_to_result
    raise RetriableValueError(error_message)
ethereumetl.misc.retriable_value_error.RetriableValueError: result is None in response {'jsonrpc': '2.0', 'id': 0, 'result': None}. Make sure Ethereum node is synced.

Versions

ethereum-etl: 2.0.2
Python: 3.10.4
OS: Ubuntu 22.04 LTS
Geth: 1.10.19-stable-23bee162

Thank you

ancil-t commented 2 years ago

I even tried running the export_all.sh script by running the following

bash export_all.sh -s 9733023 -e 9733123 -b 10000 -p file://$HOME/.ethereum/geth.ipc -o output

I end up getting the same error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/data/ethereum-etl/ethereumetl/__main__.py", line 26, in <module>
    cli()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ubuntu/.local/lib/python3.10/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/ubuntu/data/ethereum-etl/ethereumetl/cli/export_receipts_and_logs.py", line 65, in export_receipts_and_logs
    job.run()
  File "/home/ubuntu/data/ethereum-etl/blockchainetl/jobs/base_job.py", line 30, in run
    self._end()
  File "/home/ubuntu/data/ethereum-etl/ethereumetl/jobs/export_receipts_job.py", line 81, in _end
    self.batch_work_executor.shutdown()
  File "/home/ubuntu/data/ethereum-etl/ethereumetl/executors/batch_work_executor.py", line 97, in shutdown
    self.executor.shutdown()
  File "/home/ubuntu/data/ethereum-etl/ethereumetl/executors/fail_safe_executor.py", line 39, in shutdown
    self._check_completed_futures()
  File "/home/ubuntu/data/ethereum-etl/ethereumetl/executors/fail_safe_executor.py", line 47, in _check_completed_futures
    future.result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/ubuntu/data/ethereum-etl/ethereumetl/executors/batch_work_executor.py", line 70, in _fail_safe_execute
    execute_with_retries(work_handler, [item],
  File "/home/ubuntu/data/ethereum-etl/ethereumetl/executors/batch_work_executor.py", line 104, in execute_with_retries
    return func(*args)
  File "/home/ubuntu/data/ethereum-etl/ethereumetl/jobs/export_receipts_job.py", line 69, in _export_receipts
    receipts = [self.receipt_mapper.json_dict_to_receipt(result) for result in results]
  File "/home/ubuntu/data/ethereum-etl/ethereumetl/jobs/export_receipts_job.py", line 69, in <listcomp>
    receipts = [self.receipt_mapper.json_dict_to_receipt(result) for result in results]
  File "/home/ubuntu/data/ethereum-etl/ethereumetl/utils.py", line 71, in rpc_response_batch_to_results
    yield rpc_response_to_result(response_item)
  File "/home/ubuntu/data/ethereum-etl/ethereumetl/utils.py", line 82, in rpc_response_to_result
    raise RetriableValueError(error_message)
ethereumetl.misc.retriable_value_error.RetriableValueError: result is None in response {'jsonrpc': '2.0', 'id': 0, 'result': None}. Make sure Ethereum node is synced.
2022-06-30 03:43:30 An error occurred. Quitting.

For some context, when I run the following, it exports the tx receipt and logs fine:

bash export_all.sh -s 14000000 -e 14000050 -b 1000000 -p file://$HOME/.ethereum/geth.ipc -o output

0xstochastic commented 1 year ago

Thanks @ancil-t for your bug report. I got the same problem here.

akuehlka commented 1 year ago

I'm facing the same problem here. I'm currently trying to sync an older version of geth to see if I can export these blocks. Does anyone know a workaround?

abhishektvz commented 1 year ago

I can confirm we started getting the same error after the Merge, once I updated to the latest Erigon alpha with a Lighthouse beacon node.

robi00 commented 6 months ago

Has anyone solved this problem, or does anyone know how to?

megalepozy commented 5 months ago

Running the following commands actually succeeds for me:

root@a949388951fd:/ethereum-etl# ethereumetl export_blocks_and_transactions --start-block 46000 --end-block 46200 --provider-uri http://172.19.0.2:8545 --transactions-output /media/freeSpace/transactions_test.csv --max-workers 8 --batch-size 100
2024-01-10 17:25:24,739 - ProgressLogger [INFO] - Started work. Items to process: 201.
2024-01-10 17:25:24,752 - ProgressLogger [INFO] - 101 items processed. Progress is 50%.
2024-01-10 17:25:24,754 - ProgressLogger [INFO] - 201 items processed. Progress is 100%.
2024-01-10 17:25:24,754 - ProgressLogger [INFO] - Finished work. Total items processed: 201. Took 0:00:00.015087.
2024-01-10 17:25:24,754 - CompositeItemExporter [INFO] - block items exported: 0
2024-01-10 17:25:24,754 - CompositeItemExporter [INFO] - transaction items exported: 4

root@a949388951fd:/ethereum-etl# ethereumetl extract_csv_column --input /media/freeSpace/transactions_test.csv --column hash --output /media/freeSpace/transaction_hashes_test.txt

root@a949388951fd:/ethereum-etl# ethereumetl export_receipts_and_logs --transaction-hashes /media/freeSpace/transaction_hashes_test.txt --provider-uri http://172.19.0.2:8545 --receipts-output /media/freeSpace/receipts_test.csv --logs-output logs_test.csv --max-workers 12
2024-01-10 17:28:07,587 - ProgressLogger [INFO] - Started work.
2024-01-10 17:28:07,597 - ProgressLogger [INFO] - Finished work. Total items processed: 4. Took 0:00:00.010165.
2024-01-10 17:28:07,597 - CompositeItemExporter [INFO] - receipt items exported: 4
2024-01-10 17:28:07,597 - CompositeItemExporter [INFO] - log items exported: 0

root@a949388951fd:/ethereum-etl# ethereumetl export_receipts_and_logs --transaction-hashes /media/freeSpace/transaction_hashes_test.txt --provider-uri http://172.19.0.2:8545 --receipts-output /media/freeSpace/receipts_test.csv --logs-output /media/freeSpace/logs_test.csv --max-workers 12
2024-01-10 17:28:41,570 - ProgressLogger [INFO] - Started work.
2024-01-10 17:28:41,573 - ProgressLogger [INFO] - Finished work. Total items processed: 4. Took 0:00:00.003954.
2024-01-10 17:28:41,574 - CompositeItemExporter [INFO] - receipt items exported: 4
2024-01-10 17:28:41,574 - CompositeItemExporter [INFO] - log items exported: 0

root@a949388951fd:/ethereum-etl# cat /media/freeSpace/receipts_test.csv
transaction_hash,transaction_index,block_hash,block_number,cumulative_gas_used,gas_used,contract_address,root,status,effective_gas_price,l1_fee,l1_gas_used,l1_gas_price,l1_fee_scalar
0x5c504ed432cb51138bcf09aa5e8a410dd4a1e204ef84bfed1be16dfba1b22060,0,0x4e3a3754410177e6937ef1f84bba68ea139e8d1a2258c5f85db9f1cd715a1bdd,46147,21000,21000,,,1,50000000000000,,,,
0x19f1df2c7ee6b464720ad28e903aeda1a5ad8780afc22f0b960827bd4fcf656d,0,0x5793f91c9fa8f824d8ed77fc1687dddcf334da81c68be65a782a36463b6f7998,46169,21000,21000,,,1,909808707606,,,,
0x9e6e19637bb625a8ff3d052b7c2fe57dc78c55a15d258d77c43d5a9c160b0384,0,0xf4a537e8e2233149929a9b6964c9aced6ee95f42131aa6b648d2c7946dfc6fe2,46170,21000,21000,,,1,500000000000,,,,
0xcb9378977089c773c074045b20ede2cdcc3a6ff562f4e64b51b20c5205234525,0,0x47ec6a0c3467850cf88112c212c262819de6f1d084d396981d18d8f949cb3017,46194,21000,21000,,,1,1000000000000,,,,

root@a949388951fd:/ethereum-etl# cat /media/freeSpace/logs_test.csv
root@a949388951fd:/ethereum-etl#

No logs, but there is a receipt.

boxhock commented 2 months ago

@megalepozy which Ethereum client are you running that's returning a receipt for 0x5c504ed432cb51138bcf09aa5e8a410dd4a1e204ef84bfed1be16dfba1b22060?

megalepozy commented 2 months ago

@boxhock Well I don't use ethereum-etl anymore (building something custom, maybe will publish at the end) but I just ran eth_getTransactionReceipt on an Erigon node that I have and it indeed returned a receipt without logs.

$ curl -X POST -H "Content-Type: application/json" --data '{"jsonrpc": "2.0", "method": "eth_getTransactionReceipt", "params":["0x5c504ed432cb51138bcf09aa5e8a410dd4a1e204ef84bfed1be16dfba1b22060"], "id":1}' erigon:8545 | jq

% Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 1175 100 1029 100 146 682k 99184 --:--:-- --:--:-- --:--:-- 1147k { "jsonrpc": "2.0", "id": 1, "result": { "blockHash": "0x4e3a3754410177e6937ef1f84bba68ea139e8d1a2258c5f85db9f1cd715a1bdd", "blockNumber": "0xb443", "contractAddress": null, "cumulativeGasUsed": "0x5208", "effectiveGasPrice": "0x2d79883d2000", "from": "0xa1e4380a3b1f749673e270229993ee55f35663b4", "gasUsed": "0x5208", "logs": [], "logsBloom": "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", "status": "0x1", "to": "0x5df9b87991262f6ba471f09758cde1c0fc1de734", "transactionHash": "0x5c504ed432cb51138bcf09aa5e8a410dd4a1e204ef84bfed1be16dfba1b22060", "transactionIndex": "0x0", "type": "0x0" } }

boxhock commented 2 months ago

@megalepozy Thanks for the information. I'm running Erigon as well, where it seems to be a known issue: ledgerwatch/erigon#3243 (which I just noticed you've already seen 😅 ).

curl -X POST -H "Content-Type: application/json" --data '{"jsonrpc": "2.0", "method": "eth_getTransactionReceipt", "params":["0x5c504ed432cb51138bcf09aa5e8a410dd4a1e204ef84bfed1be16dfba1b22060"], "id":1}' my_own_erigon_node | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   185  100    39  100   146    136    510 --:--:-- --:--:-- --:--:--   649
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": null
}

As for the original issue @ancil-t:

I expect export_receipts_and_logs to ignore tx hashes that return null and move on to the next tx hash.

I would argue that the current behavior from ethereum-etl (not Erigon/Geth) is correct, and we should instead help push the Ethereum client devs to fix their clients (see linked issue above). For exporting with ethereum-etl, you might be better off falling back to a different client/provider.
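Falling back to a different client/provider could be wired up roughly as follows. This is only a sketch: `first_receipt` is a hypothetical helper, and each fetcher stands in for whatever JSON-RPC client you use to call eth_getTransactionReceipt against one provider.

```python
def first_receipt(tx_hash, fetchers):
    """Try each provider in order and return the first non-null receipt.

    fetchers is a list of (name, callable) pairs. Each callable takes a
    transaction hash and returns the receipt dict, or None when the
    provider answers with a null result. Both are hypothetical stand-ins
    for real JSON-RPC clients.
    """
    for name, fetch in fetchers:
        receipt = fetch(tx_hash)
        if receipt is not None:
            return name, receipt
    # No provider had a receipt for this hash.
    return None, None
```

The provider name is returned with the receipt so an export can record which client supplied each row, which helps when auditing a data set built from more than one source.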

megalepozy commented 2 months ago

@boxhock I knew I recognized u from somewhere! btw that Erigon ticket is making me worried about problems in my data set... at least I see that they raised the priority to medium