blockchain-etl / ethereum-etl

Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ
https://t.me/BlockchainETL
MIT License

Google Pub/Sub attribute size exceeded #472

Open diega opened 8 months ago

diega commented 8 months ago

I'm attempting to stream Ethereum Classic traces to Google Pub/Sub using v2.3.1, but when the process reaches block 1431916 (which predates the DAO fork, so it exists on ETH too) I get the following error:

google.api_core.exceptions.InvalidArgument: 400 The attribute "item_id" in the request has a value that is too long. The length is 1993 characters, but the maximum allowed is 1024. Refer to https://cloud.google.com/pubsub/quotas for more information.

I'm able to reproduce the behavior using this set of options:

stream
--provider-uri=http://localhost:8545
-e trace
--batch-size 1
--block-batch-size 1
--output projects/<your_project>/topics/<your_topic>
--start-block 1431916

The documentation linked in the error states that the maximum size of any attribute value is 1024 bytes, so stringifying the traceAddress to build the trace_id produces a value that exceeds the limit.
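To illustrate the size problem, here is a minimal Python sketch of how an id that embeds every traceAddress index grows with call depth, along with one possible workaround: replacing oversized values with a fixed-length hash. The id format and the helper names `build_trace_id` and `cap_attribute_value` are assumptions made for this example, not the project's actual code or fix.

```python
import hashlib

# Limit quoted in the Pub/Sub error above.
MAX_ATTRIBUTE_VALUE_BYTES = 1024


def build_trace_id(trace_type, tx_hash, trace_address):
    # Hypothetical id format: trace type, transaction hash, then every
    # traceAddress index, all joined with underscores.
    return "_".join([trace_type, tx_hash] + [str(i) for i in trace_address])


def cap_attribute_value(value, limit=MAX_ATTRIBUTE_VALUE_BYTES):
    # Values that fit within the limit pass through unchanged.
    encoded = value.encode("utf-8")
    if len(encoded) <= limit:
        return value
    # Oversized values are replaced by a deterministic, fixed-length digest.
    return "sha256_" + hashlib.sha256(encoded).hexdigest()


# A deeply nested call produces a very long traceAddress.
deep_address = [0] * 1000
tx_hash = "0x" + "ab" * 32  # placeholder hash, not a real transaction
long_id = build_trace_id("call", tx_hash, deep_address)
capped = cap_attribute_value("trace_" + long_id)

print(len(long_id), len(capped))
```

Hashing keeps the attribute deterministic, so it can still be used for deduplication, but consumers can no longer read the traceAddress back out of it. Whether that trade-off is acceptable depends on how the attribute is used downstream.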

For the record, here's the JSON response to the trace_block call.

If you query Google's public BigQuery tables, you can see that these traces are stored properly, so I suspect the tables are not being filled through Pub/Sub.