Cartesi Rollups

feat(rollups-events): connect to Redis via TLS #62

Closed by torives 1 year ago

torives commented 1 year ago

Closes #13

gligneul commented 1 year ago

The CI is raising an error:

thread 'test_it_connects_via_tls' panicked at 'failed to docker_run command 'docker build -f tests/tls.Dockerfile -t cartesi/broker-tls-test ..'
Error response from daemon: Error processing tar file(exit status 1): write /target/debug/deps/integration-4f7084f3519ed76b: no space left on device
', test-fixtures/src/docker_cli.rs:55:5

Maybe too much is being copied to the docker context?

GMKrieger commented 1 year ago

When I tried to run the tests locally, I came across this error

---- test_it_connects_via_tls stdout ----
thread 'test_it_connects_via_tls' panicked at 'failed to docker_run command 'docker build -f tests/tls.Dockerfile -t cartesi/broker-tls-test ..'
Error response from daemon: dockerfile parse error line 28: unknown instruction: DEBIAN_FRONTEND="NONINTERACTIVE"
', test-fixtures/src/docker_cli.rs:55:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Briefly looking online, I saw that setting DEBIAN_FRONTEND to noninteractive is highly discouraged (https://bobcares.com/blog/debian_frontendnoninteractive-docker/ or https://github.com/moby/moby/issues/4032#issuecomment-34597177). It's best to look into this and remove it, if possible.

torives commented 1 year ago

Maybe too much is being copied to the docker context?

@gligneul The test binary depends on the redacted crate, so I ended up using the offchain directory as the build context. I suppose it is possible to get a slimmer context using multiple build contexts, but I'm not sure. I'll take a look.
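
Something along these lines is what I have in mind, using BuildKit's named build contexts (a rough sketch; the `events` context name and the paths are illustrative, not the repository's actual layout):

# syntax=docker/dockerfile:1.4
# ...inside tests/tls.Dockerfile: copy the extra crate from a named context
COPY --from=events . /build/rollups-events

docker buildx build --build-context events=../rollups-events -f tests/tls.Dockerfile -t cartesi/broker-tls-test .

That would keep the main context small while still making the crate available to the build, assuming a BuildKit-enabled Docker.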

gligneul commented 1 year ago

@gligneul The test binary depends on the redacted crate, so I ended up using the offchain directory as the build context. I suppose it is possible to get a slimmer context using multiple build contexts, but I'm not sure. I'll take a look.

@torives, you are probably copying the target directory in offchain. There is a .dockerignore file in the project's root that avoids that, but it only applies when the build context is the project's root, not the offchain dir.
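
If it helps, a minimal sketch of a .dockerignore placed next to the context being used (i.e., inside offchain/); the entries are illustrative:

target/
**/target/

That should be enough to stop the compiled artifacts from being sent to the daemon and filling up the disk.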

torives commented 1 year ago

When I tried to run the tests locally, I came across this error

---- test_it_connects_via_tls stdout ----
thread 'test_it_connects_via_tls' panicked at 'failed to docker_run command 'docker build -f tests/tls.Dockerfile -t cartesi/broker-tls-test ..'
Error response from daemon: dockerfile parse error line 28: unknown instruction: DEBIAN_FRONTEND="NONINTERACTIVE"
', test-fixtures/src/docker_cli.rs:55:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Briefly looking online, I saw that setting DEBIAN_FRONTEND to noninteractive is highly discouraged (https://bobcares.com/blog/debian_frontendnoninteractive-docker/ or https://github.com/moby/moby/issues/4032#issuecomment-34597177). It's best to look into this and remove it, if possible.

Interesting 🤔. We're actually using it in the way both references recommend: in-line. Also, we do the same thing in our main offchain/Dockerfile. Could you please check if it also fails to build on your machine?
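
For reference, the in-line form we're talking about looks roughly like this (illustrative packages, not necessarily the exact lines from tls.Dockerfile):

RUN DEBIAN_FRONTEND=noninteractive apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends ca-certificates && \
    rm -rf /var/lib/apt/lists/*

as opposed to the discouraged persistent form, ENV DEBIAN_FRONTEND=noninteractive. The "unknown instruction" parse error usually means the assignment ended up at the start of its own line (for example, a lost line-continuation backslash), since the variable only makes sense inside a RUN.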

gligneul commented 1 year ago

When I tried to run the tests locally, I came across this error

@GMKrieger I ran the test on my machine, and it passed. This issue seems out of scope; could you create another issue for it?

endersonmaia commented 1 year ago

I tested it with Amazon MemoryDB for Redis, and I got this error:

2023-05-31T19:29:18.291528Z  INFO dispatcher::main_loop: Setting up dispatcher with config: DispatcherConfig { sc_config: SCConfig { grpc_endpoint: "http://localhost:50051", default_confirmations: 1 }, tx_config: TxManagerConfig { default_confirmations: 2, provider_http_endpoint: "https://eth-goerli.g.alchemy.com/v2/4eeMGl2u3ALzBEsohk3IuvNxSIPb0r0U", chain_id: 5, chain_is_legacy: false, wallet_address: 0x18930e8a66a1dbe21d00581216789aab7460afd0, database_path: "./default_tx_database", gas_oracle_api_key: "" }, broker_config: BrokerConfig { redis_endpoint: rediss://clustercfg.rollups.gp1e3k.memorydb.us-east-1.amazonaws.com, consume_timeout: 5000, backoff: ExponentialBackoff { current_interval: 500ms, initial_interval: 500ms, randomization_factor: 0.5, multiplier: 1.5, max_interval: 60s, start_time: Instant { tv_sec: 800439, tv_nsec: 291408075 }, max_elapsed_time: Some(120s), clock: SystemClock } }, hc_config: HealthCheckConfig { host_address: "0.0.0.0", port: 8081 }, dapp_deployment: DappDeployment { dapp_address: 0xb4ab123a8ebd053a21cc4ffbf3aab547353ec740, deploy_block_hash: 0xfef96172ba52c319aae8e41cc1682c9018e86e168e08047d79eab35bf00772d9 }, rollups_deployment: RollupsDeployment { history_address: 0xfb34dadd5f3aba55004c448acfd24faf5a4ff83e, authority_address: 0x35ec7d70caff883844d94b54cd19634ffab5d8cc, input_box_address: 0x5a723220579c0dcb8c9253e6b4c62e572e379945 }, epoch_duration: 86400 }
2023-05-31T19:29:18.294678Z  INFO dispatcher::http_health: Starting dispatcher health check endpoint at http://0.0.0.0:8081/healthz                                                                                
2023-05-31T19:30:58.538315Z ERROR rollups_dispatcher: Dispatcher stopped: Ok(Err(error peeking at the end of the stream                                                                                            

Caused by:                                                                                                                                                                                                         
    0: error connecting to Redis                                                                                                                                                                                   
    1: An error was signalled by the server: 9278 rollups-0001-001.rollups.gp1e3k.memorydb.us-east-1.amazonaws.com:6379))                                                                                          
Error: error peeking at the end of the stream                                                                                                                                                                      

Caused by:                                                                                                                                                                                                         
    0: error connecting to Redis                                                                                                                                                                                   
    1: An error was signalled by the server: 9278 rollups-0001-001.rollups.gp1e3k.memorydb.us-east-1.amazonaws.com:6379                                                                                            

For some reason, it worked for some time and then started to fail, as we can see from the following logs:

2023-05-31T19:26:24.573671Z  INFO dispatcher::main_loop: Setting up dispatcher with config: DispatcherConfig { sc_config: SCConfig { grpc_endpoint: "http://localhost:50051", default_confirmations: 1 }, tx_config: TxManagerConfig { default_confirmations: 2, provider_http_endpoint: "https://eth-goerli.g.alchemy.com/v2/4eeMGl2u3ALzBEsohk3IuvNxSIPb0r0U", chain_id: 5, chain_is_legacy: false, wallet_address: 0x18930e8a66a1dbe21d00581216789aab7460afd0, database_path: "./default_tx_database", gas_oracle_api_key: "" }, broker_config: BrokerConfig { redis_endpoint: rediss://clustercfg.rollups.gp1e3k.memorydb.us-east-1.amazonaws.com, consume_timeout: 5000, backoff: ExponentialBackoff { current_interval: 500ms, initial_interval: 500ms, randomization_factor: 0.5, multiplier: 1.5, max_interval: 60s, start_time: Instant { tv_sec: 800265, tv_nsec: 573564735 }, max_elapsed_time: Some(120s), clock: SystemClock } }, hc_config: HealthCheckConfig { host_address: "0.0.0.0", port: 8081 }, dapp_deployment: DappDeployment { dapp_address: 0xb4ab123a8ebd053a21cc4ffbf3aab547353ec740, deploy_block_hash: 0xfef96172ba52c319aae8e41cc1682c9018e86e168e08047d79eab35bf00772d9 }, rollups_deployment: RollupsDeployment { history_address: 0xfb34dadd5f3aba55004c448acfd24faf5a4ff83e, authority_address: 0x35ec7d70caff883844d94b54cd19634ffab5d8cc, input_box_address: 0x5a723220579c0dcb8c9253e6b4c62e572e379945 }, epoch_duration: 86400 }
2023-05-31T19:26:24.574683Z  INFO dispatcher::http_health: Starting dispatcher health check endpoint at http://0.0.0.0:8081/healthz
2023-05-31T19:26:25.224811Z  INFO dispatcher::machine::rollups_broker: finishing epoch inputs_sent_count=0
2023-05-31T19:26:25.229464Z  INFO dispatcher::machine::rollups_broker: enqueueing input input_index=0 input=Input { sender: 0x18930e8a66a1dbe21d00581216789aab7460afd0, payload: [72, 101, 108, 108, 111, 32, 116, 111, 32, 101, 99, 104, 111, 45, 112, 121, 116, 104, 111, 110], block_added: Block { hash: 0x2cdfa0759fcae3c36b149e4cf193494c3ae86e18b990eadbbe3d21b1c218151c, number: 9045142, parent_hash: 0x655a92c983988725320f406e27c81ee03451d7792bb3e701f93dcb9535179212, timestamp: 1684777116, logs_bloom: 0x00b608438221408c20900c00820000cc100808002140004a210d0932a901884300223500000042040000600e011400c480230572025260200213160d80e420601d1042201d72a0a8480a010b022a822064550020814a38008090d884c280000044100a0106028100804c18a6401bc800020080400a2c455400a440100008100053062060504c100a20400000a221020240058693101116b800c50140145360040e28100c3040821002010002a208080602091200a2261f880182032630200207d1f04a2311020a0904400041851b4e0c0448d000020082123001602a4d02214800903a2004641225940801202000005b0e11141209110241402b400800080401 }, dapp: 0xb4ab123a8ebd053a21cc4ffbf3aab547353ec740, tx_hash: 0xb0c4cdb9d4dd3b7ba96803a9289d7daa8b81f951b2f25e58927f6862c33f3fdc }
2023-05-31T19:26:25.234100Z  INFO dispatcher::machine::rollups_broker: enqueueing input input_index=1 input=Input { sender: 0x18930e8a66a1dbe21d00581216789aab7460afd0, payload: [72, 101, 108, 108, 111, 32, 116, 111, 32, 101, 99, 104, 111, 45, 112, 121, 116, 104, 111, 110], block_added: Block { hash: 0x7376738028c8a2a6da67d3231acaa70e73ffb24b6e478fda8e9a95ef01a5e597, number: 9045344, parent_hash: 0x59a3c6b2df835c4fafbc7728ef7c5f8fef61ec979f9d5f1e4109d33207ad7cc5, timestamp: 1684780116, logs_bloom: 0x0036004e4222644c021004009c11008c320a80003140000a018b0612ec19800240220100080013001000080e0504408462000010c24030200200160480242240108240320cf00008c800000a008a02222065050004473840841090008080002100041d00a2060000804810a400188800080000c040a841400020621200081040130220e01044800e00408804040820261004000100108688008509681142c1000620010e3d00020650328002a0840400e0400209a00403040190810601a042039134502b8000440c00022041050e4444008d101000c80211220330004d02610010121020002012048804012000040800000010008a1102414082404002006303 }, dapp: 0xb4ab123a8ebd053a21cc4ffbf3aab547353ec740, tx_hash: 0x397eaf44987ebe8a2aa16db8293487fa07ca5c7d42fd48343a319c1e4f86375c }
2023-05-31T19:26:25.239441Z  INFO dispatcher::machine::rollups_broker: enqueueing input input_index=2 input=Input { sender: 0x18930e8a66a1dbe21d00581216789aab7460afd0, payload: [72, 101, 108, 108, 111, 32, 116, 111, 32, 101, 99, 104, 111, 45, 112, 121, 116, 104, 111, 110], block_added: Block { hash: 0xf3e9e2a87121293086e1c853138f83216221e4613461ae8e38af47ae056369db, number: 9045364, parent_hash: 0x9e1d102c7eb0669e90399c5ee87e4c66ea5f4d2f5b667511a863b8d867c7adf3, timestamp: 1684780428, logs_bloom: 0x04072840022066ac30940210840700dc100c8f447c44006405002412c8190812c4a20520000040214002001e110420c4407002108200403800109c044224226030124032087000145820520a900e120440450411900238008814c0850080426145008d0103020140804804b404180800100040024028456000a0501820001842932a20601094806a0400800080205202420080008210078000058100184760804624000e38211b8143005986600200002144a703a004834003804127a2622302b135216b0900200a007620c1051364c4088a100140804310240334240d02290010101161302611468004033028004002101922001a9003620202d0088a205280 }, dapp: 0xb4ab123a8ebd053a21cc4ffbf3aab547353ec740, tx_hash: 0x51925dbefcc3ae72b4ee314585a9e997a476f0f01888898437c71adc0701e1ee }
2023-05-31T19:26:25.243757Z  INFO dispatcher::machine::rollups_broker: finishing epoch inputs_sent_count=3
2023-05-31T19:29:01.590675Z  INFO dispatcher::drivers::blockchain: Sending claim `RollupsClaim { epoch_index: 1, epoch_hash: b6e446f15dfe1cd939e0b7c74af4d9c0930f80a2cdcb69196c35445994dc1747, first_index: 0, last_index: 2 }`
2023-05-31T19:29:01.684224Z  WARN eth_tx_manager::manager: Gas oracle has failed and/or is defaulting to the provider (defaulting).
torives commented 1 year ago

@endersonmaia just to check: is redis_endpoint: rediss://clustercfg.rollups.gp1e3k.memorydb.us-east-1.amazonaws.com a single instance or a cluster as the URL seems to imply? Because the client does not support clusters yet.
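
For context, with the Rust redis crate (which is what the broker builds on, as far as I recall; endpoints, version, and features below are placeholders), the two cases use different client types, and only the first one is wired up today:

// Cargo features assumed: redis = { version = "0.23", features = ["tls", "cluster"] }
fn connect(single: bool) -> redis::RedisResult<()> {
    if single {
        // single instance, plain or TLS endpoint: what the broker supports today
        let client = redis::Client::open("rediss://some-node.example:6379")?;
        let _conn = client.get_connection()?;
    } else {
        // cluster-aware client: a different type, behind the "cluster" feature
        let client = redis::cluster::ClusterClient::new(vec!["rediss://clustercfg.example:6379"])?;
        let _conn = client.get_connection()?;
    }
    Ok(())
}

So supporting the clustercfg endpoint is a separate piece of work rather than just a URL change.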

endersonmaia commented 1 year ago

@endersonmaia just to check: is redis_endpoint: rediss://clustercfg.rollups.gp1e3k.memorydb.us-east-1.amazonaws.com a single instance or a cluster as the URL seems to imply? Because the client does not support clusters yet.

It's a cluster. But looks like it worked, right?

I will try to connect to a specific node of the cluster, and see if it changes something, and report here.
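
Concretely, that just means pointing redis_endpoint at one member's own address instead of the clustercfg one, something like this (node name taken from the error above, port assumed):

redis_endpoint: rediss://rollups-0001-001.rollups.gp1e3k.memorydb.us-east-1.amazonaws.com:6379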

endersonmaia commented 1 year ago

@endersonmaia just to check: is redis_endpoint: rediss://clustercfg.rollups.gp1e3k.memorydb.us-east-1.amazonaws.com a single instance or a cluster as the URL seems to imply? Because the client does not support clusters yet.

It's a cluster. But looks like it worked, right?

I will try to connect to a specific node of the cluster, and see if it changes something, and report here.

I can confirm the error didn't show up when I connected to a specific Redis node instead of the cluster URL.

endersonmaia commented 1 year ago

After running for some time connected to a single node from the cluster, I got this error on rollups-indexer:

2023-05-31T20:57:55.607051Z ERROR indexer: BrokerError { source: ConnectionError { source: An error was signalled by the server: Keys in request don't hash to the same slot } }                                   
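
As far as I understand, that's expected behaviour for Redis Cluster: multi-key commands are only accepted when every key hashes to the same slot (one of 16384), so a cluster-unaware client pointed at a single member will eventually trip over it. For illustration, with made-up key names in redis-cli:

MGET rollups:claims rollups:inputs
(error) CROSSSLOT Keys in request don't hash to the same slot
MGET {rollups}:claims {rollups}:inputs
(no CROSSSLOT error, because the {rollups} hash tag forces both keys into the same slot)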

I think we shouldn't be using a connection to a node that's part of a cluster. So I'll stop testing with AWS until we have both TLS and cluster support available.

What do you think?

gligneul commented 1 year ago

What do you think?

@endersonmaia I think we can move forward with this PR and then implement cluster support in another issue. Is that OK?