omgnetwork / omg-childchain-v2

pronounced /Ch-ch/
Apache License 2.0

store transaction information in the cloud #174

Closed InoMurko closed 3 years ago

InoMurko commented 3 years ago

The idea is to keep the childchain's PG data lean and upload JSON blocks with transactions to a CDN. The childchain then redirects block.get requests to the CDN.

Context: After a block is formed and submitted to the contracts, all watchers in the network request block data via block.get, where the argument to the request is the block hash.

The childchain responds with all transactions included in the block, so that watchers are able to apply these transactions on their own ledgers, and spend inputs and produce outputs.

If we don't offload this transaction information, we need to keep it on the childchain side (permanently) and stress the application. The HTTP responses are also very suboptimal (large JSON payloads).
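To make the proposal concrete, here is a minimal sketch (not the actual childchain code) of what the redirect side could look like: block.get answers with a storage URL keyed by the block hash instead of the full transaction list. The base URL and function name below are made-up examples.

```python
# Hypothetical sketch of block.get offloading; the bucket URL scheme is invented.
BLOCK_STORE_BASE = "https://storage.example.com/omg-blocks"  # placeholder bucket

def block_redirect_url(block_hash: str) -> str:
    """Build the storage URL for a block, keyed by its hash."""
    return f"{BLOCK_STORE_BASE}/{block_hash}.json"

# A block.get handler would then respond with e.g. HTTP 302 and this URL
# rather than serving the large JSON payload itself.
url = block_redirect_url("0x2c3069d4916be54c9e8d26a6a3b8d6b1f482fc486d2e34f714a997fc9ecf57ed")
```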

InoMurko commented 3 years ago

Would like to hear opinions on potential drawbacks. Security, infra.

InoMurko commented 3 years ago

One aspect @achiurizo pointed out is the block withholding problem. If watchers are syncing and the childchain is somehow offline, block data is still reachable.

Well, perhaps not. The CDN url still needs to be constructed.

mikebryant commented 3 years ago

What's the issue with the block data being reachable but childchain offline?

mikebryant commented 3 years ago

CDN

pedant: Probably not a CDN? CDNs are mostly about caching content presented by an app. If we wanted to do that, the content would still need to be hosted in the app with caching headers; then Cloudflare could cache it and avoid hitting the app once cached.

I guess you're thinking cloud storage, like GCS? A reasonable way of approaching it, though we still pay all the costs.

An interesting option would be to host it all using something like IPFS (we would need to pin it all in a local node to ensure availability, but then it is distributed). However: how often do historical blocks need to be fetched, is this a large amount of data, and would that save us a lot of money? It would definitely increase complexity.

InoMurko commented 3 years ago

If a block is published to Ethereum, watchers know that and ask the childchain for block data. If the childchain does not return block data, the watcher declares the chain byzantine and urges its users to exit their funds from the plasma chain. That's the high-level view; in practice the watcher will retry getting the block data for a while (configurable) before it throws a byzantine event.

But if we offload the block data to a CDN, the block information is as accessible as the CDN itself. If we somehow manage to construct the CDN url...

mikebryant commented 3 years ago

Considering GCS - we can make a bucket accessible just to the childchain for writing with public read. Security is roughly the same as it is now in the app. Infra complexity of this is minimal (especially with the cloud connector I'm just adding, we can provision the bucket from the helm chart if we want)
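As an illustration of how small that infra footprint could be, here is a config-only sketch of provisioning such a bucket with gsutil; the bucket name, region, and service account are placeholders, and in practice this would live in the Helm chart / cloud connector mentioned above.

```shell
# Sketch only: provision a block bucket (names are placeholders).
gsutil mb -l us-central1 gs://example-childchain-blocks

# Grant the childchain's service account write access:
gsutil iam ch \
  serviceAccount:childchain@example-project.iam.gserviceaccount.com:objectCreator \
  gs://example-childchain-blocks

# Allow anyone (watchers) to read block objects:
gsutil iam ch allUsers:objectViewer gs://example-childchain-blocks
```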

InoMurko commented 3 years ago

Siri, replace every occurrence of CDN with GCS storage or AWS S3 :D

IPFS would be a considerably more problematic approach; I'm not sure how permanent it is. Infura also provides a Filecoin API, so that's an option as well. But both IPFS and Filecoin add extra cost and complexity.

mikebryant commented 3 years ago

But if we offload the block data to a CDN, the block information is as accessible as the CDN itself. If we somehow manage to construct the CDN url...

But here the watcher is uncompromised (it's under the control of a user, for example). The watcher should just rely on childchain responding with the redirect to the block url? It might be technically possible to find the blocks, but that's not a concern I would think?

mikebryant commented 3 years ago

IPFS would be a considerably more problematic approach to do, not sure how permanent it is.

In this option I would propose running a node ourselves, or using an existing provider. In either case it's as permanent as we want it to be: as long as we keep hosting the data, or paying for it, it will be available. The question is simply whether, if we bundle an IPFS node with all the watchers, having them host it too would reduce our costs enough to make the complexity worthwhile.

It feels like this data is big and presents a problem (given the context of wanting the app not to host it), which is why IPFS comes to mind, but I don't know the actual numbers here.

InoMurko commented 3 years ago

For every watcher that joins the network, the watcher starts the sync at block 0 and goes through all the blocks. And all watchers try to get the block data from the childchain. That's a lot of data, and you need to hold it indefinitely.

Example (two transactions):

curl -X POST -H "Content-Type: application/json" -d '{"hash": "0x2c3069d4916be54c9e8d26a6a3b8d6b1f482fc486d2e34f714a997fc9ecf57ed"}' https://childchain.mainnet.v1.omg.network/block.get | jq
{
  "data": {
    "blknum": 3835000,
    "hash": "0x2c3069d4916be54c9e8d26a6a3b8d6b1f482fc486d2e34f714a997fc9ecf57ed",
    "transactions": [
      "0xf901baf886b84175b760795e3e833f3bcdbff07a5b654ed2fcb34c23edefec9fee6877cac847827060a25aa73af62b267d4b0e4d684aef2cc470d114d66a0c96ae929c90fd4b4b1bb84175b760795e3e833f3bcdbff07a5b654ed2fcb34c23edefec9fee6877cac847827060a25aa73af62b267d4b0e4d684aef2cc470d114d66a0c96ae929c90fd4b4b1b01f842a0000000000000000000000000000000000000000000000000000d9e17db6c9000a0000000000000000000000000000000000000000000000000000d9f00b011a000f8c9f001ee94c471bc549d029293ba68e70278d22ed9cb8fb6c894dac17f958d2ee523a2206206994597c13d831ec783061a80f001ee94c471bc549d029293ba68e70278d22ed9cb8fb6c894dac17f958d2ee523a2206206994597c13d831ec783030d40f001ee9420c2a8d894de2959575a858aadc29f32b52e1b4d94dac17f958d2ee523a2206206994597c13d831ec78301fbd0f501f39420c2a8d894de2959575a858aadc29f32b52e1b4d94d26114cd6ee289accf82350c8d8487fedb8a0c07880ce270a46d054cb080a00000000000000000000000000000000000000000000000000000000000000000",
      "0xf859c003f5f402f2948db1acea6c904955bb49afc3824131aeedb0322d94d26114cd6ee289accf82350c8d8487fedb8a0c0787fe460f3a5eb350a05c7099f7ae95a0c2f29f4803216cec23cea0c6da62e16619e85223d78b3360f3"
    ]
  },
  "service_name": "child_chain",
  "success": true,
  "version": "1.0.5+111da68"
}

And the content of the data node is static.

mikebryant commented 3 years ago

Thanks.

I'd definitely be comfortable bumping to GCS etc. And if we end up finding cost is an issue we want to solve better, embedded IPFS nodes in watchers would be a nice way of spreading the burden around (though we would still need to keep it in GCS etc. as a backstop for our node, it at least might not be accessed as much).

And GCS sounds quite a simple way of moving forward.

boolafish commented 3 years ago

Potentially we can let the requester pays: https://cloud.google.com/storage/docs/requester-pays

But... it needs the requester to have a billing account set up. Probably not really feasible.

InoMurko commented 3 years ago

interesting source of revenue! :D

edit: oh, I guess this doesn't go into our credit card, lol

thec00n commented 3 years ago

I like the idea of publishing blocks to IPFS and Filecoin because of the liveness properties of Plasma. This could enable users to exit their funds even if all watchers go offline.

0x234 commented 3 years ago

This conversation seems to come around every 6 months or so 😂. Let me add an argument against this so we have a balanced discussion.

Serving the blocks via /block.get from anywhere other than the Childchain moves the trust boundary from the service to the caching layer. Currently requests served via the Childchain are trusted because through how we run our services we have confidence that the Childchain's data store has not been accessed, and the block data served is an accurate representation of the history of that deployment.

Moving to $CACHE, be it Cloudflare, IPFS, or a storage bucket, means that we serve a JSON blob without the ability to validate the integrity of the data. If someone gains access to $CACHE then they could modify the block data and include or withhold transactions that could lead to a Byzantine event. So there's a data consistency problem introduced that we now have to resolve.

Serving requests directly from the Childchain using a storage layer with a strong guarantee of consistency like Postgres gives us history, access control, redundancy, and support from GCP. Plus we have more options for scaling if/when that need arises.

On a slightly different note: how much of a problem is this? Do we have data on how this is working in production now? Is this a detriment to operation now? How does it affect our ability to serve transactions and keep the network secure? I'm open to the idea of architecture changes if there's data that's driving the need to undergo the work.

mikebryant commented 3 years ago

Moving to $CACHE, be it Cloudflare

We already trust cloudflare for external users, right? Cloudflare is in the access path to /block.get, so if someone were to compromise it they could provide different data for those calls.

IPFS

On this one, I think we would have integrity protection - if we were to, for example, return some sort of ipfs link redirect from the childchain api, it would be a content-addressable hash of the json blob, so a hash collision would be needed to replace that data. The identity of the data is the validation already, as it were.

how much of a problem is this?

This is definitely the more important point: how much data do we need to serve, and where do we anticipate running up against scaling limits with the current solution?

InoMurko commented 3 years ago

I like the idea of publishing blocks to IPFS and Filecoin because of the liveness properties of Plasma. This could enable user to exit their funds even if all watchers go offline.

I don't quite understand this comment. I think everyone can still exit regardless of watchers being online. Re Filecoin: https://twitter.com/peter_szilagyi/status/1329504370262761480. I don't know what liveness properties IPFS has, or what the incentives (and integration details) are. If we are seriously considering one or the other, please provide some more information on what the benefits/costs are.

We could provide integrity for data being served from the cache. If we keep a SHA of the JSON blob on the childchain, we can construct the URL redirect to GCP and let the client handle the *.json download. If the blob is corrupt, we know the cache was breached. Lots of data is served from GCP. Is the argument that we don't have confidence in how GCP works or their security? I think the security is similar to everything else, no? Service account -> write permissions.
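A minimal sketch of that integrity check, assuming SHA-256 over the raw JSON bytes (the payload below is a synthetic stand-in, not a real block): the childchain persists the digest at upload time, and the client verifies the downloaded blob against it before trusting the bucket contents.

```python
import hashlib
import json

def blob_digest(blob: bytes) -> str:
    """SHA-256 hex digest of a block's JSON blob."""
    return hashlib.sha256(blob).hexdigest()

# On the childchain, at upload time (synthetic example block):
block_json = json.dumps({"blknum": 3835000, "transactions": ["0xf901ba..."]}).encode()
stored_digest = blob_digest(block_json)  # persisted next to the block hash

# On the client, after downloading from the bucket:
downloaded = block_json  # stand-in for the HTTP response body
assert blob_digest(downloaded) == stored_digest  # mismatch => cache was tampered with
```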

On a slightly different note: how much of a problem is this? Do we have data on how this is working in production now? Is this a detriment to operation now? How does it affect our ability to serve transactions and keep the network secure? I'm open to the idea of architecture changes if there's data that's driving the need to undergo the work.

Production does not give us enough information at the moment. All blocks are empty, there are no watchers (okay, there are some, but the network is inactive, so we don't know how much pressure a couple of watchers joining/resyncing would cause to childchain v1), and the database is different :D.

Now the impact: for a hypothetical full block, the block.get response for a single block is 57.8MB. If we ever have traffic that's supposed to give us revenue, we have to think about how this data is going to be served. The current solution isn't scalable. A full block JSON file example: https://gofile.io/d/kUV0g5
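A back-of-envelope calculation of what that means for total traffic; the watcher and chain-length figures below are made-up assumptions, only the 57.8MB block size comes from the thread.

```python
# Back-of-envelope only; watcher count and chain length are hypothetical.
FULL_BLOCK_MB = 57.8   # block.get response size for a full block (from above)
blocks = 100_000       # hypothetical chain length
watchers = 50          # hypothetical watchers, each syncing from block 0

total_tb = FULL_BLOCK_MB * blocks * watchers / 1_000_000
print(f"~{total_tb:.0f} TB served")  # ≈ 289 TB if every block were full
```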

Is this a detriment to operation now? How does it affect our ability to serve transactions and keep the network secure?

The idea is to keep the childchain transaction processing engine as lean as possible, we don't want to keep and maintain data that we don't need for transaction processing.

This is a WIP, as we get the chch2 rolling out, we'll see what the performance impact is - but it seems like it could be detrimental. There's also the auditing question that's still open.

InoMurko commented 3 years ago

So yeah, serving 60MB blocks won't work. We need to compress it too :)))) https://www.lucidchart.com/techblog/2019/12/06/json-compression-alternative-binary-formats-and-compression-methods/
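A quick sketch of the compression win on a repetitive, hex-heavy block JSON; the payload below is synthetic, and real blocks with high-entropy hex strings will compress less than this (hex text is still roughly 2x the underlying binary, which is where the binary formats from the linked article come in).

```python
# Synthetic stand-in for a block JSON; not real transaction data.
import gzip
import json

block = {"blknum": 3835000,
         "transactions": ["0xf901ba" + "ab" * 400 for _ in range(1000)]}
raw = json.dumps(block).encode()
packed = gzip.compress(raw)
print(len(raw), len(packed), f"{len(packed) / len(raw):.1%}")
```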

I pointed to @thec00n about this a while ago here: https://omisego.atlassian.net/wiki/spaces/SEC/pages/898859009/Block+pre-validation+overview

InoMurko commented 3 years ago

As there was no further feedback, I will take this as an accepted proposal and outline an approach in a separate issue. Thanks!