Open raulk opened 4 years ago
make tiny deals, e.g. 16-byte deals
This is not possible. Due to various cryptographic construction limits, the minimum piece
is 127 bytes long: https://github.com/filecoin-project/rust-fil-proofs/issues/1231#issuecomment-663915253
The rest looks great!
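To illustrate the constraint, here is a minimal Go sketch of the size arithmetic. Beyond the 127-byte minimum stated above, it assumes that valid unpadded piece sizes are 127 × 2^k bytes and that each maps to a padded size of 128 × 2^k bytes. The function name pieceSizes is made up for illustration and is not part of any Filecoin library.

```go
package main

import "fmt"

// minUnpaddedPiece is the 127-byte minimum piece size mentioned above.
const minUnpaddedPiece = 127

// pieceSizes returns the smallest unpadded piece size of the form 127 * 2^k
// that can hold payloadLen bytes, plus the matching padded size 128 * 2^k.
// Assumption: sizes follow this power-of-two pattern.
func pieceSizes(payloadLen uint64) (unpadded, padded uint64) {
	unpadded, padded = minUnpaddedPiece, 128
	for unpadded < payloadLen {
		unpadded *= 2
		padded *= 2
	}
	return unpadded, padded
}

func main() {
	for _, n := range []uint64{16, 127, 128, 1000} {
		u, p := pieceSizes(n)
		fmt.Printf("payload=%d unpadded=%d padded=%d\n", n, u, p)
	}
}
```

Under these assumptions a 16-byte payload would have to be zero-padded to fill a 127-byte piece, so 127 bytes is effectively the smallest deal the test can make.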
@ribasushi thanks for the remark.
An idea is to have the miner locally produce 7,700 random 16-byte values, store that data where the offline deal flow is supposed to find them, and advertise their CIDs (or CommP values, I guess?) via the sync service to the client.
You can use deterministic pseudorandom data generated by https://github.com/jbenet/go-random/blob/master/lib.go#L16. This way all you need to transfer is a single nonce plus the number of increments of that nonce (i.e. 2 values).
We use this code heavily in go-ipfs testing: https://github.com/ipfs/go-ipfs/blob/777d306f6e66e31a05f43a337bce272050407386/test/sharness/t0082-repo-gc-auto.sh#L20-L24
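A rough Go sketch of the nonce-plus-count idea follows. It uses the standard library's math/rand rather than the go-random package linked above, so it shows the technique only, not that library's API. The name genPiece, the 127-byte length, and the choice of seeding with nonce + i are assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/rand"
)

// pieceLen is an assumed payload length (the minimum piece size from this thread).
const pieceLen = 127

// genPiece deterministically derives the i-th payload from a base nonce.
// Both sides can regenerate identical bytes knowing only (nonce, i).
func genPiece(nonce int64, i int) []byte {
	r := rand.New(rand.NewSource(nonce + int64(i)))
	buf := make([]byte, pieceLen)
	r.Read(buf) // (*rand.Rand).Read always fills buf and returns a nil error
	return buf
}

func main() {
	const nonce, count = 42, 3
	for i := 0; i < count; i++ {
		// Print a digest of each payload; miner and client would see the same values.
		fmt.Printf("%d %x\n", i, sha256.Sum256(genPiece(nonce, i)))
	}
}
```

With this approach the miner and client only need to agree on the nonce and the count, and each can regenerate every payload independently.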
do we also need to test whether there are any scalability issues for a client to query for deals with multiple miners? @ribasushi you may have further thoughts on whether this is something that's being looked at elsewhere
do we also need to test whether there are any scalability issues for a client to query for deals with multiple miners?
@jnthnvctr I think it doesn't matter at this stage...
@ribasushi, @jnthnvctr I've made a start on this (will push soon), and just want to check in with you guys about what specifically we're measuring.
As I understand it, we're mostly worried about ClientListDeals falling over if a client has too many active deals. In other words, the specifics of the deal (whether it's online or offline, etc) don't matter as much as the total number of deals per client. Is this correct?
I've made a start on the offline deal flow, but may switch to online deals if the distinction isn't important.
@yusefnapora I proposed on slack that we meet for 10 mins to sync up via a higher bandwidth channel. If you do not have availability I will try to write up my thoughts as a comment here.
thanks for the sync meeting @ribasushi & @jnthnvctr. I just wanted to summarize here:
The distinction between online and offline doesn't really matter in this case because the size of the data we're working with is small, and the transfer doesn't add any real overhead to the test.
Testing with small sectors and small deals is fine, since the same amount of chain state is produced regardless of the size of the data.
We want to find the upper bound on the number of active deals a client can propose and track.
We also want to find the upper limit on the miner's side, but we don't control what hardware the miners will be using.
We should probably run with the production pre-commit delay, so that sealing doesn't kick in right away.
So far I've tried proposing 8000 deals using the offline deal flow, and the client can fetch them with ClientListDeals without any issues. I hit gas limit errors on the miner side when trying to activate the deals though:
Jul 28 17:07:19.458837 INFO 61.9329s ERROR << miners[000] (0d9cc7) >> 2020-07-28T17:07:19.457Z WARN vm vm/runtime.go:144 VM.Call failure: not enough gas: used=14966134, available=14966134 (RetCode=7): {"req_id": "04bdfdc3"}
Jul 28 17:07:19.458928 INFO 61.9331s ERROR << miners[000] (0d9cc7) >> github.com/filecoin-project/lotus/chain/vm.(*Runtime).chargeGasInternal {"req_id": "04bdfdc3"}
Jul 28 17:07:19.458961 INFO 61.9331s ERROR << miners[000] (0d9cc7) >> /go/pkg/mod/github.com/filecoin-project/lotus@v0.4.3-0.20200727232759-291d2fe2ded7/chain/vm/runtime.go:555 {"req_id": "04bdfdc3"}
Jul 28 17:07:19.458976 INFO 61.9331s ERROR << miners[000] (0d9cc7) >> 2020-07-28T17:07:19.457Z WARN vm vm/runtime.go:367 vmctx send failed: to: t04, method: 4: ret: [], err: not enough gas: used=14966134, available=14966134 (RetCode=7) {"req_id": "04bdfdc3"}
Jul 28 17:07:19.458990 INFO 61.9331s ERROR << miners[000] (0d9cc7) >> 2020-07-28T17:07:19.457Z WARN vm vm/runtime.go:315 Abortf: failed to enroll cron event {"req_id": "04bdfdc3"}
Jul 28 17:07:19.459038 INFO 61.9332s ERROR << miners[000] (0d9cc7) >> 2020-07-28T17:07:19.457Z WARN vm vm/runtime.go:144 VM.Call failure: failed to enroll cron event (RetCode=7): {"req_id": "04bdfdc3"}
Going to try adjusting the BlockGasLimit as suggested by @Kubuxu on slack and see if that helps.
FWIW, the error happens at the deal StartEpoch, after the offline data has been successfully imported. Both the miner and the client have > 100M FIL in their wallets, so I don't think either is strapped for cash :)
My plan for today is to try to figure out the gas limit so the deals can succeed. If I can't figure that out easily, I'll just push the StartEpoch far into the future so they don't fail and just keep proposing deals until the client chokes.
I'll just push the StartEpoch far into the future so they don't fail and just keep proposing deals until the client chokes.
This is actually the correct approach, as the current design calls for deal proposals to start 2 months in the future.
Describe the test scenario.
We want to test what level of concurrency and volume a Lotus deal proposer (client only, no mining) is able to withstand. The results will help us determine the scalability of the dumbo drop client pool we need to operate to materialise these deals in the network.
More concretely, this is a stress test for testing deal proposal, management, monitoring from the client's perspective. For this particular test, testing the miner is not required. Ideally we'd be able to mock it, but that could prove very difficult.
Instead, we opt to take the following solution to isolate the miner's and chain's scalability from the test:
In order to simulate the end-to-end process with as high a degree of fidelity as possible, we would use the offline deal circuit.
An idea is to have the miner locally produce 7,700 random 16-byte values, store that data where the offline deal flow is supposed to find them, and advertise their CIDs (or CommP values, I guess?) via the sync service to the client. The client would then make offline deals for those CIDs.
Provide any background and technical implementation details.
See above; since we're looking to find the boundary of a single client by stressing it, just a 1-client, 1-miner setup should be sufficient.
What should we measure?
Discomfort factor (0-10).
10 (owners: @jnthnvctr and @ribasushi) .
Additional remarks.