Closed hsienfu closed 4 years ago
Thanks for the issue!
We should transfer this to rust-fil-proofs.
I am seeing the same error.
CPU: AMD Ryzen Threadripper 3970X 32-Core Processor
MEM: 256G 2133MHz
GPU: GeForce RTX 2080 Ti
SWAP: 128G NVMe
OS: Ubuntu 18.04 server
I ran into the same error as you. Could we connect on WeChat to talk it through? 18221352583 @hsienfu
After a lot of debugging over Slack (https://filecoinproject.slack.com/archives/CPFTWMY7N/p1593680700430800), I have concluded that this error probably means that either the Groth parameters or the verifying key are corrupted. The first thing you should do to check this is rerun the paramfetch program and ensure you have the verified-correct versions of these files. If the problem recurs, take note of whether they are being corrupted and need to be replaced again. It seems that @moonlight233 (for example) is seeing repeated corruption for some unknown reason.
As far as I can tell, this is not an issue with the proofs code but rather with whatever underlying system problem is leading to corruption of these parameters/keys.
I have checked the proof-parameters files and still get "commit check error: invalid proof (compute error?)".
How can I tell whether the verifying key is corrupted?
lotus fetch-params 32GiB
2020-07-08T16:34:11.889+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-8-0-sha256_hasher-82a357d2f2ca81dc61bb45f4a762807aedee1b0a53fd6c4e77b46a01bfef7820.vk is ok
2020-07-08T16:34:11.890+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-0-0-0cfb4f178bbb71cf2ecfcd42accce558b27199ab4fb59cb78f2483fe21ef36d9.vk is ok
2020-07-08T16:34:11.890+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-8-2-b62098629d07946e9028127e70295ed996fe3ed25b0f9f88eb610a0ab4385a3c.vk is ok
2020-07-08T16:34:11.890+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-8-2-sha256_hasher-96f1b4a04c5c51e4759bbf224bbc2ef5a42c7100f16ec0637123f16a845ddfb2.vk is ok
2020-07-08T16:34:11.889+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-0-0-3ea05428c9d11689f23529cde32fd30aabd50f7d2c93657c1d3650bca3e8ea9e.vk is ok
2020-07-08T16:34:11.889+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-6babf46ce344ae495d558e7770a585b2382d54f225af8ed0397b8be7c3fcd472.vk is ok
2020-07-08T16:34:11.890+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-0-0-5294475db5237a2e83c3e52fd6c2b03859a1831d45ed08c4f35dbf9a803165a9.vk is ok
2020-07-08T16:34:11.890+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-ecd683648512ab1765faa2a5f14bab48f676e633467f0aa8aad4b55dcb0652bb.vk is ok
2020-07-08T16:34:11.889+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-0-0-0170db1f394b35d995252228ee359194b13199d259380541dc529fb0099096b0.vk is ok
2020-07-08T16:34:11.890+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f.vk is ok
2020-07-08T16:34:11.890+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-8-0-559e581f022bb4e4ec6e719e563bf0e026ad6de42e56c18714a2c692b1b88d7e.vk is ok
2020-07-08T16:34:11.891+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-0-0-7d739b8cf60f1b0709eeebee7730e297683552e4b69cab6984ec0285663c5781.vk is ok
2020-07-08T16:34:11.891+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-0-0-50c7368dea9593ed0989e70974d28024efa9d156d585b7eea1be22b2e753f331.vk is ok
2020-07-08T16:34:11.911+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-8-2-2627e4006b67f99cef990c0a47d5426cb7ab0a0ad58fc1061547bf2d28b09def.vk is ok
2020-07-08T16:34:11.911+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-8-0-0377ded656c6f524f1618760bffe4e0a1c51d5a70c4509eedae8a27555733edc.vk is ok
2020-07-08T16:34:12.713+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-8-0-559e581f022bb4e4ec6e719e563bf0e026ad6de42e56c18714a2c692b1b88d7e.params is ok
2020-07-08T16:35:04.519+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-8-0-sha256_hasher-82a357d2f2ca81dc61bb45f4a762807aedee1b0a53fd6c4e77b46a01bfef7820.params is ok
2020-07-08T16:36:03.370+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-8-0-0377ded656c6f524f1618760bffe4e0a1c51d5a70c4509eedae8a27555733edc.params is ok
2020-07-08T16:36:03.370+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:162 parameter and key-fetching complete
SectorID: 0
Status: CommitFailed
CommD: 6261666b3463687a6161353766377872767975666a676135666b61653667736d6b6a32376e37343434696b3372626e7a71336477687672357075793761
CommR: 6261666b3465687a616f32737165776c746f716e34756337686a70706676756775656173646579677167637a6c6a746c657568797a766d796771347971
Ticket: 0d58342afa4e75c3b0ee31c39596605fbb8493bb2965a01d5144eb02737dffeb
TicketH: 63333
Seed: fbf49b18192f957b40f3c6a527a743eef11e5d4fbda59b14ae35c529a6840efd
SeedH: 65051
Proof:
Deals: [0]
Retries: 0
--------
Event Log:
0. 2020-07-07 18:27:31 +0800 CST: [event;sealing.SectorStart] {"User":{"ID":0,"SectorType":3,"Pieces":[{"Piece":{"Size":34359738368,"PieceCID":{"/":"bafk4chzaa57f7xrvyufjga5fkae6gsmkj27n7444ik3rbnzq3dwhvr5puy7a"}},"DealInfo":null}]}}
1. 2020-07-07 18:27:31 +0800 CST: [event;sealing.SectorPacked] {"User":{"FillerPieces":null}}
2. 2020-07-07 22:04:04 +0800 CST: [event;sealing.SectorRestart] {"User":{}}
3. 2020-07-08 02:27:08 +0800 CST: [event;sealing.SectorPreCommit1] {"User":{"PreCommit1Out":"eyJyZWdpc3RlcmVkX3Byb29mIjoiU3RhY2tlZERyZzMyR2lCVjEiLCJsYWJlbHMiOnsiU3RhY2tlZERyZzMyR2lCVjEiOnsibGFiZWxzIjpbeyJwYXRoIjoiL21udC9zZGIvLmxvdHVzc3RvcmFnZS9jYWNoZS9zLXQwMTIwNDAyLTAiLCJpZCI6ImxheWVyLTEiLCJzaXplIjoxMDczNzQxODI0LCJyb3dzX3RvX2Rpc2NhcmQiOjd9LHsicGF0aCI6Ii9tbnQvc2RiLy5sb3R1c3N0b3JhZ2UvY2FjaGUvcy10MDEyMDQwMi0wIiwiaWQiOiJsYXllci0yIiwic2l6ZSI6MTA3Mzc0MTgyNCwicm93c190b19kaXNjYXJkIjo3fSx7InBhdGgiOiIvbW50L3NkYi8ubG90dXNzdG9yYWdlL2NhY2hlL3MtdDAxMjA0MDItMCIsImlkIjoibGF5ZXItMyIsInNpemUiOjEwNzM3NDE4MjQsInJvd3NfdG9fZGlzY2FyZCI6N30seyJwYXRoIjoiL21udC9zZGIvLmxvdHVzc3RvcmFnZS9jYWNoZS9zLXQwMTIwNDAyLTAiLCJpZCI6ImxheWVyLTQiLCJzaXplIjoxMDczNzQxODI0LCJyb3dzX3RvX2Rpc2NhcmQiOjd9LHsicGF0aCI6Ii9tbnQvc2RiLy5sb3R1c3N0b3JhZ2UvY2FjaGUvcy10MDEyMDQwMi0wIiwiaWQiOiJsYXllci01Iiwic2l6ZSI6MTA3Mzc0MTgyNCwicm93c190b19kaXNjYXJkIjo3fSx7InBhdGgiOiIvbW50L3NkYi8ubG90dXNzdG9yYWdlL2NhY2hlL3MtdDAxMjA0MDItMCIsImlkIjoibGF5ZXItNiIsInNpemUiOjEwNzM3NDE4MjQsInJvd3NfdG9fZGlzY2FyZCI6N30seyJwYXRoIjoiL21udC9zZGIvLmxvdHVzc3RvcmFnZS9jYWNoZS9zLXQwMTIwNDAyLTAiLCJpZCI6ImxheWVyLTciLCJzaXplIjoxMDczNzQxODI0LCJyb3dzX3RvX2Rpc2NhcmQiOjd9LHsicGF0aCI6Ii9tbnQvc2RiLy5sb3R1c3N0b3JhZ2UvY2FjaGUvcy10MDEyMDQwMi0wIiwiaWQiOiJsYXllci04Iiwic2l6ZSI6MTA3Mzc0MTgyNCwicm93c190b19kaXNjYXJkIjo3fSx7InBhdGgiOiIvbW50L3NkYi8ubG90dXNzdG9yYWdlL2NhY2hlL3MtdDAxMjA0MDItMCIsImlkIjoibGF5ZXItOSIsInNpemUiOjEwNzM3NDE4MjQsInJvd3NfdG9fZGlzY2FyZCI6N30seyJwYXRoIjoiL21udC9zZGIvLmxvdHVzc3RvcmFnZS9jYWNoZS9zLXQwMTIwNDAyLTAiLCJpZCI6ImxheWVyLTEwIiwic2l6ZSI6MTA3Mzc0MTgyNCwicm93c190b19kaXNjYXJkIjo3fSx7InBhdGgiOiIvbW50L3NkYi8ubG90dXNzdG9yYWdlL2NhY2hlL3MtdDAxMjA0MDItMCIsImlkIjoibGF5ZXItMTEiLCJzaXplIjoxMDczNzQxODI0LCJyb3dzX3RvX2Rpc2NhcmQiOjd9XSwiX2giOm51bGx9fSwiY29uZmlnIjp7InBhdGgiOiIvbW50L3NkYi8ubG90dXNzdG9yYWdlL2NhY2hlL3MtdDAxMjA0MDItMCIsImlkIjoidHJlZS1kIiwic2l6ZSI6MjE0NzQ4MzY0Nywicm93c190b19kaXNjYXJkIjo3fSwiY29tbV9kIjpbNywxMjYsOTUsMjIyLDUzLDE5NywxMCwxNDcsMywxNjUsODAsOSwyMjcsNzM
sMTM4LDc4LDE5MCwyMjMsMjQzLDE1Niw2NiwxODMsMTYsMTgzLDQ4LDIxNiwyMzYsMTIyLDE5OSwxNzUsMTY2LDYyXX0=","TicketValue":"DVg0KvpOdcOw7jHDlZZgX7uEk7spZaAdUUTrAnN9/+s=","TicketEpoch":63333}}
4. 2020-07-08 03:40:16 +0800 CST: [event;sealing.SectorPreCommit2] {"User":{"Sealed":{"/":"bafk4ehzaiytfpzcmm3ilbyitsg4dkwj4ebaytbbp7bu4gq7lgp77f4kbbq4a"},"Unsealed":{"/":"bafk4chzaa57f7xrvyufjga5fkae6gsmkj27n7444ik3rbnzq3dwhvr5puy7a"}}}
5. 2020-07-08 03:40:17 +0800 CST: [event;sealing.SectorPreCommitted] {"User":{"Message":{"/":"bafy2bzacecrbog7rxi5tlgn7g7lfpedn2cb53sqj47ivkcbefchdxck3kpcsi"}}}
6. 2020-07-08 03:42:56 +0800 CST: [event;sealing.SectorPreCommitLanded] {"User":{"TipSet":"AXGg5AIgqlEFIPXkSXOKDWXAHQwZYuQSqeknYe8tFnnJ1CWhwgUBcaDkAiD4c3WbHga+jOXrtO7opzCbgjZ2QjGqsT4rRzSWNho/OQFxoOQCINlod2mnqRF9jCAr3i3yeYHjMw39BkKPtaC6cD4T+JLI"}}
7. 2020-07-08 03:47:06 +0800 CST: [event;sealing.SectorSeedReady] {"User":{"SeedValue":"+/SbGBkvlXtA88alJ6dD7vEeXU+9pZsUrjXFKaaEDv0=","SeedEpoch":65051}}
8. 2020-07-08 04:58:13 +0800 CST: [event;sealing.SectorComputeProofFailed] {"User":{}}
computing seal proof failed(2): coordinate(s) do not lie on the curve
9. 2020-07-08 04:59:13 +0800 CST: [event;sealing.SectorRetryComputeProof] {"User":{}}
10. 2020-07-08 06:09:29 +0800 CST: [event;sealing.SectorComputeProofFailed] {"User":{}}
computing seal proof failed(2): coordinate(s) do not lie on the curve
11. 2020-07-08 06:10:29 +0800 CST: [event;sealing.SectorRetryComputeProof] {"User":{}}
12. 2020-07-08 07:20:33 +0800 CST: [event;sealing.SectorComputeProofFailed] {"User":{}}
computing seal proof failed(2): coordinate(s) do not lie on the curve
13. 2020-07-08 07:21:33 +0800 CST: [event;sealing.SectorSealPreCommit1Failed] {"User":{}}
consecutive compute fails
14. 2020-07-08 07:22:33 +0800 CST: [event;sealing.SectorRetrySealPreCommit1] {"User":{}}
15. 2020-07-08 11:46:12 +0800 CST: [event;sealing.SectorPreCommit1] {"User":{"PreCommit1Out":"eyJyZWdpc3RlcmVkX3Byb29mIjoiU3RhY2tlZERyZzMyR2lCVjEiLCJsYWJlbHMiOnsiU3RhY2tlZERyZzMyR2lCVjEiOnsibGFiZWxzIjpbeyJwYXRoIjoiL21udC9zZGIvLmxvdHVzc3RvcmFnZS9jYWNoZS9zLXQwMTIwNDAyLTAiLCJpZCI6ImxheWVyLTEiLCJzaXplIjoxMDczNzQxODI0LCJyb3dzX3RvX2Rpc2NhcmQiOjd9LHsicGF0aCI6Ii9tbnQvc2RiLy5sb3R1c3N0b3JhZ2UvY2FjaGUvcy10MDEyMDQwMi0wIiwiaWQiOiJsYXllci0yIiwic2l6ZSI6MTA3Mzc0MTgyNCwicm93c190b19kaXNjYXJkIjo3fSx7InBhdGgiOiIvbW50L3NkYi8ubG90dXNzdG9yYWdlL2NhY2hlL3MtdDAxMjA0MDItMCIsImlkIjoibGF5ZXItMyIsInNpemUiOjEwNzM3NDE4MjQsInJvd3NfdG9fZGlzY2FyZCI6N30seyJwYXRoIjoiL21udC9zZGIvLmxvdHVzc3RvcmFnZS9jYWNoZS9zLXQwMTIwNDAyLTAiLCJpZCI6ImxheWVyLTQiLCJzaXplIjoxMDczNzQxODI0LCJyb3dzX3RvX2Rpc2NhcmQiOjd9LHsicGF0aCI6Ii9tbnQvc2RiLy5sb3R1c3N0b3JhZ2UvY2FjaGUvcy10MDEyMDQwMi0wIiwiaWQiOiJsYXllci01Iiwic2l6ZSI6MTA3Mzc0MTgyNCwicm93c190b19kaXNjYXJkIjo3fSx7InBhdGgiOiIvbW50L3NkYi8ubG90dXNzdG9yYWdlL2NhY2hlL3MtdDAxMjA0MDItMCIsImlkIjoibGF5ZXItNiIsInNpemUiOjEwNzM3NDE4MjQsInJvd3NfdG9fZGlzY2FyZCI6N30seyJwYXRoIjoiL21udC9zZGIvLmxvdHVzc3RvcmFnZS9jYWNoZS9zLXQwMTIwNDAyLTAiLCJpZCI6ImxheWVyLTciLCJzaXplIjoxMDczNzQxODI0LCJyb3dzX3RvX2Rpc2NhcmQiOjd9LHsicGF0aCI6Ii9tbnQvc2RiLy5sb3R1c3N0b3JhZ2UvY2FjaGUvcy10MDEyMDQwMi0wIiwiaWQiOiJsYXllci04Iiwic2l6ZSI6MTA3Mzc0MTgyNCwicm93c190b19kaXNjYXJkIjo3fSx7InBhdGgiOiIvbW50L3NkYi8ubG90dXNzdG9yYWdlL2NhY2hlL3MtdDAxMjA0MDItMCIsImlkIjoibGF5ZXItOSIsInNpemUiOjEwNzM3NDE4MjQsInJvd3NfdG9fZGlzY2FyZCI6N30seyJwYXRoIjoiL21udC9zZGIvLmxvdHVzc3RvcmFnZS9jYWNoZS9zLXQwMTIwNDAyLTAiLCJpZCI6ImxheWVyLTEwIiwic2l6ZSI6MTA3Mzc0MTgyNCwicm93c190b19kaXNjYXJkIjo3fSx7InBhdGgiOiIvbW50L3NkYi8ubG90dXNzdG9yYWdlL2NhY2hlL3MtdDAxMjA0MDItMCIsImlkIjoibGF5ZXItMTEiLCJzaXplIjoxMDczNzQxODI0LCJyb3dzX3RvX2Rpc2NhcmQiOjd9XSwiX2giOm51bGx9fSwiY29uZmlnIjp7InBhdGgiOiIvbW50L3NkYi8ubG90dXNzdG9yYWdlL2NhY2hlL3MtdDAxMjA0MDItMCIsImlkIjoidHJlZS1kIiwic2l6ZSI6MjE0NzQ4MzY0Nywicm93c190b19kaXNjYXJkIjo3fSwiY29tbV9kIjpbNywxMjYsOTUsMjIyLDUzLDE5NywxMCwxNDcsMywxNjUsODAsOSwyMjcsNz
MsMTM4LDc4LDE5MCwyMjMsMjQzLDE1Niw2NiwxODMsMTYsMTgzLDQ4LDIxNiwyMzYsMTIyLDE5OSwxNzUsMTY2LDYyXX0=","TicketValue":"DVg0KvpOdcOw7jHDlZZgX7uEk7spZaAdUUTrAnN9/+s=","TicketEpoch":63333}}
16. 2020-07-08 12:57:17 +0800 CST: [event;sealing.SectorPreCommit2] {"User":{"Sealed":{"/":"bafk4ehzao2sqewltoqn4uc7hjppfvugueasdeygqgczljtleuhyzvmygq4yq"},"Unsealed":{"/":"bafk4chzaa57f7xrvyufjga5fkae6gsmkj27n7444ik3rbnzq3dwhvr5puy7a"}}}
17. 2020-07-08 12:57:18 +0800 CST: [event;sealing.SectorPreCommitLanded] {"User":{"TipSet":"AXGg5AIg5Xza7o+dkPsJNXwbvx8fUgkoLQNA0BAlCno7IWzS/i4BcaDkAiDz9VNI0k+dhdsCjHMzFIpGDj4Ha5wpNZIb4ZtRCtDbyw=="}}
18. 2020-07-08 12:57:18 +0800 CST: [event;sealing.SectorSeedReady] {"User":{"SeedValue":"+/SbGBkvlXtA88alJ6dD7vEeXU+9pZsUrjXFKaaEDv0=","SeedEpoch":65051}}
19. 2020-07-08 14:08:37 +0800 CST: [event;sealing.SectorCommitFailed] {"User":{}}
If you run paramfetch, it will check your params and keys and download new ones if needed. I think it will also log what it does, so you should be able to tell whether any bad files were detected. If you want to be extra sure, you could copy the current files and compare them later. Or just record digests of your current files. Or check them into a git repo to accomplish both…
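One way to "record digests of your current files", sketched in Python. This is only an illustration, not part of paramfetch or Lotus: the function names and the manifest file are hypothetical, and the only detail taken from the logs above is that the files end in .params and .vk and live in a directory such as /var/tmp/filecoin-proof-parameters.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(param_dir: Path) -> Dict[str, str]:
    """Map each .params / .vk file name in param_dir to its SHA-256 digest."""
    return {
        entry.name: sha256_of(entry)
        for entry in sorted(param_dir.iterdir())
        if entry.is_file() and entry.suffix in {".params", ".vk"}
    }


def save_snapshot(param_dir: Path, manifest: Path) -> None:
    """Write the digests as JSON so a later run can be compared against them."""
    manifest.write_text(json.dumps(snapshot(param_dir), indent=2, sort_keys=True))
```

Calling save_snapshot(Path("/var/tmp/filecoin-proof-parameters"), Path("params-digests.json")) right after a clean paramfetch run would give a baseline; the manifest name is an arbitrary choice. Hashing the large .params files is slow, so expect this to take a while.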
I may know the cause of the error; I have a few suspects. Let's track it down by looking at what is unusual in both of our setups. @hsienfu
Did you resolve it? @moonlight233
I have a few suspects. Please describe your setup the same way as above so that I can find the cause.
Event Log:
Caused by: encoding has unexpected information
2020-07-10T05:19:06.966 INFO bellperson::gpu::locks > GPU is available for FFT!
2020-07-10T05:19:07.071 INFO bellperson::gpu::fft > FFT: 1 working device(s) selected.
2020-07-10T05:19:07.071 INFO bellperson::gpu::fft > FFT: Device 0: GeForce RTX 2080 Ti
2020-07-10T05:19:07.071 INFO bellperson::domain > GPU FFT kernel instantiated!
It retries after this step every time.
hsienfu and I have ruled out a hardware problem by comparing our machines. I also ran the software a dozen times in different ways, and every run reported the same error. This should be a bug on your side. @porcuquine
I don't think you can eliminate hardware problems by comparing your hardware, but we can keep investigating the software.
The next step if we are to make progress would be for you to isolate exactly where in the process the corruption occurs. It would also be useful to discover whether it is deterministic. That is, do you end up with the same corrupted contents whenever this happens, or do they vary?
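To answer the "is it deterministic?" question, one option is to keep a copy of the corrupted file each time it happens and compare it against a known-good copy. The Python sketch below is an assumption about how that comparison could be done, not an existing tool; the function name and the 4096-byte chunk size are arbitrary.

```python
from pathlib import Path
from typing import List


def differing_chunks(good: Path, suspect: Path, chunk_size: int = 4096) -> List[int]:
    """Return the starting offset of every chunk where the two files differ.

    A chunk that exists in only one file (size mismatch) counts as differing.
    """
    offsets = []
    offset = 0
    with good.open("rb") as a, suspect.open("rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if not chunk_a and not chunk_b:
                break  # both files exhausted
            if chunk_a != chunk_b:
                offsets.append(offset)
            offset += chunk_size
    return offsets
```

If two separate failures give the same list of offsets (and the same bytes at those offsets), the corruption is likely deterministic; if the offsets move around, it is more likely random.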
Every error or retry happens after this step. Does that tell us where in the process the corruption occurs?
2020-07-10T05:19:06.966 INFO bellperson::gpu::locks > GPU is available for FFT!
2020-07-10T05:19:07.071 INFO bellperson::gpu::fft > FFT: 1 working device(s) selected.
2020-07-10T05:19:07.071 INFO bellperson::gpu::fft > FFT: Device 0: GeForce RTX 2080 Ti
2020-07-10T05:19:07.071 INFO bellperson::domain > GPU FFT kernel instantiated!
@porcuquine
I thought I had written it here, but I guess it was in Slack. Here is what I wrote there:
I guess the next logical step is to determine exactly when this corruption happens. One way I can think of: write a script to continually hash the files and check against the saved (correct) value. If a change is ever detected, log it. Then by inspecting the logs and comparing times, you can hopefully figure out exactly which step causes the corruption. This may still not be fine-grained enough, but it will give you a lot more information than you have now.
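The "continually hash the files" idea could look roughly like this in Python. It is a sketch under assumptions: the baseline is a plain name-to-SHA-256 mapping recorded while the files were known to be good, and changed_files and watch are hypothetical names, not part of Lotus or rust-fil-proofs.

```python
import hashlib
import logging
import time
from pathlib import Path
from typing import Dict, List, Optional


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(param_dir: Path, baseline: Dict[str, str]) -> List[str]:
    """Return baseline file names that are missing or whose digest no longer matches."""
    changed = []
    for name, expected in sorted(baseline.items()):
        path = param_dir / name
        if not path.is_file() or sha256_of(path) != expected:
            changed.append(name)
    return changed


def watch(param_dir: Path, baseline: Dict[str, str],
          interval_s: float = 60.0, rounds: Optional[int] = None) -> None:
    """Re-hash the files every interval_s seconds and log each mismatch with a timestamp.

    rounds=None loops until interrupted; a number limits the passes.
    """
    logging.basicConfig(format="%(asctime)s %(levelname)s %(message)s",
                        level=logging.INFO)
    done = 0
    while rounds is None or done < rounds:
        for name in changed_files(param_dir, baseline):
            logging.warning("digest mismatch: %s", name)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_s)
```

The timestamps on the "digest mismatch" lines can then be lined up against the miner log. Since one pass over the large .params files takes a while, the resolution is limited by the hashing time, so watching only the .vk files may give finer timing.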
I found two places that might cause errors:
storage-fsm@v0.0.0-20200625160832-379a4655b044/states_sealing.go:203 precommit message landed on chain: 0
2020-07-10T03:56:15.481+0800 WARN sectors storage-fsm@v0.0.0-20200625160832-379a4655b044/states_sealing.go:236 revert in interactive commit sector step
2020-07-10T03:56:15.962+0800 WARN sectors storage-fsm@v0.0.0-20200625160832-379a4655b044/states_sealing.go:236 revert in interactive commit sector step
2020-07-10T03:56:18.743+0800 WARN sectors storage-fsm@v0.0.0-20200625160832-379a4655b044/states_sealing.go:236 revert in interactive commit sector step
2020-07-10T03:56:40.180+0800 WARN sectors storage-fsm@v0.0.0-20200625160832-379a4655b044/states_sealing.go:236 revert in interactive commit sector step
2020-07-10T03:56:40.310+0800 WARN sectors storage-fsm@v0.0.0-20200625160832-379a4655b044/states_sealing.go:236 revert in interactive commit sector step
2020-07-10T03:58:45.702+0800 INFO sectors storage-fsm@v0.0.0-20200625160832-379a4655b044/states_sealing.go:248 scheduling seal proof computation...
2020-07-10T05:23:05.004 INFO bellperson::multiexp > GPU Multiexp kernel instantiated!
2020-07-10T05:25:25.374+0800 INFO rpc go-jsonrpc@v0.1.1-0.20200602181149-522144ab4e24/client.go:204 rpc output message buffer {"n": 2}
2020-07-10T05:25:25.375+0800 INFO rpc go-jsonrpc@v0.1.1-0.20200602181149-522144ab4e24/client.go:204 rpc output message buffer {"n": 2}
thread 'RUST_BACKTRACE=1
environment variable to display a backtrace # Here!!!!!!!!!!!
2020-07-10T05:31:38.788+0800 INFO sectors storage-fsm@v0.0.0-20200625160832-379a4655b044/states_failed.go:19 ComputeProofFailed(0), waiting 59.211471427s before retrying
2020-07-10T05:32:18.126+0800 INFO dht/RtRefreshManager rtrefresh/rt_refresh_manager.go:265 starting refreshing cpl 0 with key ~� (routing table size was 0)
Event Log:
Caused by: encoding has unexpected information
@porcuquine
I found the message "the element is not part of an r-order subgroup" at https://github.com/zkcrypto/group/blob/master/src/lib.rs:171. Is it invoked from rust-ffi-proofs?
2020-07-13T09:15:30.982 INFO filecoin_proofs::api > generate_piece_commitment:start
2020-07-13T09:15:31.041 INFO filecoin_proofs::api > generate_piece_commitment:finish
2020-07-13T09:15:31.046 INFO filecoin_proofs::api > generate_piece_commitment:start
2020-07-13T09:15:31.094+0800 INFO miner miner/miner.go:304 Time delta between now and our mining base: 6s (nulls: 0)
2020-07-13T09:15:31.107 INFO filecoin_proofs::api > generate_piece_commitment:finish
2020-07-13T09:15:31.114 INFO filcrypto::proofs::api > generate_data_commitment: start
2020-07-13T09:15:31.114 INFO filecoin_proofs::api::seal > compute_comm_d:start
2020-07-13T09:15:31.114 INFO filecoin_proofs::pieces > verifying 8192 pieces
2020-07-13T09:15:31.115 INFO filecoin_proofs::api::seal > compute_comm_d:finish
2020-07-13T09:15:31.115 INFO filcrypto::proofs::api > generate_data_commitment: finish
2020-07-13T09:15:31.115+0800 INFO sectors storage-fsm@v0.0.0-20200707194229-bc5e298e2b4c/sealing.go:240 Creating CC sector 2
2020-07-13T09:15:33.376+0800 INFO sectors storage-fsm@v0.0.0-20200707194229-bc5e298e2b4c/states_sealing.go:21 performing filling up rest of the sector... {"sector": "2"}
2020-07-13T09:15:33.402+0800 ERROR sectors storage-fsm@v0.0.0-20200707194229-bc5e298e2b4c/fsm.go:26 unhandled sector error (2): checkPieces sanity check error:
    github.com/filecoin-project/storage-fsm.(*Sealing).handlePreCommit1
        /home/filtech/go/pkg/mod/github.com/filecoin-project/storage-fsm@v0.0.0-20200707194229-bc5e298e2b4c/states_sealing.go:92
env LOTUS_STORAGE_PATH=/media/filtech/DDD/lotusstorage lotus-storage-miner info
Miner: t01152
Sector Size: 32 GiB
Byte Power: 32 GiB / 2.952 TiB (1.0587%)
Actual Power: 32 Gi / 3.02 Ti (1.0360%)
    Committed: 32 GiB
    Proving: 32 GiB
Expected block win rate: 179.0208/day (every 8m2s)

Miner Balance: 15006.106409068606536657
    PreCommit: 7331.75321792595719542
    Locked: 7454.399919416314163392
    Available: 219.953271726335177845
Worker Balance: 10239.252589836194784327
Market (Escrow): 0
Market (Locked): 0

Sectors:
    Total: 3
    Proving: 1
    PreCommit1: 1
    FailedUnrecoverable: 1
On butterfly, the first sector sealed successfully, but the second and third sectors failed. The miner was not stopped in between, and no other parameters were added. Why is this happening? @porcuquine
32GB:
2020-07-07T00:43:11.983 INFO filecoin_proofs::caches > found params in memory cache for STACKED[34359738368]-verifying-key
2020-07-07T00:43:11.983 INFO filecoin_proofs::api::seal > got verifying key (34359738368) while verifying seal
2020-07-07T00:43:11.984 INFO filcrypto::proofs::api > verify_seal: finish
2020-07-07T00:43:11.985+0800 ERROR sectors storage-fsm@v0.0.0-20200625160832-379a4655b044/fsm.go:26 unhandled sector error (0): checkCommit sanity check error:
    github.com/filecoin-project/storage-fsm.(*Sealing).handleCommitFailed
        /home/filtech/go/pkg/mod/github.com/filecoin-project/storage-fsm@v0.0.0-20200625160832-379a4655b044/states_failed.go:184
  - verify seal:
    github.com/filecoin-project/storage-fsm.(*Sealing).checkCommit
        /home/filtech/go/pkg/mod/github.com/filecoin-project/storage-fsm@v0.0.0-20200625160832-379a4655b044/checks.go:158
  - failed to fill whole buffer

512MB:
2020-07-15T08:08:08.490 INFO filecoin_proofs::api::seal > snark_proof:finish
2020-07-15T08:08:08.490 INFO filecoin_proofs::api::seal > verify_seal:start
2020-07-15T08:08:08.490 INFO filecoin_proofs::caches > trying parameters memory cache for: STACKED[536870912]-verifying-key
2020-07-15T08:08:08.490 INFO filecoin_proofs::caches > no params in memory cache for STACKED[536870912]-verifying-key
2020-07-15T08:08:08.490 INFO storage_proofs_core::parameter_cache > parameter set identifier for cache: layered_drgporep::PublicParams{ graph: stacked_graph::StackedGraph{expansion_degree: 8 base_graph: drgraph::BucketGraph{size: 16777216; degree: 6; hasher: poseidon_hasher} }, challenges: LayerChallenges { layers: 2, max_count: 2 }, tree: merkletree-poseidon_hasher-8-0-0 }
2020-07-15T08:08:08.490 INFO storage_proofs_core::parameter_cache > ensuring that all ancestor directories for: "/media/filtech/CCC/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-6babf46ce344ae495d558e7770a585b2382d54f225af8ed0397b8be7c3fcd472.vk" exist
2020-07-15T08:08:08.490 INFO storage_proofs_core::parameter_cache > checking cache_path: "/media/filtech/CCC/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-6babf46ce344ae495d558e7770a585b2382d54f225af8ed0397b8be7c3fcd472.vk" for verifying key
2020-07-15T08:08:08.505 INFO storage_proofs_core::parameter_cache > read verifying key from cache "/media/filtech/CCC/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-6babf46ce344ae495d558e7770a585b2382d54f225af8ed0397b8be7c3fcd472.vk"
2020-07-15T08:08:08.505 INFO filecoin_proofs::api::seal > got verifying key (536870912) while verifying seal
2020-07-15T08:08:08.517 INFO filecoin_proofs::api::seal > verify_seal:finish
2020-07-15T08:08:08.517 INFO filecoin_proofs::api::seal > seal_commit_phase2:finish
2020-07-15T08:08:08.517 INFO filcrypto::proofs::api > seal_commit_phase2: finish
2020-07-15T08:08:08.572 INFO filcrypto::proofs::api > verify_seal: start
2020-07-15T08:08:08.572 INFO filecoin_proofs::api::seal > verify_seal:start
2020-07-15T08:08:08.572 INFO filecoin_proofs::caches > trying parameters memory cache for: STACKED[536870912]-verifying-key
2020-07-15T08:08:08.572 INFO filecoin_proofs::caches > found params in memory cache for STACKED[536870912]-verifying-key
2020-07-15T08:08:08.572 INFO filecoin_proofs::api::seal > got verifying key (536870912) while verifying seal
2020-07-15T08:08:08.587 INFO filecoin_proofs::api::seal > verify_seal:finish
2020-07-15T08:08:08.587 INFO filcrypto::proofs::api > verify_seal: finish
I found the message "the element is not part of an r-order subgroup" at https://github.com/zkcrypto/group/blob/master/src/lib.rs:171. Is it invoked from rust-ffi-proofs?
Yes, that looks like the source.
512MB succeeds but 32GB fails. Can you help me find the cause by comparing the two?
Four people in China have already run into the same problem, some of them on Intel machines. We have formed a group and have been studying how to solve it. Please help us. @porcuquine
On butterfly, the first sector sealed successfully, but the second and third sectors failed. The miner was not stopped in between, and no other parameters were added. Why is this happening? @porcuquine
I don't understand what you mean by 'butterfly' here.
Butterfly is the ntwk-butterfly branch. https://stats.butterfly.fildev.network/d/z6FtI92Zz/chain?orgId=1&refresh=25s&from=now-30m&to=now&kiosk
32GB:
2020-07-07T00:43:11.983 INFO filecoin_proofs::caches > found params in memory cache for STACKED[34359738368]-verifying-key
2020-07-07T00:43:11.983 INFO filecoin_proofs::api::seal > got verifying key (34359738368) while verifying seal
2020-07-07T00:43:11.984 INFO filcrypto::proofs::api > verify_seal: finish
2020-07-07T00:43:11.985+0800 ERROR sectors storage-fsm@v0.0.0-20200625160832-379a4655b044/fsm.go:26 unhandled sector error (0): checkCommit sanity check error:
    github.com/filecoin-project/storage-fsm.(*Sealing).handleCommitFailed
        /home/filtech/go/pkg/mod/github.com/filecoin-project/storage-fsm@v0.0.0-20200625160832-379a4655b044/states_failed.go:184
  - verify seal:
    github.com/filecoin-project/storage-fsm.(*Sealing).checkCommit
        /home/filtech/go/pkg/mod/github.com/filecoin-project/storage-fsm@v0.0.0-20200625160832-379a4655b044/checks.go:158
  - failed to fill whole buffer
Well, this seems to be the same corruption as before.
I need you to:
I tested 32GB more than 20 times, on both master and butterfly. Only one attempt succeeded. After that one sector succeeded, the following sectors failed; the miner was not stopped during that time, and the environment did not change.
As the logs above show, both P1 and P2 succeed, and the failure is in C2 or verify. We both use a GPU, so the GPU may be the cause. This is as far as we can narrow it down with our equipment. You know this better than we do, so please help us find the cause based on this information.
After each failure, pledging another sector immediately reports an error. The only way to recover is to clear everything and start again with a new miner, and each run takes several hours to complete. Since June 18 (6.18) I have been testing for more than 12 hours every day.
The 32GB run is missing these log lines:
2020-07-15T08:08:08.490 INFO filecoin_proofs::caches > trying parameters memory cache for: STACKED[536870912]-verifying-key
2020-07-15T08:08:08.490 INFO filecoin_proofs::caches > no params in memory cache for STACKED[536870912]-verifying-key
2020-07-15T08:08:08.490 INFO storage_proofs_core::parameter_cache > parameter set identifier for cache: layered_drgporep::PublicParams{ graph: stacked_graph::StackedGraph{expansion_degree: 8 base_graph: drgraph::BucketGraph{size: 16777216; degree: 6; hasher: poseidon_hasher} }, challenges: LayerChallenges { layers: 2, max_count: 2 }, tree: merkletree-poseidon_hasher-8-0-0 }
2020-07-15T08:08:08.490 INFO storage_proofs_core::parameter_cache > ensuring that all ancestor directories for: "/media/filtech/CCC/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-6babf46ce344ae495d558e7770a585b2382d54f225af8ed0397b8be7c3fcd472.vk" exist
2020-07-15T08:08:08.490 INFO storage_proofs_core::parameter_cache > checking cache_path: "/media/filtech/CCC/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-6babf46ce344ae495d558e7770a585b2382d54f225af8ed0397b8be7c3fcd472.vk" for verifying key
2020-07-15T08:08:08.505 INFO storage_proofs_core::parameter_cache > read verifying key from cache "/media/filtech/CCC/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-6babf46ce344ae495d558e7770a585b2382d54f225af8ed0397b8be7c3fcd472.vk"
@porcuquine
As the logs above show, both P1 and P2 succeed, and the failure is in C2 or verify. We both use a GPU, so the GPU may be the cause. This is as far as we can narrow it down with our equipment. You know this better than we do, so please help us find the cause based on this information.
Okay, please try without GPU but with correct parameters fetched. Let’s see if we can rule GPU in or out as the problem.
2020-07-15T19:05:25.735 INFO bellperson::gpu::locks > GPU is available for FFT!
2020-07-15T19:05:25.843 INFO bellperson::gpu::fft > FFT: 1 working device(s) selected.
2020-07-15T19:05:25.843 INFO bellperson::gpu::fft > FFT: Device 0: GeForce RTX 2080 Ti
2020-07-15T19:05:25.843 INFO bellperson::domain > GPU FFT kernel instantiated!
thread '
hsienfu is testing without the GPU, and I tested with the GPU again. This time I got a new clue: a slice error. What caused the slice error? @porcuquine
Run ./bench sealing --sector-size 32GiB --no-gpu --storage-dir path/to/.lotus-bench
2020-07-15T10:33:52.203+0800 INFO lotus-bench lotus-bench/main.go:75 Starting lotus-bench
2020-07-15T10:33:52.204+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-8-2-sha256_hasher-96f1b4a04c5c51e4759bbf224bbc2ef5a42c7100f16ec0637123f16a845ddfb2.vk is ok
2020-07-15T10:33:52.205+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-8-2-b62098629d07946e9028127e70295ed996fe3ed25b0f9f88eb610a0ab4385a3c.vk is ok
2020-07-15T10:33:52.205+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-6babf46ce344ae495d558e7770a585b2382d54f225af8ed0397b8be7c3fcd472.vk is ok
2020-07-15T10:33:52.206+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-0-0-50c7368dea9593ed0989e70974d28024efa9d156d585b7eea1be22b2e753f331.vk is ok
2020-07-15T10:33:52.207+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-0-0-0170db1f394b35d995252228ee359194b13199d259380541dc529fb0099096b0.vk is ok
2020-07-15T10:33:52.206+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-0-0-7d739b8cf60f1b0709eeebee7730e297683552e4b69cab6984ec0285663c5781.vk is ok
2020-07-15T10:33:52.207+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-0-0-5294475db5237a2e83c3e52fd6c2b03859a1831d45ed08c4f35dbf9a803165a9.vk is ok
2020-07-15T10:33:52.206+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-0-0-0cfb4f178bbb71cf2ecfcd42accce558b27199ab4fb59cb78f2483fe21ef36d9.vk is ok
2020-07-15T10:33:52.206+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-0-0-3ea05428c9d11689f23529cde32fd30aabd50f7d2c93657c1d3650bca3e8ea9e.vk is ok
2020-07-15T10:33:52.207+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-8-0-559e581f022bb4e4ec6e719e563bf0e026ad6de42e56c18714a2c692b1b88d7e.vk is ok
2020-07-15T10:33:52.207+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-8-0-sha256_hasher-82a357d2f2ca81dc61bb45f4a762807aedee1b0a53fd6c4e77b46a01bfef7820.vk is ok
2020-07-15T10:33:52.206+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f.vk is ok
2020-07-15T10:33:52.207+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-ecd683648512ab1765faa2a5f14bab48f676e633467f0aa8aad4b55dcb0652bb.vk is ok
2020-07-15T10:33:52.211+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-8-2-2627e4006b67f99cef990c0a47d5426cb7ab0a0ad58fc1061547bf2d28b09def.vk is ok
2020-07-15T10:33:52.212+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-8-0-0377ded656c6f524f1618760bffe4e0a1c51d5a70c4509eedae8a27555733edc.vk is ok
2020-07-15T10:33:52.432+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-8-0-559e581f022bb4e4ec6e719e563bf0e026ad6de42e56c18714a2c692b1b88d7e.params is ok
2020-07-15T10:34:44.309+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-stacked-proof-of-replication-merkletree-poseidon_hasher-8-8-0-sha256_hasher-82a357d2f2ca81dc61bb45f4a762807aedee1b0a53fd6c4e77b46a01bfef7820.params is ok
2020-07-15T10:34:58.735+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:138 Parameter file /var/tmp/filecoin-proof-parameters/v27-proof-of-spacetime-fallback-merkletree-poseidon_hasher-8-8-0-0377ded656c6f524f1618760bffe4e0a1c51d5a70c4509eedae8a27555733edc.params is ok
2020-07-15T10:34:58.735+0800 INFO build go-paramfetch@v0.0.2-0.20200701152213-3e0f0afdc261/paramfetch.go:162 parameter and key-fetching complete
2020-07-15T10:34:58.735+0800 INFO lotus-bench lotus-bench/main.go:484 [1] Writing piece into sector...
2020-07-15T10:34:58.984 INFO filecoin_proofs::api > generate_piece_commitment:start
2020-07-15T10:34:59.171 INFO filecoin_proofs::api > generate_piece_commitment:finish
2020-07-15T10:34:59.185 INFO filecoin_proofs::api > generate_piece_commitment:start
....
2020-07-15T20:27:37.607 INFO bellperson::gpu::locks > GPU is available for Multiexp!
2020-07-15T20:27:37.608 WARN bellperson::multiexp > Cannot instantiate GPU Multiexp kernel! Error: GPUError: No working GPUs found!
2020-07-15T20:28:17.495 INFO filcrypto::proofs::api > seal_commit_phase2: finish
2020-07-15T20:28:30.925+0800 WARN lotus-bench lotus-bench/main.go:91 failed to run seals:
main.glob..func3
/usr/local/services/lotus/cmd/lotus-bench/main.go:249
- coordinate(s) do not lie on the curve
github.com/filecoin-project/filecoin-ffi.SealCommitPhase2
/usr/local/services/lotus/extern/filecoin-ffi/proofs.go:382
github.com/filecoin-project/sector-storage/ffiwrapper.(*Sealer).SealCommit2
/root/go/pkg/mod/github.com/filecoin-project/sector-storage@v0.0.0-20200630180318-4c1968f62a8f/ffiwrapper/sealer_cgo.go:500
@porcuquine
hsienfu is testing without a GPU, and I tested with the GPU again. This time I got a new clue: a slice error. What caused the slice error? @porcuquine
These tests seem to show the problem is not GPU-related (assuming you had fresh, uncorrupted parameters/keys before you started).
The slice error seems to be the same kind of thing you've been seeing before: trying to read data that's not long enough. The simplest explanation (not saying it is this) would be that your parameter files have been truncated. I am still waiting for more detail about the timing and nature of the corruption you are experiencing. Without that, I don't think I'll be able to make useful guesses.
I wrote the things I think you need to do and collect a few messages up.
On further consideration, the slice error looks weird. These logs give very little information about what's actually happening, though.
My best current guess is that something weird may be happening with the mmapping of groth parameters. I talked to @cryptonemo about it briefly and am hoping he can investigate a little. Maybe he will have some ideas or discover something.
I still think the information I've requested will be helpful. It's pretty likely that without more information about the details of failure (exactly when, in what way, with what consistency, and which files are corrupted), we won't be able to narrow this down enough.
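One way to gather the timing information requested above is to record the size and modification time of every file in the parameter directory before and after each sealing run, then compare the two records. The following is a minimal Python sketch, not part of lotus or rust-fil-proofs. The function names `snapshot` and `diff_snapshots` are hypothetical, and the directory to pass in is assumed to be the /var/tmp/filecoin-proof-parameters path shown in the logs.

```python
from pathlib import Path


def snapshot(param_dir):
    """Map each file name in param_dir to (size in bytes, mtime in ns)."""
    result = {}
    for entry in sorted(Path(param_dir).iterdir()):
        if entry.is_file():
            st = entry.stat()
            result[entry.name] = (st.st_size, st.st_mtime_ns)
    return result


def diff_snapshots(before, after):
    """Describe files that vanished, changed size, were rewritten, or appeared."""
    changes = []
    for name, (old_size, old_mtime) in before.items():
        if name not in after:
            changes.append(f"{name}: missing")
            continue
        new_size, new_mtime = after[name]
        if new_size != old_size:
            # A smaller size is what a truncated parameter file would look like.
            changes.append(f"{name}: size {old_size} -> {new_size}")
        elif new_mtime != old_mtime:
            changes.append(f"{name}: rewritten (same size)")
    for name in sorted(after.keys() - before.keys()):
        changes.append(f"{name}: new file")
    return changes
```

Calling `snapshot` on the parameter directory before a run and again after a failure, then passing both results to `diff_snapshots`, would show whether any .params or .vk file shrank or was rewritten in between. An empty result would suggest the files on disk were not truncated during that run, which would point away from on-disk truncation as the cause.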
- All of these errors occur on single machines.
- We have machines with Intel CPUs and machines with AMD CPUs.
- The error has nothing to do with the GPU.
- On Intel, this kind of error is reported only with the CPU, and only rarely; AMD, by contrast, almost always reports an error.
- The proof failure has nothing to do with the Ubuntu version.
- 512MB sectors all succeed; 32GB and 64GB sectors almost always fail.
- I tried more than 30 times on AMD and sealed a 32GB sector successfully only once. The subsequent sectors failed, even though the miner had not been stopped and the environment parameters had not changed.
@porcuquine
This is what I'm looking for: https://github.com/filecoin-project/rust-fil-proofs/issues/1185#issuecomment-658482719
- Where can I find the file that becomes corrupted, so I can hash it? Is it in unseal?
You told me before that either your groth params or your verifying key is becoming corrupted and has to be fetched again. That is the file I want you to hash in order to find out exactly when this is happening. Use a CLI digest program like md5sum or sha1sum, etc.
The V27 file does not appear to be damaged, because it can be used next time, and no mismatch appears
If none of the files are corrupted, then this line of inquiry is wrong, but I'm pretty sure that's not what we concluded previously (either here or on Slack, I can't keep track).
Please confirm that the verifying key is never corrupted. If it is then please try to discover exactly when and how.
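For anyone who prefers to script the hashing rather than run md5sum or sha1sum by hand, here is a rough Python equivalent. It is only a sketch: `file_digest` and `digest_record` are hypothetical helper names, and the choice of SHA-1 as the default simply mirrors the sha1sum suggestion above. The file is read in chunks because the groth parameter files are large.

```python
import hashlib
from datetime import datetime, timezone


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Hex digest of a file, read in chunks so large files fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def digest_record(path, algorithm="sha1"):
    """One log line: UTC timestamp, algorithm, digest, and path."""
    stamp = datetime.now(timezone.utc).isoformat()
    return f"{stamp} {algorithm} {file_digest(path, algorithm)} {path}"
```

Appending the output of `digest_record` for the .vk and .params files to a text file before sealing starts, after each failure, and after each re-fetch would give a timeline showing whether, and when, a digest changes.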
- Where can I find the file that becomes corrupted, so I can hash it? Is it in unseal?
You told me before that either your groth params or your verifying key is becoming corrupted and has to be fetched again. That is the file I want you to hash in order to find out exactly when this is happening. Use a CLI digest program like md5sum or sha1sum, etc.
Initially, a parameter mismatch was reported, but in our later tests the mismatch no longer appears.
The V27 file does not appear to be damaged, because it can be used next time, and no mismatch appears
If none of the files are corrupted, then this line of inquiry is wrong, but I'm pretty sure that's not what we concluded previously (either here or on Slack, I can't keep track).
Please confirm that the verifying key is never corrupted. If it is then please try to discover exactly when and how.
Later, the vk was never damaged again, and it can be reused by the new miner.
On every single machine, this warning starts appearing repeatedly after the lotus daemon is started, and it interrupts the download of the proof parameters. Although the download eventually succeeds, we suspect that the parameters are cut off. Our reasoning is this: in the commit stage, the zero-knowledge proof is generated from the V27 proof file after it has been hashed, and is then verified, and the zero-knowledge proof only reads a piece of that file. The one time we succeeded was exactly the time we got the complete, uninterrupted file. In most cases we get the interrupted file, which would explain the failures. This is our guess; please give us guidance.
Or is the core error actually an error in generating the zero-knowledge proof, or an error in verifying it?
I'm sorry. I am having a very hard time understanding your sentences. This is probably a language issue, and I know it's not your fault.
What file is 'interrupted'?
If you do not believe your groth params or verifying key are corrupted, then you shouldn't need to download anything.
If something is being corrupted, that is what I am trying to understand more about.
If this is happening and you are able to get the correct files once, you should just make a local copy so you don't have to keep downloading them again. This is especially true if you think the download process may be failing in some way and causing a problem. Of course you need to make sure the 'good copy' you have really is good before relying on it repeatedly though.
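The "keep a local good copy" idea can be sketched in Python as below. This is an illustration under assumptions, not an existing lotus feature: `backup_verified` and `matches_backup` are hypothetical names, and SHA-256 is an arbitrary choice of digest. Note that the check only proves the backup is byte-identical to the source; whether the source itself is good has to be established separately, for example right after paramfetch reports the file "is ok".

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def backup_verified(src, backup_dir):
    """Copy src into backup_dir and confirm the copy is byte-identical."""
    src = Path(src)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    dest = backup_dir / src.name
    shutil.copy2(src, dest)
    if sha256_of(src) != sha256_of(dest):
        dest.unlink()  # do not keep a copy that differs from its source
        raise OSError(f"copy of {src} does not match the original")
    return dest


def matches_backup(live, backup):
    """True if the live file still has the same contents as the backup."""
    return sha256_of(live) == sha256_of(backup)
```

If `matches_backup` later returns False for a live parameter file, that file changed after the backup was taken, and the backup can be copied back instead of downloading again.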
I'm sorry. I am having a very hard time understanding your sentences. This is probably a language issue, and I know it's not your fault.
What file is 'interrupted'?
If you do not believe your groth params or verifying key are corrupted, then you shouldn't need to download anything.
If something is being corrupted, that is what I am trying to understand more about.
If this is happening and you are able to get the correct files once, you should just make a local copy so you don't have to keep downloading them again. This is especially true if you think the download process may be failing in some way and causing a problem. Of course you need to make sure the 'good copy' you have really is good before relying on it repeatedly though.
When the miner started, the V27 download was interrupted by WARN dht/RtRefreshManager rtrefresh/rt_refresh_manager.go: 191 failed when refreshing routing tab, and then it resumed automatically.
Describe the problem
computing seal proof failed(2): the element is not part of an r-order subgroup
Sectors status
The output of ./lotus-storage-miner sectors status --log <sectorId> for the failed sector(s).
Version
The output of ./lotus --version.