filecoin-project / rust-fil-proofs

Proofs for Filecoin in Rust

[bug] paramcache OOMs #480

Closed jbenet closed 5 years ago

jbenet commented 5 years ago

Description

OOM error

> go run ./build/*.go deps
...
Feb 14 23:42:00.515 INFO Actually generating groth params., target: params, place: storage-proofs/src/parameter_cache.rs:70 storage_proofs::parameter_cache, root: storage-proofs
memory allocation of 544244544 bytes failedCommand './proofs/bin/paramcache' failed: signal: aborted (core dumped)
exit status 1

arch

> uname -a
Linux erebor 4.4.0-36-generic #55-Ubuntu SMP Thu Aug 11 18:01:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Acceptance criteria

Two approaches, likely want both in time.

Risks + pitfalls

Where to begin

porcuquine commented 5 years ago
  • (2) another approach is to not have tools like go-filecoin depend on running this, and to instead generate params for the devnets and deliver those params as a static release (over ipfs, github, or gx) that go-filecoin can just download.

This is the solution being developed for #344. This must be the eventual solution in any case, because everyone will need to use securely-generated parameters.

porcuquine commented 5 years ago

I believe a solution to #344 is now deployed and in use from go-filecoin. @sidke Can you confirm?

@jbenet Can you close this issue if the parameter distribution solution (2) is sufficient? We inherit the root problem from bellman, and parameter generation on go-filecoin nodes is not part of the long-term plan and will be phased out.

jbenet commented 5 years ago

Hey @porcuquine -- I tried updating my install, it looks to be downloading params, but then it still tries to generate some params ("Actually generating groth params")

> git rev-parse HEAD
aee9d1a9952b0f6611dbd945ebaaf03463ecfb10
> FILECOIN_USE_PRECOMPILED_RUST_PROOFS=true go run ./build/*.go deps
...
go get -u github.com/json-iterator/go
go get -u github.com/prometheus/client_golang/prometheus
go get -u github.com/prometheus/client_golang/prometheus/promhttp
go get -u github.com/jstemmer/go-junit-report
go get -u github.com/pmezard/go-difflib/difflib
./scripts/install-rust-fil-proofs.sh
using precompiled rust-fil-proofs
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  5741  100  5741    0     0  15163      0 --:--:-- --:--:-- --:--:-- 15147
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
./scripts/install-bls-signatures.sh
using local bls-signatures
~/go/src/github.com/filecoin-project/go-filecoin/bls-signatures/bls-signatures ~/go/src/github.com/filecoin-project/go-filecoin
cargo 1.32.0-nightly (5e85ba14a 2018-12-02)
    Updating crates.io index
    Updating git repository `https://github.com/dignifiedquire/pairing`
    Finished release [optimized] target(s) in 0.23s
~/go/src/github.com/filecoin-project/go-filecoin
./proofs/bin/paramfetch fetch --all --json=./proofs/misc/parameters.json
fetching parameters
fetching 'v9-zigzag-proof-of-replication-f8b6b5b4f1015da3984944b4aef229b63ce950f65c7f41055a995718a452204d'...
parameter file '/tmp/filecoin-proof-parameters/v9-zigzag-proof-of-replication-f8b6b5b4f1015da3984944b4aef229b63ce950f65c7f41055a995718a452204d' already exists
ok
fetching 'v9-zigzag-proof-of-replication-52431242c129794fe51d373ae29953f2ff52abd94c78756e318ce45f3e4946d8'...
parameter file '/tmp/filecoin-proof-parameters/v9-zigzag-proof-of-replication-52431242c129794fe51d373ae29953f2ff52abd94c78756e318ce45f3e4946d8' already exists
ok
./proofs/bin/paramcache
Feb 25 18:01:02.711 INFO parameter set identifier for cache: layered_drgporep::PublicParams{ drg_porep_identifier: drgporep::PublicParams{graph: zigzag_graph::ZigZagGraph{expansion_degree: 8 base_graph: drgraph::BucketGraph{size: 32; degree: 5} }; sloth_iter: 0}, challenges: Tapered { layers: 4, count: 2, taper: 0.3333333333333333, taper_layers: 2 } }, target: params, place: storage-proofs/src/parameter_cache.rs:54 storage_proofs::parameter_cache, root: storage-proofs
Feb 25 18:01:02.711 INFO checking cache_path: "/tmp/filecoin-proof-parameters/v9-zigzag-proof-of-replication-f8b6b5b4f1015da3984944b4aef229b63ce950f65c7f41055a995718a452204d", target: params, place: storage-proofs/src/parameter_cache.rs:83 storage_proofs::parameter_cache, root: storage-proofs
Feb 25 18:01:02.712 INFO reading groth params from cache: "/tmp/filecoin-proof-parameters/v9-zigzag-proof-of-replication-f8b6b5b4f1015da3984944b4aef229b63ce950f65c7f41055a995718a452204d", target: params, place: storage-proofs/src/parameter_cache.rs:126 storage_proofs::parameter_cache, root: storage-proofs
Feb 25 18:01:10.287 INFO groth_parameter_bytes: 770902584, target: stats, place: storage-proofs/src/parameter_cache.rs:131 storage_proofs::parameter_cache, root: storage-proofs
Feb 25 18:01:10.316 INFO parameter set identifier for cache: layered_drgporep::PublicParams{ drg_porep_identifier: drgporep::PublicParams{graph: zigzag_graph::ZigZagGraph{expansion_degree: 8 base_graph: drgraph::BucketGraph{size: 8388608; degree: 5} }; sloth_iter: 0}, challenges: Tapered { layers: 4, count: 2, taper: 0.3333333333333333, taper_layers: 2 } }, target: params, place: storage-proofs/src/parameter_cache.rs:54 storage_proofs::parameter_cache, root: storage-proofs
Feb 25 18:01:10.316 INFO checking cache_path: "/tmp/filecoin-proof-parameters/v9-zigzag-proof-of-replication-52431242c129794fe51d373ae29953f2ff52abd94c78756e318ce45f3e4946d8", target: params, place: storage-proofs/src/parameter_cache.rs:83 storage_proofs::parameter_cache, root: storage-proofs
Feb 25 18:01:10.317 INFO reading groth params from cache: "/tmp/filecoin-proof-parameters/v9-zigzag-proof-of-replication-52431242c129794fe51d373ae29953f2ff52abd94c78756e318ce45f3e4946d8", target: params, place: storage-proofs/src/parameter_cache.rs:126 storage_proofs::parameter_cache, root: storage-proofs
Feb 25 18:01:10.317 INFO groth_parameter_bytes: 0, target: stats, place: storage-proofs/src/parameter_cache.rs:131 storage_proofs::parameter_cache, root: storage-proofs
Feb 25 18:01:10.317 INFO Actually generating groth params., target: params, place: storage-proofs/src/parameter_cache.rs:70 storage_proofs::parameter_cache, root: storage-proofs
memory allocation of 544244544 bytes failedCommand './proofs/bin/paramcache' failed: signal: aborted (core dumped)
exit status 1

Is this a different problem?

porcuquine commented 5 years ago

@jbenet It looks like you have a 0-byte groth parameter file, which probably resulted from interrupted groth parameter generation. This file is being detected by the paramfetch program, which appears to skip fetching because the file exists.

If I am right about this, I consider it a bug in paramfetch. We should be checking not only for the existence of the parameter files but also that they match the expected checksum (which a 0-byte file will not).
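
A minimal sketch of the check described above, not the actual paramfetch code. The names `CacheStatus`, `check_cached_file`, and `fnv1a_file` are hypothetical, and FNV-1a is only a stand-in so the example needs no external crates; a real tool would compare against whatever cryptographic digest the parameter manifest records.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::path::Path;

/// Result of inspecting one cached parameter file.
#[derive(Debug, PartialEq)]
enum CacheStatus {
    Missing,
    Empty,
    ChecksumMismatch,
    Valid,
}

/// Stand-in checksum (FNV-1a, 64-bit), streamed in fixed-size chunks so the
/// file is never held in memory all at once. NOT cryptographic; it is an
/// assumption made only to keep this sketch dependency-free.
fn fnv1a_file(path: &Path) -> io::Result<u64> {
    let mut file = File::open(path)?;
    let mut hash: u64 = 0xcbf29ce484222325; // FNV-1a 64-bit offset basis
    let mut buf = [0u8; 8192];
    loop {
        let n = file.read(&mut buf)?;
        if n == 0 {
            break; // end of file
        }
        for &byte in &buf[..n] {
            hash ^= u64::from(byte);
            hash = hash.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }
    Ok(hash)
}

/// Hypothetical helper: decide whether a cached file can be trusted.
/// Existence alone is not enough; a 0-byte or corrupted file is reported
/// so the caller can fetch the parameters again.
fn check_cached_file(path: &Path, expected: u64) -> io::Result<CacheStatus> {
    let meta = match fs::metadata(path) {
        Ok(meta) => meta,
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(CacheStatus::Missing),
        Err(e) => return Err(e),
    };
    if meta.len() == 0 {
        return Ok(CacheStatus::Empty);
    }
    if fnv1a_file(path)? != expected {
        return Ok(CacheStatus::ChecksumMismatch);
    }
    Ok(CacheStatus::Valid)
}
```

Under this sketch, a fetcher would skip the download only when the status is `Valid` and would treat every other status as a reason to fetch again.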

@sidke Can you confirm my interpretation of events or clarify what might actually be happening?

If my interpretation is correct, let's either file a new issue for the more specific problem or just create a bugfix PR addressing the underlying problem before closing this issue.

porcuquine commented 5 years ago

@jbenet The short-term solution for you would be to delete 0-byte groth parameter files, which should trigger actual fetching.
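
A small sketch of how the 0-byte files could be located before deleting them. The function name `zero_byte_files` is hypothetical; the only detail taken from the thread is the cache directory `/tmp/filecoin-proof-parameters` shown in the logs above.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: list regular files of length zero directly inside
/// `dir` (no recursion). The result is sorted so the output is stable.
fn zero_byte_files(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut empty = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let meta = entry.metadata()?;
        if meta.is_file() && meta.len() == 0 {
            empty.push(entry.path());
        }
    }
    empty.sort();
    Ok(empty)
}
```

Calling `zero_byte_files(Path::new("/tmp/filecoin-proof-parameters"))` and passing each returned path to `std::fs::remove_file` would clear the stale entries, after which the fetch step should no longer find them and skip the download.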

sidke commented 5 years ago

@porcuquine confirming your interpretation, going to add in validation for all local parameters after fetching

dignifiedquire commented 5 years ago

@sidke @porcuquine can we close this?