Public parameter caching for faster proving and verification

jpeg07 commented 1 year ago

@weissjeffm we'd like to have you work on parameter caching for Lurk. @samuelburnham should be able to help you orient, generally, and @porcuquine can add specific details were needed.

This should get integrated with zaps (and by extension the current website).

porcuquine commented 1 year ago

The short version is that public parameters are expensive to generate. For Groth16, 'real' parameters cannot actually be generated without trusted setup. We're also not concerned with Groth16 for now. Although the best general mechanism should probably also work for Groth16, let's focus on making this work for Nova.

The most obvious problem with regenerating the parameters on every use is that it makes verification potentially very slow, whereas it's important that verification be as fast as possible.

Ideally, we would have a one-time process of generating public parameters and caching them on disk. On startup, a program needing to verify would load these parameters from disk, then keep them cached for reuse across repeated verifications. I think @samuelburnham has a decent handle on what is required and how it should work, so please reach out to see if he can help orient you.

Once this is done, the http verifier provided by zaps (which will need to be updated to use Nova), should be fast. More importantly, we should then be able to allow zaps to efficiently accept only submissions with accompanying proofs. (cc: @differentialderek )

weissjeffm commented 1 year ago

Is the main savings not having to re-derive the parameters (I presume from a seed), or is it also about not having to even read them from disk more than once, and keeping them in memory? The latter requires a long-running process, something I would guess maybe zaps already does but I'm not sure. I've only seen the fcomm examples that only verify one proof at a time (afaict).

Is there any thoughts on how to persist the parameters? Would a simple database work, like maybe a sqlite db, or even simpler like a json file? If they're really expensive to generate maybe flinging them even farther afield would make sense - like maybe 1) check if in memory, if not 2) check a predefined local db file 3) if not check a central server, if not 4) derive it again.

porcuquine commented 1 year ago

Is the main savings not having to re-derive the parameters (I presume from a seed), or is it also about not having to even read them from disk more than once, and keeping them in memory?

Both.

The latter requires a long-running process, something I would guess maybe zaps already does but I'm not sure. I've only seen the fcomm examples that only verify one proof at a time (afaict).

Right. If the process is only verifying once, there won't be a benefit from caching in memory. If more than once (for example, a web server or other server process), then were is.

Is there any thoughts on how to persist the parameters?

I would start with just using a file and simple serde serialization. It probably makes sense to take advantage of learning we did when implementing Filecoin proofs, regarding how to choose the location for such caches: https://github.com/filecoin-project/rust-fil-proofs/blob/master/storage-proofs-core/src/settings.rs#L61-L69. You could use that for the 'caches' used more generally — though note that ultimately some of these (like commitment files, which save important private data) will eventually need to better choices.

Would a simple database work, like maybe a sqlite db, or even simpler like a json file?

Yes, as above, json could work.

If they're really expensive to generate maybe flinging them even farther afield would make sense - like maybe 1) check if in memory, if not 2) check a predefined local db file 3) if not check a central server, if not 4) derive it again.

I think in this case, it's fine to derive locally when needed — as long as we avoid doing so more often than needed.

weissjeffm commented 1 year ago

We're talking about (for nova at least), serializing nova::PublicParams? We don't own that struct so won't we have to do this (and recurse into each field and generic type)? https://serde.rs/remote-derive.html (which seems rather ugly at first glance but maybe it won't be too horrible)

edit: maybe we should try to contribute serialization upstream instead.

porcuquine commented 1 year ago

Is it not already serializable? If not, then yes, we can get any changes upstreamed without significant difficulty. If you need to do so, you could work on our fork (making sure it's up to date), and then eventually make a PR to Nova itself. @samuelburnham might have some advice, since he's already dealt with serialization up the dependency chain.

weissjeffm commented 1 year ago

It doesn't look like it's serializable. As far as I can tell, the component parts of the PublicParams struct come from pasta_curves and that crate doesn't refer to serde at all. So it looks like there will be several levels of upstream changes required with this approach.

porcuquine commented 1 year ago

That work is well in progress, but the upstreaming is not complete, which is why #173 has not landed yet. You can see here that a fork is being used for now: https://github.com/lurk-lang/lurk-rs/pull/173/files#diff-2e9d962a08321605940b5a657135052fbcef87b5e360662bb527c96d9a615542R73

We'd like to move forward with the parameter serialization in advance of fully resolving the dependency stack, which means you'll want to build off #173 for now. @samuelburnham can provide more context and detail.

samuelburnham commented 1 year ago

@weissjeffm The newly released pasta_curves v0.5 has serde support for the Ep and Eq types, which means that Nova can serialize SNARKs and PublicParams once my serde fork is merged. Fcomm currently caches proofs with a FileStore trait that you could use: https://github.com/lurk-lang/lurk-rs/blob/15eb522c87d535c3316978812c5b63f582f92691/fcomm/src/lib.rs#L346

Caching JSON as a first step sounds good, and once that's working it may be a good idea to instead serialize to LDON, which is a Lurk-specific superset of JSON. This would allow us to optimize fcomm caching with one common data format and custom serialization, but it's not a relevant consideration until that branch is finalized/merged.

weissjeffm commented 1 year ago

@samuelburnham is your serde branch of Nova far enough along that I can start working with it?

weissjeffm commented 1 year ago

@samuelburnham @porcuquine it looks like public_params is a function of only a ReductionCount which is just an enum of 3 items (One, Five, Ten). Do I understand correctly that this is a pure function, or at least, I can act as if it is? In which case, I can store the PublicParams (both in memory and on disk) keyed off the ReductionCount?

huitseeker commented 1 year ago

This is generalized in #381, and has become too general of an issue: @porcuquine I think we can close in favor of #311 #386 #387

argumentcomputer / lurk-beta

Public parameter caching for faster proving and verification #189