0xPolygonZero / plonky2

Apache License 2.0
758 stars 281 forks source link

Documentation in plonky2 crate #1446

Open Gorzorg opened 8 months ago

Gorzorg commented 8 months ago

The plonky2 crate is not documented properly. Even though the whole thing is open source, anyone who wants to use it must figure out what most of the API does by code digging.

Notable examples include:

As far as I know, there is a lack of adequate tutorials (the PolymerLabs tutorial is super useful for the basics, but it is inadequate if one wants to use the crate in a project) or explanations about the code, which makes it hard for newcomers to understand what is going on in plonky2, and what functions should be used/preferred/avoided.

I can offer to help write the missing documentation, but given my lack of experience I doubt i could be of great help, without someone to ask what each thing does.

If you want me to help please let me know.

Nashtare commented 8 months ago

Thank you for reaching out! It's true that our main focus recently hasn't been too much around documentation of the plonky2 crate. I'll dedicate some time to it within the next 10 days to help clarify as much as possible the API for external parties. Because documenting properly the entire codebase would probably be a longer term effort, if you could point me to which parts (apart from CircuitBuilder and Gate) documentation would be useful for external contributors, this would be great!

Gorzorg commented 8 months ago

Thanks for the quick answer! In this first answer i will focus on gates.

What are gates for?

Part of the problem is that it is not obvious to me what gates are for. All of the traits required for their implementation are public, so I guess you want to leave external users with the possibility to implement their own.

Many people will have repeated units of logic in their circuit, and a natural thing to think (maybe this is a misconception I have) is that one could use Gate for library building with plonky2, i.e. to implement some circuit logic in a standardized way.

Another approach to library building would be to just implement new traits on CircuitBuilder, whose functions implement some standardized logic by using already available functionality as building blocks.

What is not clear to me is when to follow one or the other approach, because no general guidelines are available. I suggest a good place to put this kind of guideline could be on the documentation of the Gate trait itself.

num_constraints

this one was a somewhat easy catch, but it was not immediately clear to me that, if you are implementing a gate X, and X uses another gate Y, you should not count Y's constraints towards the number of X's constraints.

While this could seem obvious in hindsight, no definition for num_constraints is provided, and this could lead to implementation errors/frustration.

eval_ functions

There are many variants, and getting an intuition of the differencts takes some time. After hours of code digging, I have some idea of what the _circuit suffix means, and about what the _unfiltered/_filtered suffixes mean, but I cannot say my intuition is precise knowledge. Moreover, I could not really understand if this is something the user should even worry about.

I have many more questions, but that is more curiosity than something practically useful.

Gorzorg commented 8 months ago

Going through the many CircuitBuilder methods, and realizing that "I would like nice docs on all of them" is a huge request, I think I have at the moment two main issues:

Target types

One has Target, ExtensionTarget, ExtensionAlgebraTarget, and the CircuitBuilder's API provides variants for most operations that involve each of those target types. It is not clear to a newcomer like me why/when one should use one over the other, as most methods describe the operation they perform, which stays the same over the variants (e.g. builder.mul_...), and as all those methods have public visibility.

I don't really have a clear idea how to solve this issue. If the answer is very generalizable, maybe one could leave a note to the users in the documentations of the various target types, like "this is meant for internal use, or if you know what you are doing. In most cases, use this other type instead.", or like "you might want to use this target type when you have a situation like example code".

No apparent logical grouping

Since most (all?) methods implemented for CircuitBuilder are directly implemented, and with pub visibility, the user has little hints for what is used where.

Maybe a way to mitigate the confusion generated by this is to group method implementations behind traits that group different functionalities, but this is unfortunately a breaking change for downstream users of plonky2, so I don't know how to feel about it.

I have no clear idea on how to tackle this, but I think having all the available methods listed alphabetically is somewhat discouraging for new users.

Gorzorg commented 8 months ago

Oh god I just realized I skipped over the "(apart from CircuitBuilder and Gate)" parenthesis in your message -.- sorry about that.

Oh well, now at least you know exactly what bothers me about that. I would say the biggest confusion I have so far, besides gates and circuit builder methods, is about plonky2::plonk::circuit_data::CircuitConfig.

The user has the ability to configure every parameter in the struct, but it is not apparent what the implications on performance and security are. Some guidelines in the documentation would be handy.

I am not asking for anything super technical, and even something along the lines of "increasing this parameter allows for shorter proof but increases proving time" would be very useful, because it would help non-expert users in writing benchmarks to empirically understand what works better for them, and what should not be played too much with (i.e. parameters that can reduce security).

Nashtare commented 8 months ago

Thank you for your input, that's really valuable, and will help narrow down the immediate focus for stronger documentation.

Oh god I just realized I skipped over the "(apart from CircuitBuilder and Gate)" parenthesis in your message -.- sorry about that.

No worries, at least you clearly explained what wasn't clear, which can not be obvious for maintainers who have been heavily involved in the library for some time.

I would nonetheless point out, as a global advice for anyone aiming at building on top of plonky2 (or more generally any zk proving system implementation), that building zk-applications / tailored circuits is hard, tedious, error-prone, and should be left to cryptographers or users familiar enough with the cryptographic notions involved, as the consequences of a badly designed circuit can be disastrous.

Gorzorg commented 8 months ago

Thank you for the word of warning. Even though I am not a cryptographer, I do have a mathematical background. I can more or less follow the arguments in the writeup linked in the repo's docs, which alone solved many of my initial doubts. That said, you are absolutely right in saying that anyone in my position can easily do gross mistakes.

However, assuming too many things about how the math maps to the actual code and filling in the gaps with imagination is not a good strategy. This is also a big part in my motivation to properly look around the repo before writing anything with it.

Also, I took too long to edit my previous message, sorry again. In case you didn't read it, I added a couple of sentences about CircuitConfig.

Nashtare commented 8 months ago

@Gorzorg I've added some documentation in the linked PR which I hope will clear some things out. It's not covering the entire list you gave, nor is as deep as you may have hoped for some of its components, but I intend on expanding it with follow-up PRs whenever I have some time.

Gorzorg commented 8 months ago

I just finished reading the changes. Thank you! While you are right in saying one may wish for more on some components, this is already super useful. Thanks again :)

Nashtare commented 8 months ago

We'll try putting some more effort on documentation in the near future, feel free to specify any items that you think would deserve more scrutiny!

Gorzorg commented 6 months ago

So, lately I have been busy making a circuit that employs cyclic recursion. While I eventually managed to make it work, I had to understand how to perform the bootstrapping procedure (I am not implying that now I know what I am doing!) one has to make for getting the right circuit data for cyclic recursion.

Even after making it work, I don't really understand why it works exactly, and I don't understand why that procedure cannot be automated. Maybe some documentation could avoid other users similar questions, and clarify my current doubts.

Nashtare commented 6 months ago

Thanks for the notice @Gorzorg, I'll try getting some time to wrap-up more explicit doc around this area in the coming days!