near / near-one-project-tracking

A repository for tracking work items that NEAR One is working on.

[ProjectTracking]: ZK WASM Stage 2, WASM Interpreter #6

Closed by aborg-dev 1 month ago

aborg-dev commented 1 year ago

Goals

Our long-term goal is to use ZK Proofs to scale the NEAR protocol to more shards while maintaining high security guarantees.

The goal of Stage 2 is to support proofs for a WASM interpreter (compiled to WASM) that supports WASM MVP features.

Supported workloads:

We will also set an ambitious performance goal: we want the cycle amplification factor from WASM to ZK ASM (number of ZK ASM cycles to implement a WASM opcode) to be less than 5 on average. Right now it is around 30.
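As a rough illustration of the metric, the amplification factor can be read as the ratio of executed ZK ASM cycles to executed WASM opcodes. The sketch below is hypothetical: the function name and the sample counts are made up for illustration and do not come from the project's tooling.

```python
def amplification_factor(zkasm_cycles: int, wasm_opcodes: int) -> float:
    """Average number of ZK ASM cycles spent per executed WASM opcode."""
    if wasm_opcodes <= 0:
        raise ValueError("wasm_opcodes must be positive")
    return zkasm_cycles / wasm_opcodes


# Hypothetical counts: 1,000 WASM opcodes that expand to 30,000 ZK ASM cycles.
current = amplification_factor(30_000, 1_000)
target = 5.0

print(f"current factor: {current:.1f}")
print(f"improvement needed to reach the target: {current / target:.1f}x")
```

How the two counts are collected (which opcodes are included, whether startup code is counted) is a methodology choice that this sketch does not settle.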

Links to external documentation and discussions

The documentation for the project can be found in near/wasmtime.

All meeting notes and design docs are indexed on HackMD.

Estimated effort

The concrete work items for this project are tracked by the milestone.

It is expected that the runtime team will be working on this project. The people actively working on this project are: @akashin, @MCJOHN974, @mooori, and @nagisa.

Stage 2 is expected to be completed by April 2023.

Assumptions

Assumptions and risks are enumerated in the roadmap.

Pre-requisites

Pre-requisite work will be completed as part of Stage 1.

Out of scope

MCJOHN974 commented 1 year ago

Just some notes.

(number of ZK ASM cycles to implement a WASM opcode) to be less than 5 on average

  1. I think 5 is a good goal for stage 3 or 4, while for stage 2 I would suggest 12: realistic, but still almost 3 times better than the current one.
  2. Should @mooori be added to the list of actively involved people?
  3. I suggest adding work on a 32-bit zk-processor to the milestone. IMO, it is one of the most effective ways to boost performance. Also, if we are planning to create a 32-bit zk-processor, we should do it ASAP, because all the 32-bit masking work we do now will be useless once the 32-bit processor exists. So I suggest making the 32-bit zk-processor the second priority, after supporting the instruction set for the WASM interpreter.

aborg-dev commented 1 year ago

Just some notes.

(number of ZK ASM cycles to implement a WASM opcode) to be less than 5 on average

1. I think 5 is a good goal for stage 3 or 4, while for stage 2 I would suggest 12: realistic, but still almost 3 times better than the current one.

I agree that 5 is likely too optimistic. I think we should agree on a methodology for measuring this, as you suggested in the last meeting; that would help us come up with a better number or set of numbers to target. I filed an issue for this: https://github.com/near/wasmtime/issues/137

2. Should @mooori be added to the list of actively involved people?

Yes! My bad, thank you for flagging this.

3. I suggest adding work on a 32-bit zk-processor to the milestone. IMO, it is one of the most effective ways to boost performance. Also, if we are planning to create a 32-bit zk-processor, we should do it ASAP, because all the 32-bit masking work we do now will be useless once the 32-bit processor exists. So I suggest making the 32-bit zk-processor the second priority, after supporting the instruction set for the WASM interpreter.

I agree that dedicated support for 32-bit ops in ZK ASM will likely bring a good performance boost. I still think that this should be treated as an implementation detail, and I'm not ready to commit to it before we identify the other things on our plate and the other optimizations that we can work on, and weigh them against each other. I suggest we make this call slightly later, once we have studied the results of the SHA256 benchmark and done some basic analysis of the "hotness" of each instruction.
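One simple way to approximate the "hotness" analysis mentioned above is to count how often each opcode appears in an execution trace. This is a minimal sketch under stated assumptions: the trace is assumed to be a plain list of opcode names, and the function name and sample data are hypothetical rather than taken from the project's tooling.

```python
from collections import Counter


def instruction_hotness(trace: list[str]) -> list[tuple[str, float]]:
    """Return (opcode, share of executed instructions), hottest first."""
    counts = Counter(trace)
    total = sum(counts.values())
    return [(op, n / total) for op, n in counts.most_common()]


# Hypothetical trace of executed WASM opcodes; a real trace would come from
# an instrumented run of a benchmark.
trace = ["i32.add", "i32.and", "i32.add", "local.get", "i32.add", "local.get"]

for op, share in instruction_hotness(trace):
    print(f"{op:10s} {share:.0%}")
```

Weighting each opcode's count by the number of cycles it costs would give a more direct view of where optimization effort pays off, but that needs per-opcode cycle costs that are not assumed here.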

MCJOHN974 commented 1 year ago

I agree that dedicated support for 32-bit ops in ZK ASM will likely bring a good performance boost. I still think that this should be treated as an implementation detail, and I'm not ready to commit to it before we identify the other things on our plate and the other optimizations that we can work on, and weigh them against each other. I suggest we make this call slightly later, once we have studied the results of the SHA256 benchmark and done some basic analysis of the "hotness" of each instruction.

My main point here is that, while we don't have a 32-bit type in zkasm, we have to implement some sort of masking. Once zkasm supports 32-bit types, all of this masking goes straight in the trash. So I just suggest spending as little effort on masking as possible.

As for how much it will increase performance, I totally agree: it will be better to see the results of more benchmarks first.
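To make the masking discussed in this comment concrete, here is a minimal sketch of emulating WASM 32-bit arithmetic on values that are wider than 32 bits. Python's unbounded integers stand in for a wide register; the helper names are hypothetical and this is not the generated zkASM code.

```python
MASK_32 = 0xFFFF_FFFF  # keeps only the low 32 bits


def i32_add(a: int, b: int) -> int:
    """WASM i32.add: addition modulo 2**32."""
    return (a + b) & MASK_32


def i32_shl(a: int, shift: int) -> int:
    """WASM i32.shl: the shift count is taken modulo 32."""
    return (a << (shift % 32)) & MASK_32


# Without the mask, a wide register keeps the carry bit and the result is wrong.
print(hex(0xFFFF_FFFF + 1))          # 0x100000000
print(hex(i32_add(0xFFFF_FFFF, 1)))  # 0x0
```

Every arithmetic opcode needs an extra step like this when the underlying machine has no native 32-bit type, which is the overhead that dedicated 32-bit support would remove.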

aborg-dev commented 10 months ago

We now have a milestone tracking all the work that we want to do. Given that there are a lot of items there, we added some structure by introducing top-level tracking items that encompass smaller items:

  1. Test infrastructure
  2. Benchmarking infrastructure
  3. WASM Interpreter in ZKASM
  4. Supporting WASM opcodes for Stage 2
  5. Performance optimizations for Stage 2

Items 1-3 will be worked on as individual projects for the next 4-8 weeks given their well-defined scope and limited context. For items 4-5, I think we will all work on solving them collectively - this will ensure that everyone has the opportunity to contribute to the actual backend implementation and understand WASM and ZKASM on a deeper level. This will also help us to dogfood the end-to-end tooling that we've built for items 1-3.

We have started this stage this week and will aim to complete it by the end of April 2024.

aborg-dev commented 10 months ago

Here is an update on the work that happened in the last two weeks.

We finally fixed the SHA256 benchmark and are now working on getting the code changes into the main branch. The resulting SHA256 ZKASM program takes 27k cycles on 32 bytes of input and 210k cycles on 448 bytes of input (there are some constant startup overheads). This is within 7x of a dedicated circuit for SHA256.
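The two data points above can be used for a back-of-the-envelope split between startup cost and per-block cost. This is only a sketch: it assumes cycles grow linearly with the number of 64-byte SHA-256 blocks, and it uses the rounded 27k and 210k figures, so the estimates are approximate.

```python
def sha256_blocks(message_len: int) -> int:
    """Number of 64-byte blocks SHA-256 processes for a message of this length.

    Padding adds one 0x80 byte and an 8-byte length field.
    """
    return (message_len + 9 + 63) // 64


# Measurements quoted above: (input bytes, ZKASM cycles).
small = (32, 27_000)
large = (448, 210_000)

blocks_small = sha256_blocks(small[0])
blocks_large = sha256_blocks(large[0])

# Assumed model: cycles = startup + per_block * blocks.
per_block = (large[1] - small[1]) / (blocks_large - blocks_small)
startup = small[1] - per_block * blocks_small

print(f"blocks: {blocks_small} and {blocks_large}")
print(f"cycles per block: {per_block:.0f}")
print(f"startup cycles: {startup:.0f}")
```

With only two measurements the model cannot be validated; more input sizes would be needed to confirm that the cost really is linear in the block count.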

We have completed a round of design reviews:

and will start implementing these designs as the next step.

We also had a productive discussion about the design and features of the ZKASM processor specialized for WASM and agreed to start working on it at the beginning of March.

aborg-dev commented 9 months ago

We've made a lot of progress over the last two weeks:

The current timeline is:

bowenwang1996 commented 9 months ago

Optimize the benchmarks till we reach the performance target in April 2024

@akashin what is the performance target?

aborg-dev commented 9 months ago

Optimize the benchmarks till we reach the performance target in April 2024

@akashin what is the performance target?

I think in Stage 2 we should aim for:

This seems realistic and will help us keep our work grounded. For Stage 3 we can start introducing NEAR-specific performance targets.

aborg-dev commented 9 months ago

Actually, I take back my last statement: we can start targeting some NEAR-specific workloads now, but we first need to figure out what is worth pursuing and at what performance level it becomes valuable for the NEAR protocol. We should discuss this in more detail.

aborg-dev commented 9 months ago

Our progress over the last 3 weeks:

Test Infra (by @MCJOHN974)

Benchmarking Infra (by @mooori)

WASM Interpreter (by @akashin)

Misc

bowenwang1996 commented 9 months ago

We will be working with Polygon in March to migrate to PIL2

Sounds cool! Where can I read more about this?

This is 1000x slowdown compared to block production

What do you mean by "block production"? Light clients do not verify state transitions, so I am not sure whether it is an apples-to-apples comparison.

aborg-dev commented 9 months ago

We will be working with Polygon in March to migrate to PIL2

Sounds cool! Where can I read more about this?

This is 1000x slowdown compared to block production

What do you mean by "block production"? Light clients do not verify state transitions, so I am not sure whether it is an apples-to-apples comparison.

I will share the design doc for the PIL2 migration when it is ready. For now, our discussions can be found in https://hackmd.io/MTmmOMOxShaMqMi83bxzRQ, and I've shared (over email) the PIL2 presentation that we had during the Workshop last week.

Regarding the light client, I meant that we need ~1000 machines producing light client proofs to keep up with the block production rate (one block every second). I agree that this is not an apples-to-apples comparison, just an attempt to understand what it would take to make ZK practical in this context.
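The arithmetic behind the machine count can be sketched as follows. The assumptions are loud ones: each machine proves one block at a time, blocks can be proven independently in parallel, and the 1000-second proving time is a hypothetical figure chosen to match the ~1000x ratio, not a measurement from this thread.

```python
import math


def provers_needed(proof_seconds: float, block_seconds: float) -> int:
    """Machines required so that proofs are produced as fast as blocks arrive.

    Assumes each machine proves one block at a time and that blocks are
    proven independently in parallel.
    """
    return math.ceil(proof_seconds / block_seconds)


# Hypothetical: a proof that takes 1000 s, with a new block every 1 s.
print(provers_needed(1000.0, 1.0))  # 1000
```

This only covers throughput; each individual proof would still arrive with a latency equal to the full proving time.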

mooori commented 8 months ago

Progress over the last two weeks:

Testing infra (by @MCJOHN974)

TODO:

Benchmarking infra (by @mooori)

The above changes enable going WAT -> instrumented zkASM -> trace of executed instructions as described in 240. Started working on an analyze-zkasm subcommand to facilitate and better document that workflow.

Tooling and codegen (by @akashin)

Planning (by @akashin)


@akashin I hope we captured your progress correctly. In case something is missing or wrong, please leave a message or (if possible) update this post.

mooori commented 8 months ago

Timeline update regarding benchmarking infra:

It is TBD whether work on benchmarking infra continues afterwards or something else gets prioritized. Possible next steps for benchmarking infra could be:

Other tasks that might be prioritized, for now, over further benchmarking infra work: