apache / tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators
https://tvm.apache.org/
Apache License 2.0
11.76k stars 3.47k forks source link

[RFC][Cryptography] Modernize the cryptography implementation #4403

Closed niconiconi closed 4 years ago

niconiconi commented 4 years ago

Significant progress has been made during recent years due to the development of blockchain. However, these cryptography protocols are hard to implement correctly and even more hard to parallelize. A lot of them can be efficiently parallelized, and I will list some of them later. It shares a data-parallel nature with deep learning tasks.

Existing new papers are doing the experiment in single-thread mode, and claiming that it's parallelizable, for example Ligero, Libra. I am an author of Libra (not the facebook libra), and I tried to implement a parallelized one. There are several reasons for not presenting the parallelized program in our paper:

Example applications

Merkle tree

Merkle tree enables Alice to commit their secret data and Bob can verify the membership of that secret data. Alice cannot modify the secret data after the commitment is made.

It builds a binary tree, where leaves are the data itself, and for every parent node, it hashes the child node and stores the hash value. Alice will output the root node as the commitment. Any data modify will change the root completely, so it will preserve the data integrity. The structure enables Bob to ask Alice what's the value is at a specific leaf node, then Alice can provide the whole tree path from that leaf node to the root. Bob can verify the integrity of data by doing hash. And checking the hash result is identical to the root.

Such structure is parallelizable, we can parallel evaluate hashes for each node. And we can finish the commitment generation in only O(log n) rounds.

Polynomial commitment

Polynomial commitment enables Alice to commit to a polynomial F(x). After she committed the polynomial, she cannot change the polynomial, otherwise, the change will be detected in verification phase.

The verification phase(or opening phase), will open the commitment to a specific point z, and it will return the polynomial value F(z) as well as a proof for the validity of the value (say the value is actually derived from the polynomial that is committed before).

This can be implemented by a combination of NTTs (FFT in finite fields) and some other minor parts. So it can be parallelized efficiently like FFT.

Zero-knowledge proofs

It's a bit hard to explain, but it can be constructed from polynomial commitments. Industrial level implementations actually do parallelization but it's great if we can make a tool for such tasks.

Major Todos (for now)

Implement basic primitives

This implementation will include implementing all primitives that I mentioned in the previous section and we do have a C++ implementation as a reference. I could DIY it and submit the code later. Most likely, I will use the Python interface.

A clear developer guide for Rust programmers

It's better to have a Rust tutorial because many security companies prefer to use rust.

niconiconi commented 4 years ago

Some story When I was writing Libra and it’s followup work, @merrymercy come and asks “why are you implementing it using a handwritten AVX2 code? It’s already the year 9012! Try tvm instead.”

masahi commented 4 years ago

Interesting, cc @nhynes regarding rust interface.

nhynes commented 4 years ago

rust interface

The rust interface is currently only that of a TVM runtime, but that's sufficient since the kernels are already memory safe. For maximum safety (but slightly less flexibility from dynamic linking), one would use the [non-bindings] runtime in rust/runtime. At least until TVM exposes a C FFI to the compiler internals, developers will necessarily have a polyglot workflow.

I, personally, would love to see a VTA-compatible implementation of ed25519 :)

liangfu commented 4 years ago

Shall the basic primitives implemented in the runtime? Or it could be designed with either hybrid script or ir_builder, I mean the primitives could be generated.

niconiconi commented 4 years ago

Shall the basic primitives implemented in the runtime? Or it could be designed with either hybrid script or ir_builder, I mean the primitives could be generated.

I would prefer "intrinsic"

junrushao commented 4 years ago

Hi Tiancheng,

It would be super excited and great impact if TVM is enabled to express crypto protocols and primitives. While I think the RFC does a perfect job in enumerating all the tasks from the crypto side, would you mind talking about something on the compiler side? For example, it would be great for the community to know how many of those primitives can be expressed as intrinsic (links will be helpful), and whether others could be abstracted as op.

tqchen commented 4 years ago

closing for now due to inactive status, feel free to reopen