Closed niconiconi closed 4 years ago
Some story When I was writing Libra and it’s followup work, @merrymercy come and asks “why are you implementing it using a handwritten AVX2 code? It’s already the year 9012! Try tvm instead.”
Interesting, cc @nhynes regarding rust interface.
rust interface
The rust interface is currently only that of a TVM runtime, but that's sufficient since the kernels are already memory safe. For maximum safety (but slightly less flexibility from dynamic linking), one would use the [non-bindings] runtime in rust/runtime
. At least until TVM exposes a C FFI to the compiler internals, developers will necessarily have a polyglot workflow.
I, personally, would love to see a VTA-compatible implementation of ed25519 :)
Shall the basic primitives
implemented in the runtime? Or it could be designed with either hybrid script or ir_builder, I mean the primitives could be generated.
Shall the
basic primitives
implemented in the runtime? Or it could be designed with either hybrid script or ir_builder, I mean the primitives could be generated.
I would prefer "intrinsic"
Hi Tiancheng,
It would be super excited and great impact if TVM is enabled to express crypto protocols and primitives. While I think the RFC does a perfect job in enumerating all the tasks from the crypto side, would you mind talking about something on the compiler side? For example, it would be great for the community to know how many of those primitives can be expressed as intrinsic (links will be helpful), and whether others could be abstracted as op.
closing for now due to inactive status, feel free to reopen
Significant progress has been made during recent years due to the development of blockchain. However, these cryptography protocols are hard to implement correctly and even more hard to parallelize. A lot of them can be efficiently parallelized, and I will list some of them later. It shares a data-parallel nature with deep learning tasks.
Existing new papers are doing the experiment in single-thread mode, and claiming that it's parallelizable, for example Ligero, Libra. I am an author of Libra (not the facebook libra), and I tried to implement a parallelized one. There are several reasons for not presenting the parallelized program in our paper:
Example applications
Merkle tree
Merkle tree enables Alice to commit their secret data and Bob can verify the membership of that secret data. Alice cannot modify the secret data after the commitment is made.
It builds a binary tree, where leaves are the data itself, and for every parent node, it hashes the child node and stores the hash value. Alice will output the root node as the commitment. Any data modify will change the root completely, so it will preserve the data integrity. The structure enables Bob to ask Alice what's the value is at a specific leaf node, then Alice can provide the whole tree path from that leaf node to the root. Bob can verify the integrity of data by doing hash. And checking the hash result is identical to the root.
Such structure is parallelizable, we can parallel evaluate hashes for each node. And we can finish the commitment generation in only O(log n) rounds.
Polynomial commitment
Polynomial commitment enables Alice to commit to a polynomial F(x). After she committed the polynomial, she cannot change the polynomial, otherwise, the change will be detected in verification phase.
The verification phase(or opening phase), will open the commitment to a specific point z, and it will return the polynomial value F(z) as well as a proof for the validity of the value (say the value is actually derived from the polynomial that is committed before).
This can be implemented by a combination of NTTs (FFT in finite fields) and some other minor parts. So it can be parallelized efficiently like FFT.
Zero-knowledge proofs
It's a bit hard to explain, but it can be constructed from polynomial commitments. Industrial level implementations actually do parallelization but it's great if we can make a tool for such tasks.
Major Todos (for now)
Implement basic primitives
Implement a demo project (Proof of Concept)
I could help to implement our zero-knowledge proofs protocol using TVM, and hopefully, we can take advantage of the data parallelism. In my own implementation, I tried to hand-code the AVX2 codes and some assembly codes to make it faster which look very old-school.
This implementation will include implementing all primitives that I mentioned in the previous section and we do have a C++ implementation as a reference. I could DIY it and submit the code later. Most likely, I will use the Python interface.
A clear developer guide for Rust programmers
It's better to have a Rust tutorial because many security companies prefer to use rust.