Docmaps-Project / docmaps

Extensible protocol for document history metadata exchange, to enable trustworthy, rapid, open science, by and for preprint science communities.
MIT License

[PROTOCOL]: Summarization and proving #66

Open ships opened 1 year ago

ships commented 1 year ago

Protocol semantics improvement

Description

Artifacts computed from input data can be proved in zero knowledge using polynomial circuits, as shown in https://github.com/zkonduit/ezkl. Adopting this kind of technology would let a docmap make an assertion like: "By running GPT-2 with this prompt and this input, we got this summary." (GPT-2 may in fact be too large to use this way, depending on the computation requirements in the verifier. This needs investigation. Smaller summarizers probably exist.)
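To make the shape of such an assertion concrete, here is a minimal sketch of a record that ties a model, prompt, input, and summary to an opaque proof. Everything in it is an assumption for illustration: the `ComputationClaim` type, its field names, and the use of SHA-256 digests are hypothetical and not part of the Docmaps spec or the EZKL API. A real proof system would bind its own public inputs rather than plain hashes.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class ComputationClaim:
    # Hypothetical fields: none of these names come from Docmaps or EZKL.
    model_digest: str   # identifies the exact model that was run
    prompt_digest: str  # identifies the prompt given to the model
    input_digest: str   # identifies the input text (e.g. a review)
    output_digest: str  # identifies the claimed summary
    proof_hex: str      # opaque proof bytes from whatever prover is used


def make_claim(model: bytes, prompt: str, input_text: str,
               summary: str, proof: bytes) -> ComputationClaim:
    """Bundle digests of all four artifacts with the proof bytes."""
    return ComputationClaim(
        model_digest=sha256_hex(model),
        prompt_digest=sha256_hex(prompt.encode("utf-8")),
        input_digest=sha256_hex(input_text.encode("utf-8")),
        output_digest=sha256_hex(summary.encode("utf-8")),
        proof_hex=proof.hex(),
    )


def to_json(claim: ComputationClaim) -> str:
    """Serialize with sorted keys so the same claim always yields the same text."""
    return json.dumps(asdict(claim), sort_keys=True)
```

A docmap generator could attach the JSON form of this record next to the summary it describes. How the record would actually be embedded in a docmap is left open here.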

Use case

Importantly, this means that large docmaps already verified as signed could also be verified as having, with some probability, a certain qualitative property, such as the result of a sentiment analysis of the reviews. Machines could then efficiently ingest those claims for analysis. This is a place where Docmaps could really shine.

Besides basic summarization of content, we could make arguments about analyses of an entire Docmap. We could also do provable batch analysis of text content for search purposes: for example, expose an index that provides provable arguments for why a result was included in a search result, though NOT arguments for why a result was NOT included. This also bears more thought.

Proposed solution

Needs major investigation. The obvious place to start would be to use EZKL directly in a library that exposes WASM calls for the browser, so that proofs can be checked when a docmap is ingested. The generator of the docmap would have run EZKL beforehand in compiled Rust to produce the proof.
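The split between generator and consumer could look roughly like the sketch below, which covers only the consumer side. It assumes a hypothetical claim record with `input_digest`, `output_digest`, and `proof_hex` fields, and takes the actual proof check as an injected callable, `verify_proof`, standing in for whatever verifier binding ends up being used (for instance a WASM call). No real EZKL function is named or assumed here.

```python
import hashlib
from typing import Callable, Mapping

# Stand-in type for a real verifier binding: takes proof bytes plus the
# public values the proof is supposed to bind, and returns True or False.
ProofVerifier = Callable[[bytes, Mapping[str, str]], bool]


def text_digest(text: str) -> str:
    """Hex-encoded SHA-256 digest of UTF-8 text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def check_claim(claim: Mapping[str, str], input_text: str, summary: str,
                verify_proof: ProofVerifier) -> bool:
    """Accept a claim only if it matches the content we hold AND the proof verifies."""
    # Step 1: the claim must refer to the exact input and summary in the docmap.
    if claim["input_digest"] != text_digest(input_text):
        return False
    if claim["output_digest"] != text_digest(summary):
        return False
    # Step 2: delegate the cryptographic check to the injected verifier.
    public_values = {
        "input_digest": claim["input_digest"],
        "output_digest": claim["output_digest"],
    }
    return verify_proof(bytes.fromhex(claim["proof_hex"]), public_values)
```

Keeping the verifier injectable means the digest checks can be tested with a stub, and the proof backend can be swapped once the investigation settles on one.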