proofHash is underspecified

fulldecent commented 5 years ago

The specification currently reads:

A proofHash can be generated by prepending the submitted data…

This leaves the prepending of data optional. Instead this should be required:

A proofHash is generated by prepending the submitted data…

Or at a minimum:

A proofHash should be generated by prepending the submitted data…

Recommendation: make the prepending step a stated requirement of the specification, so that the intended defense against impersonation / spoofing attacks is actually achieved.

References:

https://github.com/erasureprotocol/erasure-protocol/blob/4a3d98ce023a264a9f3c7ba62ef77a9207bba5fe/README.md

thegostep commented 5 years ago

@jparyani I think we should take this opportunity to define a clear spec around formatting and salting.

Current thoughts on the proofHash specification:

SHA-256 hash of the following JSON object:

{
    "version": "1.0.0",
    "creator": "0x1234...89",
    "salt": "0x12345",
    "format": [ .csv, .txt, ... ],
    "data": "my data"
}

Where

version is the ID of the Erasure standard
creator is the Ethereum address of the party submitting this data
salt is a random string with sufficient entropy
format is the machine-readable format for decoding the data
data is the stringified submission data
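
To make the draft above concrete, here is a minimal Python sketch of how such a proofHash might be computed. Several things in it are assumptions rather than part of the proposal: the function name `build_proof` is hypothetical, `format` is treated as a single string, the salt is 32 random bytes in hex with a `0x` prefix, and the JSON is canonicalized by sorting keys and removing whitespace. The draft does not yet say how the object is serialized, and without some fixed serialization two implementations would hash different bytes.

```python
import hashlib
import json
import secrets
from typing import Optional, Tuple


def build_proof(creator: str, data: str, fmt: str,
                salt: Optional[str] = None) -> Tuple[str, dict]:
    """Return (proof_hash_hex, proof_object) for the draft layout above."""
    if salt is None:
        # Assumed encoding: 32 random bytes, hex-encoded, "0x" prefix.
        salt = "0x" + secrets.token_hex(32)
    proof = {
        "version": "1.0.0",
        "creator": creator,
        "salt": salt,
        "format": fmt,
        "data": data,
    }
    # Assumed canonical form: sorted keys, no insignificant whitespace,
    # UTF-8 bytes. The draft itself does not specify this.
    encoded = json.dumps(proof, sort_keys=True,
                         separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest(), proof


if __name__ == "__main__":
    proof_hash, proof = build_proof("0x1234...89", "my data", ".txt")
    print(proof_hash)
```

Because `creator` is inside the hashed object, the same data submitted under a different address produces a different hash, which is the property the impersonation concern in this issue is about.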

thegostep commented 5 years ago

@fulldecent This is our first dip into specifying standards for Erasure. It would be great to get your thoughts on where to specify them. A separate Erasure Improvement Proposal repo?

fulldecent commented 5 years ago

For the issue at hand, perhaps

A proofHash should be generated by including ... in the submitted data.

This is vague (extensible) enough that we don't need to specify it right now. But it is prescriptive enough that it identifies the problems we are trying to solve.

That solves the issue and allows this project to proceed.

I did not see any use cases where the information to be hashed is machine readable. Therefore, even a hashed Word .docx file can meet all requirements.

If you will have machine-readable files, then there is already a mature project that covers how to hash JSON objects when you want to validate parts of them while possibly keeping other parts secret. That is 0xcert Conventions (I advise 0xcert). Another option is available if zero-knowledge proofs are needed. That is part of EY Nightfall (Ernst & Young is my client).
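
To illustrate the general idea of validating part of a JSON object while keeping the rest secret, here is a generic Python sketch using per-field salted hashes. It is not the 0xcert Conventions algorithm and not EY Nightfall; all function names are hypothetical and the layout is an assumption chosen for brevity. Each field is hashed with its own salt, and the published value is a hash over all field hashes.

```python
import hashlib
import json
import secrets
from typing import Dict, Tuple


def _sha256_hex(payload: bytes) -> str:
    return hashlib.sha256(payload).hexdigest()


def leaf_hash(key: str, value, salt: str) -> str:
    """Hash one field together with its key and a per-field salt."""
    payload = json.dumps([key, value, salt], separators=(",", ":"))
    return _sha256_hex(payload.encode("utf-8"))


def root_hash(leaves: Dict[str, str]) -> str:
    """Hash the full set of field hashes in a fixed (sorted) order."""
    payload = json.dumps(leaves, sort_keys=True, separators=(",", ":"))
    return _sha256_hex(payload.encode("utf-8"))


def commit_fields(doc: dict) -> Tuple[str, Dict[str, str], Dict[str, str]]:
    """Return (root, leaves, salts); only the root needs to be published."""
    salts = {k: secrets.token_hex(16) for k in doc}
    leaves = {k: leaf_hash(k, doc[k], salts[k]) for k in doc}
    return root_hash(leaves), leaves, salts


def verify_field(root: str, leaves: Dict[str, str],
                 key: str, value, salt: str) -> bool:
    """Check one disclosed field against the published root."""
    return (leaves.get(key) == leaf_hash(key, value, salt)
            and root_hash(leaves) == root)
```

A verifier is given the field hashes plus the value and salt of only the fields being disclosed. The salts keep the undisclosed fields from being guessed by brute force from their hashes.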

Basically I'm saying you have a lot of options and you're in good company.

As for Erasure-IP, I'm not sure what the scope of that project would be, but I am heavily involved with the equivalents for Ethereum, Aion and 0xcert and can bring it home.

erasureprotocol / erasure-protocol

proofHash is underspecified #265