nietaki / hoplon

decentralized package security audit network of trust
Apache License 2.0

No need to diff files #23

Open hauleth opened 6 years ago

hauleth commented 6 years ago

Internally, Git is just a Merkle tree, so there is no need for diff: you can compute a Git-compatible tree hash and compare it with the one stored in the repo. In theory a matching hash could still hide differences (as SHA-1 is broken), but it would speed up the first check and reject obviously invalid repos.
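
To make the idea concrete, here is a minimal Python sketch of how Git-style blob and tree IDs can be derived from file contents. It is an illustration under simplifying assumptions, not hoplon's code: the helper names (`git_object_id`, `blob_id`, `tree_id`) are hypothetical, the input is an in-memory dict rather than a real checkout, every file is treated as a regular non-executable file (mode 100644), and symlinks and submodules are ignored.

```python
import hashlib


def git_object_id(kind: bytes, body: bytes) -> bytes:
    """SHA-1 over Git's object framing: b'<kind> <length>\\0' followed by the body."""
    header = kind + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).digest()


def blob_id(content: bytes) -> bytes:
    """ID of a file's contents, matching `git hash-object` for regular files."""
    return git_object_id(b"blob", content)


def tree_id(tree: dict) -> bytes:
    """ID of a directory given as {name: bytes (file) or dict (subdirectory)}.

    Assumptions: all files use mode 100644, and no subdirectory is empty
    (Git never records empty directories).
    """
    entries = []
    for name, node in tree.items():
        raw = name.encode()
        if isinstance(node, dict):
            # Git sorts directory entries as if their name ended with "/".
            entries.append((raw + b"/", b"40000", raw, tree_id(node)))
        else:
            entries.append((raw, b"100644", raw, blob_id(node)))
    entries.sort(key=lambda entry: entry[0])
    body = b"".join(
        mode + b" " + raw + b"\0" + oid for _, mode, raw, oid in entries
    )
    return git_object_id(b"tree", body)


if __name__ == "__main__":
    package = {"mix.exs": b"# placeholder\n", "lib": {"a.ex": b"# placeholder\n"}}
    print(tree_id(package).hex())
```

Comparing the resulting root ID with the tree ID recorded in the repository's commit is a single equality check; any changed byte in any file changes the root.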

nietaki commented 6 years ago

Thanks for the suggestion!

I like this idea, especially since it would remove hoplon's dependency on diff. I had a quick look at how Merkle trees are used in Git and it looks interesting and viable. There are a handful of things holding me back from jumping on it though.

I'll have a closer look at this idea in the future, but as I see it now, I don't think the benefits are strong enough. I'm happy to be proved wrong though :)

hauleth commented 6 years ago

@nietaki what I am suggesting is a two-step verification:

  1. Compute the Merkle tree of the files in the repository and in the dependency tree. It requires some code, but not much, since Erlang has SHA-1 built into its crypto module.
  2. If everything is valid, then run the full comparison.
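
The two steps above could be sketched roughly as follows. This is only an illustration in Python: `fingerprint` is a hypothetical stand-in for a real Git-compatible tree hash, `verify` and its return strings are made up for the example, and both file sets are plain dicts mapping path to bytes.

```python
import hashlib


def fingerprint(files: dict) -> str:
    """Cheap SHA-1 digest over sorted (path, content) pairs.

    Stand-in for a Git-compatible tree hash; it is NOT the value Git stores.
    Paths and contents are hashed separately so the boundaries between
    fields are unambiguous.
    """
    outer = hashlib.sha1()
    for path in sorted(files):
        outer.update(hashlib.sha1(path.encode()).digest())
        outer.update(hashlib.sha1(files[path]).digest())
    return outer.hexdigest()


def verify(package_files: dict, repo_files: dict, repo_fingerprint: str) -> str:
    # Step 1: fast rejection. A mismatch proves the contents differ.
    if fingerprint(package_files) != repo_fingerprint:
        return "rejected: hash mismatch"
    # Step 2: a matching SHA-1 is not proof of equality, so compare in full.
    if package_files != repo_files:
        return "rejected: contents differ despite matching hash"
    return "ok"


if __name__ == "__main__":
    repo = {"lib/a.ex": b"# placeholder\n"}
    print(verify(dict(repo), repo, fingerprint(repo)))
```

The point of the ordering is that step 1 only needs the stored hash, so obviously tampered packages can be rejected before the expensive full comparison runs.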

Alternatively, you can compute two Merkle trees using two different hash functions, which would greatly reduce the attack surface. For example, compute the SHA-1 tree and compare it with Git's, then compute SHA-256 trees for both the Git checkout and the deps to be sure. SHA-256 isn't currently broken, and even if it were, I do not think it would be feasible to find a single collision that works in both functions at the same time.
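
A rough Python sketch of the dual-hash variant, under stated assumptions: `merkle_root` and `same_under_both` are hypothetical names, the tree encoding is a simplified one invented for this example (it is not Git's on-disk format, so the SHA-1 root here will not equal a Git tree ID), and directories are nested dicts with bytes as file contents.

```python
import hashlib


def merkle_root(tree: dict, algo: str) -> bytes:
    """Merkle root of {name: bytes (file) or dict (subdirectory)} under `algo`.

    `algo` is any name accepted by hashlib.new, e.g. "sha1" or "sha256".
    Names must not contain NUL bytes; each digest has a fixed length for a
    given algorithm, so the concatenated entries parse unambiguously.
    """
    entries = []
    for name in sorted(tree):
        node = tree[name]
        if isinstance(node, dict):
            kind, child = b"tree", merkle_root(node, algo)
        else:
            kind, child = b"blob", hashlib.new(algo, node).digest()
        entries.append(kind + b" " + name.encode() + b"\0" + child)
    return hashlib.new(algo, b"".join(entries)).digest()


def same_under_both(a: dict, b: dict) -> bool:
    """True only if both trees agree under SHA-1 and under SHA-256."""
    return all(
        merkle_root(a, algo) == merkle_root(b, algo)
        for algo in ("sha1", "sha256")
    )


if __name__ == "__main__":
    checkout = {"lib": {"a.ex": b"# placeholder\n"}}
    deps = {"lib": {"a.ex": b"# placeholder\n"}}
    print(same_under_both(checkout, deps))
```

To pass this check with different contents, an attacker would need inputs that collide under both functions at once, which is the reduction in attack surface described above.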