Open GoogleCodeExporter opened 9 years ago
One possible security impact scenario is the following:
Alice and Bob share a repository. Alice uploads evil.exe, a malicious file.
Alice has used a vulnerability in the md5 checksum and designed the file so
that its checksum is identical to the checksum of dostuff.exe, a well-known
useful program.
Bob uploads his dostuff.exe. However, since it has the same checksum as the
existing evil.exe, it is not actually uploaded. Boar concludes (wrongly) that it
already has this file in the repo and uses that copy instead.
Some time later, Bob downloads his dostuff.exe file again. Instead of his own
file, he receives the evil.exe that Alice uploaded earlier. When he executes the
program, bad things will happen.
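
The scenario above can be sketched with a toy deduplicating store. This is only an illustration, not boar's actual code: the `DedupStore` class and `weak_checksum` function are hypothetical, and `weak_checksum` is a deliberately broken stand-in for md5 so that a collision can be forced without crafting real colliding files.

```python
class DedupStore:
    """Toy content-addressed store (hypothetical, not boar's implementation)."""

    def __init__(self, checksum):
        self.checksum = checksum  # function: bytes -> str
        self.blobs = {}           # checksum -> stored contents
        self.names = {}           # file name -> checksum

    def upload(self, name, data):
        key = self.checksum(data)
        is_new = key not in self.blobs
        if is_new:
            # Only contents with a previously unseen checksum are stored.
            self.blobs[key] = data
        self.names[name] = key
        return is_new

    def download(self, name):
        return self.blobs[self.names[name]]


def weak_checksum(data):
    # Assumption for the demo: a stand-in for a broken hash that only
    # looks at the length, so two different files collide trivially.
    return str(len(data))


store = DedupStore(weak_checksum)
alice_stored = store.upload("evil.exe", b"EVIL-PAYLOAD")
bob_stored = store.upload("dostuff.exe", b"GOOD-PROGRAM")  # same checksum
bob_download = store.download("dostuff.exe")
```

Here `bob_stored` is false because the store believes it already has the contents, and `bob_download` holds Alice's bytes rather than Bob's.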
Original comment by ekb...@gmail.com
on 2 Sep 2011 at 8:33
Fixed in changeset 5518090482c9. All files now have a corresponding sha256
checksum that ensures that no collisions can go undetected.
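
A minimal sketch of the idea, assuming the sha256 value is recorded next to each stored blob and compared whenever the primary checksum matches. The names `VerifiedStore`, `CollisionError` and `weak_checksum` are hypothetical and not taken from the changeset; `weak_checksum` again stands in for a colliding md5 so the check can be demonstrated.

```python
import hashlib


class CollisionError(Exception):
    """Raised when two different contents share a primary checksum."""


class VerifiedStore:
    """Toy store that backs each primary checksum with a sha256 (hypothetical)."""

    def __init__(self, checksum):
        self.checksum = checksum  # primary checksum function: bytes -> str
        self.blobs = {}           # primary checksum -> contents
        self.sha256 = {}          # primary checksum -> sha256 hex digest

    def upload(self, data):
        key = self.checksum(data)
        strong = hashlib.sha256(data).hexdigest()
        if key in self.blobs:
            # Same primary checksum: the sha256 must match as well,
            # otherwise these are different contents that collide.
            if self.sha256[key] != strong:
                raise CollisionError("checksum collision for key " + key)
            return key
        self.blobs[key] = data
        self.sha256[key] = strong
        return key


def weak_checksum(data):
    # Assumption for the demo: a broken stand-in for md5 (length only).
    return str(len(data))


store = VerifiedStore(weak_checksum)
store.upload(b"EVIL-PAYLOAD")
store.upload(b"EVIL-PAYLOAD")      # identical contents: deduplicated silently
try:
    store.upload(b"GOOD-PROGRAM")  # different contents, same weak checksum
    collision_detected = False
except CollisionError:
    collision_detected = True
```

The extra sha256 pass over every blob is also where the added verification cost comes from.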
Original comment by ekb...@gmail.com
on 25 Sep 2011 at 9:17
Reopening the issue. As it turns out, the implemented solution is too slow. A
verification on a repository will take about twice as long with md5 collision
detection enabled (due to the verification of the sha256 database). I had hoped
to mitigate this slowdown by using Python's multiprocessing features, but while
that works well on Linux, I have not succeeded in making it work on Windows.
Since md5 collision detection is a somewhat niche feature, I'm going to
disable it for the next release so as not to make boar slower for the
current boar user base.
Original comment by ekb...@gmail.com
on 8 Nov 2011 at 11:45
Issue 80 has been merged into this issue.
Original comment by ekb...@gmail.com
on 12 Aug 2012 at 7:29
One way to handle this is to store the first 8 bytes of the file and
check against those in addition to the md5. This makes it nearly impossible to
have a collision, even on purpose.
Original comment by cyberempires@gmail.com
on 21 Sep 2012 at 6:12
In response to comment 5: Do you have a reference for your statement? I've
always assumed that md5 simply is inherently unsafe. If that can be mitigated
with a simple check of the first part of the contents, that would certainly
make things easier.
Original comment by ekb...@gmail.com
on 22 Sep 2012 at 8:15
Astronomical or not -- I'm slightly paranoid about it. Does it have to be
SHA256 to detect md5 collisions, or might something fast like the SpookyHash
that SnapRAID uses be an option as well?
Original comment by mlo...@web.de
on 25 Dec 2013 at 1:16
Or maybe this:
https://github.com/SaberParker/xxHash-Python
https://code.google.com/p/xxhash/
Original comment by mlo...@web.de
on 25 Dec 2013 at 1:25
Original issue reported on code.google.com by
ekb...@gmail.com
on 23 Mar 2011 at 8:34