resurrecting-open-source-projects / dcfldd

Enhanced version of dd for forensics and security
GNU General Public License v2.0

Hashing md5 vs sha256 and behavior upon failure reading from source #8

Closed (slrslr closed 2 years ago)

slrslr commented 2 years ago

Hello,

I have read that someone is going to use the following command:

sudo dcfldd if=/dev/sdb of=usb.img bs=3M hash=sha256 hashlog=sha256.txt; sha256sum usb.img

I read that hashing may still not achieve a 1:1 copy, because of errors reading from a source with bad sectors. Does that mean dcfldd stops on a read error? I have not found any ddrescue-like switch (-r) to retry reading sectors that failed.

f="log/debug.log";time md5sum "$f";time sha256sum "$f" shows:

real    0m0,008s
real    0m0,004s

So I assume that sha256 would be about 2x faster, so I will use it by defining hash=sha256 hashlog=sha256.txt. Is it correct that the hashlog parameter has no benefit for resuming an interrupted process, and that it just records the hash for convenience?

davidpolverari commented 2 years ago

> Hello,

Hi,

> I have read that someone is going to use the following command:

> sudo dcfldd if=/dev/sdb of=usb.img bs=3M hash=sha256 hashlog=sha256.txt; sha256sum usb.img

> I read that hashing may still not achieve a 1:1 copy, because of errors reading from a source with bad sectors. Does that mean dcfldd stops on a read error? I have not found any ddrescue-like switch (-r) to retry reading sectors that failed.

Both dd and dcfldd (as a fork of the former) will stop on read errors unless you specify conv=noerror. By default, both will try to read the bad sectors multiple times before failing.
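To illustrate what conv=noerror is usually paired with, here is a minimal sketch using plain dd on a regular file. The paths and sizes are hypothetical, and it assumes that dcfldd accepts the same conv= keywords as dd, which its man page appears to document. The sync keyword pads every short or failed input block with zero bytes up to the block size, so data after a bad block keeps its original offset in the image.

```shell
# Sketch with plain dd on a regular file; paths and sizes are hypothetical.
tmp=$(mktemp -d)
head -c 10000 /dev/urandom > "$tmp/source.bin"   # 10000 bytes, not a multiple of 4096

# noerror: continue after a read error instead of stopping.
# sync:    pad each short or failed input block with zero bytes up to bs.
dd if="$tmp/source.bin" of="$tmp/copy.img" bs=4096 conv=noerror,sync 2>/dev/null

src_size=$(wc -c < "$tmp/source.bin" | tr -d ' ')
copy_size=$(wc -c < "$tmp/copy.img" | tr -d ' ')
echo "source: $src_size bytes, copy: $copy_size bytes"
rm -rf "$tmp"
```

Here the last partial block is padded, so the copy is 12288 bytes for a 10000-byte source. On a real device, a block that cannot be read would end up as zeros in the image in the same way, which means those sectors are not a 1:1 copy and the image hash only describes what was actually written.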

> f="log/debug.log";time md5sum "$f";time sha256sum "$f" shows:

> real  0m0,008s
> real  0m0,004s

> So I assume that sha256 would be about 2x faster

You're making an assumption based on anecdotal evidence.
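A single run on a small log file mostly measures process startup and whether the file was already cached. A rough sketch of a more repeatable comparison follows. It assumes GNU coreutils (date +%s%N, md5sum, sha256sum); the 64 MiB file size and the three runs per tool are arbitrary choices, and the result will depend on the CPU.

```shell
# Assumptions: GNU coreutils; 64 MiB test file and 3 runs are arbitrary.
f=$(mktemp)
head -c 67108864 /dev/urandom > "$f"
cat "$f" > /dev/null   # read once so every timed run hits the page cache

bench() {  # bench TOOL FILE: print elapsed milliseconds for one run
  start=$(date +%s%N)
  "$1" "$2" > /dev/null
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
}

for tool in md5sum sha256sum; do
  for run in 1 2 3; do
    echo "$tool run $run: $(bench "$tool" "$f") ms"
  done
done

# Keep one digest of each kind to show both tools processed the same file.
md5_hex=$(md5sum "$f" | cut -d' ' -f1)
sha256_hex=$(sha256sum "$f" | cut -d' ' -f1)
rm -f "$f"
```

Comparing several runs on a file large enough to dominate startup cost gives a fairer picture than one 8 ms measurement.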

> so I will use it by defining hash=sha256 hashlog=sha256.txt. Is it correct that the hashlog parameter has no benefit for resuming an interrupted process, and that it just records the hash for convenience?

As stated in the dcfldd man page, the hash parameter specifies the hash algorithms to be used, and using it performs the calculation in parallel with the disk reading. The hashlog parameter saves the resulting final hashes to the specified files.

The point of those options is only to speed up the hashing process: instead of hashing the data after the acquisition has finished, the hashing is done in parallel with it.
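The difference between hashing during the copy and hashing afterwards can be sketched with coreutils alone. This is only a stand-in for the idea behind hash=, not how dcfldd implements it, and the file names are hypothetical.

```shell
# Stand-in for hashing while copying, using only coreutils; file names are hypothetical.
tmp=$(mktemp -d)
head -c 1048576 /dev/urandom > "$tmp/source.bin"   # pretend this is the source device

# One pass: the same byte stream is written to the image and hashed.
stream_hash=$(dd if="$tmp/source.bin" bs=65536 2>/dev/null \
  | tee "$tmp/copy.img" | sha256sum | cut -d' ' -f1)

# Second pass: hash the finished image, as a trailing `sha256sum usb.img` would.
image_hash=$(sha256sum "$tmp/copy.img" | cut -d' ' -f1)

echo "while copying: $stream_hash"
echo "after copying: $image_hash"
rm -rf "$tmp"
```

The one-pass form avoids reading the data a second time just to hash it. The second pass is still useful as a check that the image on disk matches what was hashed during the copy.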

Regards,

David.