behave-app / behave

MIT License

New idea for calculating the hash #36

Open reinhrst opened 6 months ago

reinhrst commented 6 months ago

The hash is a convenience; it's not meant to be an ironclad guarantee that the file has never changed. Right now, calculating the hash of a 4 GB file on a MacBook (internal drive) takes about 3 seconds.

However, if the file is on external storage or on a slower computer, you might be waiting 30 seconds for the hash calculation alone.

So I think it could be a good idea to calculate the hash from the first 100 MB + middle 100 MB + last 100 MB + file size (or something like that). That still gives reasonable confidence that the file is the same and was not cut off. (We could even use only the first 100 MB + file size.)
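To make the idea concrete, here is a minimal sketch in Python, not a proposal for the actual implementation. The function name `short_hash`, the use of SHA-256, and the rule of hashing the whole file when it is smaller than three chunks are all assumptions for illustration.

```python
import hashlib
import os

CHUNK_SIZE = 100 * 1024 * 1024  # 100 MB per sampled region (value from the proposal)


def short_hash(path, chunk_size=CHUNK_SIZE):
    """Hash the first, middle and last chunk of a file, plus its size.

    Hypothetical helper: name, SHA-256 and the small-file rule are assumptions.
    """
    size = os.path.getsize(path)
    digest = hashlib.sha256()
    # Mixing in the size catches files that were cut off or extended.
    digest.update(size.to_bytes(8, "big"))
    with open(path, "rb") as f:
        if size <= 3 * chunk_size:
            # Small file: the three regions would overlap, so hash everything.
            while block := f.read(1024 * 1024):
                digest.update(block)
        else:
            offsets = (0, (size - chunk_size) // 2, size - chunk_size)
            for offset in offsets:
                f.seek(offset)
                digest.update(f.read(chunk_size))
    return digest.hexdigest()
```

Note the trade-off: a change in a region that is not sampled goes undetected, which fits the "convenience, not guarantee" goal but would be wrong for anything security-related.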

Especially when the hash is calculated in real time upon opening the file, 30 seconds may be way too long!

For now we could introduce a shorthash or something, and if shorthash is missing in either the detection file or the video file, we fall back to the full hash.
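The fallback could look roughly like the sketch below. It assumes both files expose their metadata as dictionaries with optional `shorthash` and `hash` keys; those key names and the function name `same_video` are hypothetical, not behave's actual format.

```python
def same_video(detection_meta, video_meta):
    """Compare on shorthash when both sides have one, else on the full hash.

    Hypothetical helper: the dict layout and key names are assumptions.
    """
    short_a = detection_meta.get("shorthash")
    short_b = video_meta.get("shorthash")
    if short_a is not None and short_b is not None:
        return short_a == short_b
    # Fall back to the full hash; without one on both sides we cannot match.
    full_a = detection_meta.get("hash")
    full_b = video_meta.get("hash")
    return full_a is not None and full_a == full_b
```

With this shape, older files that only carry the full hash keep working, and files that only carry a shorthash simply never match an old detection file that lacks one.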

For non-behave mp4 files, we would only support shorthash, since nobody has run any inference on them yet.