The hash is a convenience; it's not meant to be an ironclad guarantee that the file has never changed. Right now, calculating the hash of a 4 GB file on a MacBook (internal drive) takes about 3 seconds.
However, if the file is on external storage or a slower computer, you might be waiting 30 seconds for the hash calculation alone.
So I think it could be a good idea to calculate the hash from the first 100 MB + middle 100 MB + last 100 MB + file size (or something like that).
That should be enough to tell that it's the same file and that it was not cut off.
(We could even use only the first 100 MB + file size.)
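To make the idea concrete, here is a rough sketch of what such a sampled hash could look like in Python. The function name `short_hash`, the choice of SHA-256, and the fall-back to hashing the whole file when it is smaller than three chunks are all my assumptions, not something we have decided on.

```python
import hashlib
import os

CHUNK = 100 * 1024 * 1024  # 100 MB per sampled region (value from the proposal)
BLOCK = 1024 * 1024        # read in 1 MB blocks to keep memory use low


def short_hash(path, chunk=CHUNK):
    """Hash the file size plus the first, middle and last `chunk` bytes.

    Hypothetical helper: name, digest and layout are assumptions.
    """
    size = os.path.getsize(path)
    h = hashlib.sha256()
    # Mixing in the size is what catches a truncated file.
    h.update(str(size).encode("ascii"))

    with open(path, "rb") as f:
        if size <= 3 * chunk:
            # Small file: the regions would overlap, so just hash everything.
            for block in iter(lambda: f.read(BLOCK), b""):
                h.update(block)
        else:
            for offset in (0, (size - chunk) // 2, size - chunk):
                f.seek(offset)
                remaining = chunk
                while remaining > 0:
                    block = f.read(min(BLOCK, remaining))
                    if not block:
                        break
                    h.update(block)
                    remaining -= len(block)
    return h.hexdigest()
```

Note that a change confined to a region outside the three samples would go unnoticed, which fits with the hash being a convenience rather than a guarantee.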
Especially when the hash is calculated in real time upon opening the file, 30 seconds may be way too long!
For now we could introduce a shorthash or something, and if the shorthash is missing in either the detection or the video file, we fall back to hash.
For non-behave mp4 files, we'd only support shorthash, since nobody has run any inference on them yet.
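The fallback could be as simple as the sketch below. It assumes both the video and the detection side expose their hashes as a dict with optional `shorthash` and `hash` keys; that layout and the name `hashes_match` are made up for illustration and don't reflect any existing code.

```python
def hashes_match(video, detection):
    """Compare by shorthash when both sides have one, else fall back to hash.

    `video` and `detection` are hypothetical dicts with optional
    "shorthash" and "hash" entries.
    """
    v_short = video.get("shorthash")
    d_short = detection.get("shorthash")
    if v_short is not None and d_short is not None:
        return v_short == d_short

    # shorthash missing on at least one side: revert to the full hash
    v_full = video.get("hash")
    d_full = detection.get("hash")
    if v_full is None or d_full is None:
        return False  # nothing comparable, treat as no match
    return v_full == d_full
```

For example, `hashes_match({"hash": "x"}, {"shorthash": "a", "hash": "x"})` takes the fallback path and compares the full hashes.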