Damselfly is a server-based Photograph Management app. The goal of Damselfly is to index an extremely large collection of images, and allow easy search and retrieval of those images, using metadata such as the IPTC keyword tags, as well as the folder and file names. Damselfly includes support for object/face detection.
GNU General Public License v3.0
Improve performance of hashing and reduce memory #489
I had a look at the use of hashing here for detecting duplicate images. Currently, this allocates an array of bytes per pixel row of each image, which can add up to a significant amount of memory usage.
Instead, we can rely on Span to remove the allocations altogether. Benchmarking a 3024 x 4032 image (the size my iPhone currently takes, so it seems representative):
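Before

| Method             | Mean     | Error    | StdDev   | Gen0      | Allocated |
|--------------------|----------|----------|----------|-----------|-----------|
| Skia_GetHash       | 23.08 ms | 0.169 ms | 0.158 ms | 7750.0000 | 46.61 MB  |
| ImageSharp_GetHash | 22.93 ms | 0.160 ms | 0.150 ms | 7750.0000 | 46.61 MB  |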
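After

| Method             | Mean     | Error    | StdDev   | Allocated |
|--------------------|----------|----------|----------|-----------|
| Skia_GetHash       | 20.19 ms | 0.162 ms | 0.152 ms | 1.21 KB   |
| ImageSharp_GetHash | 20.01 ms | 0.047 ms | 0.042 ms | 1.29 KB   |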
This effectively makes the memory usage 1.2 kilobytes regardless of the image size, down from 46 megabytes.
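To illustrate the idea, here is a minimal sketch of feeding pixel rows to the hasher as spans. It is not the code from this PR: `RowHashSketch`, `HashRows`, the flat `pixels` buffer, and the choice of SHA-256 are all assumptions made for the example, standing in for the row accessors that Skia and ImageSharp provide.

```csharp
using System;
using System.Security.Cryptography;

public static class RowHashSketch
{
    // Hypothetical helper: hashes a tightly packed pixel buffer one row at a time.
    // The hash algorithm (SHA-256) is an assumption made for this sketch.
    public static string HashRows(ReadOnlySpan<byte> pixels, int rowBytes)
    {
        if (rowBytes <= 0 || pixels.Length % rowBytes != 0)
            throw new ArgumentException("Buffer length must be a multiple of the row size.", nameof(rowBytes));

        using var hasher = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);

        for (int offset = 0; offset < pixels.Length; offset += rowBytes)
        {
            // Slice returns a view over the existing buffer, so no per-row array is allocated.
            hasher.AppendData(pixels.Slice(offset, rowBytes));
        }

        return Convert.ToHexString(hasher.GetHashAndReset());
    }
}
```

Because each row is only a slice of memory that already exists, the allocation cost no longer grows with the image dimensions.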
A smaller optimization is to place the hash in a stack buffer before converting it to hex. That just saves one small array allocation for the hash itself. This uses Convert.ToHexString since it can natively operate on a ReadOnlySpan<byte> and also uses upper-case lettering, but if you prefer I can create an overload of your extension method that works off of span as well.
Additionally, this fixes a tiny issue where IncrementalHash is not being disposed. This does not leak memory, but results in finalizers getting run during garbage collection.
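The last two points can be sketched together as follows. Again this is an illustration rather than the PR's code: `StackHashSketch`, `HashToHex`, and the use of SHA-256 are assumptions, and it needs .NET 5 or later for `GetHashAndReset(Span<byte>)` and `Convert.ToHexString`.

```csharp
using System;
using System.Security.Cryptography;

public static class StackHashSketch
{
    // Hypothetical helper: hashes the input and returns upper-case hex.
    public static string HashToHex(ReadOnlySpan<byte> data)
    {
        // "using" disposes the IncrementalHash deterministically,
        // so it never has to wait for a finalizer.
        using var hasher = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);
        hasher.AppendData(data);

        // The hash is written to a small stack buffer instead of a new byte[].
        Span<byte> hash = stackalloc byte[hasher.HashLengthInBytes];
        int written = hasher.GetHashAndReset(hash);

        // Convert.ToHexString accepts a ReadOnlySpan<byte> and emits upper-case digits.
        return Convert.ToHexString(hash[..written]);
    }
}
```

With this shape, the only remaining allocations are the hasher itself and the resulting string.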