I was excited to check the project sources after the second YT video, especially because many people commented on the inefficiency of the cache strategy.
It looks like you're using plain strings both in the cache and in comparisons, which is not optimal. In this issue I suggest changing String to an enum. Thankfully, serde works well with enums:
https://serde.rs/enum-representations.html
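To make the suggestion concrete, here is a minimal std-only sketch of the difference. `EntryKind` and its variants are hypothetical names of mine, not the project's actual types. A `String` field carries a pointer, a length and a capacity inline, plus a heap allocation for the text, while a `repr(u8)` enum is a single byte and compares without touching the heap.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for the kind of entry stored in the cache.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EntryKind {
    File = 0,
    Directory = 1,
}

/// Inline size in bytes of a `String` field versus an `EntryKind` field.
/// The `String` figure excludes the heap buffer that holds the text itself.
pub fn inline_sizes() -> (usize, usize) {
    (size_of::<String>(), size_of::<EntryKind>())
}

/// Comparing two enum values compares one byte; no string scan is needed.
pub fn is_directory(kind: EntryKind) -> bool {
    kind == EntryKind::Directory
}
```

Calling `inline_sizes()` shows the in-memory gap per entry. The on-disk cache size depends on the serialization format, so the saving there can be much smaller.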
For reference, I compared the wall-clock times and cache sizes of the two implementations.
Disk space indexed: 273 GB.
OS: Fedora 38.
Disk: SK Hynix NVMe.
Previous implementation:
26.5 seconds from cold launch,
12 seconds warm,
68.8 MB of cache;
repr(u8) implementation:
21 seconds cold,
11 seconds warm,
68.2 MB of cache.
It seems that the indexer's performance is bottlenecked by the filesystem. What I don't understand, though, is why the warm start takes so long. The cache seems to be re-evaluated somehow.
P.S. It would also be great to use std::fs::FileType instead of a custom enum, but the custom enum seems to work just fine.
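As a rough illustration of how the two could coexist, the sketch below derives a custom enum from `std::fs::FileType` at indexing time. The names `EntryKind`, `from_file_type` and `kind_of` are hypothetical. As far as I know, `FileType` is an opaque type that cannot be constructed by hand and has no serde implementation out of the box, which is one reason a custom enum remains convenient for the cache.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical cache-friendly entry kind (not the project's actual type).
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EntryKind {
    File = 0,
    Directory = 1,
}

impl EntryKind {
    /// Maps a `std::fs::FileType` onto the custom enum.
    /// Returns `None` for symlinks and other special entries.
    pub fn from_file_type(file_type: fs::FileType) -> Option<Self> {
        if file_type.is_dir() {
            Some(EntryKind::Directory)
        } else if file_type.is_file() {
            Some(EntryKind::File)
        } else {
            None
        }
    }
}

/// Looks up the kind of the entry at `path` without following a final symlink.
pub fn kind_of(path: &Path) -> io::Result<Option<EntryKind>> {
    let metadata = fs::symlink_metadata(path)?;
    Ok(EntryKind::from_file_type(metadata.file_type()))
}
```

This keeps `FileType` at the filesystem boundary and stores only the one-byte enum in the cache.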
I found multiple improvement points in terms of memory optimization. One of them is here: https://github.com/conaticus/FileExplorer/blob/4b60d734941f1cd7d0a6e68291053a91ba019e7a/src-tauri/src/main.rs#L20
When I checked the usages, I quickly discovered that: https://github.com/conaticus/FileExplorer/blob/4b60d734941f1cd7d0a6e68291053a91ba019e7a/src-tauri/src/filesystem/mod.rs#L6-L7 https://github.com/conaticus/FileExplorer/blob/4b60d734941f1cd7d0a6e68291053a91ba019e7a/src-tauri/src/search.rs#L95
To serialize and deserialize the enum efficiently, you also need the serde-repr crate: https://github.com/dtolnay/serde-repr
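The derive macros in serde-repr serialize a C-like enum as its underlying integer instead of its variant name. The std-only sketch below hand-writes that conversion to show the idea. It does not use serde or serde-repr, and `EntryKind`, `encode` and `decode` are hypothetical names chosen for illustration.

```rust
use std::convert::TryFrom;

/// Hypothetical entry kind, stored as a single byte.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EntryKind {
    File = 0,
    Directory = 1,
}

impl From<EntryKind> for u8 {
    /// Serialization side: the enum becomes its discriminant.
    fn from(kind: EntryKind) -> u8 {
        kind as u8
    }
}

impl TryFrom<u8> for EntryKind {
    /// The rejected byte is returned as the error value.
    type Error = u8;

    /// Deserialization side: unknown discriminants are rejected.
    fn try_from(byte: u8) -> Result<Self, Self::Error> {
        match byte {
            0 => Ok(EntryKind::File),
            1 => Ok(EntryKind::Directory),
            other => Err(other),
        }
    }
}

/// Encodes a slice of kinds as one byte per entry.
pub fn encode(kinds: &[EntryKind]) -> Vec<u8> {
    kinds.iter().map(|&kind| u8::from(kind)).collect()
}

/// Decodes bytes back into kinds, failing on the first unknown byte.
pub fn decode(bytes: &[u8]) -> Result<Vec<EntryKind>, u8> {
    bytes.iter().map(|&byte| EntryKind::try_from(byte)).collect()
}
```

With serde-repr the same mapping comes from deriving its `Serialize_repr` and `Deserialize_repr` macros on the `repr(u8)` enum, so none of this has to be written by hand.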