Open wpyoga opened 2 years ago
I should do some proper profiling here, but what you're asking for only benefits a small fraction of potential queries. Yes, a hash table could benefit queries like 'pacman' or '/usr/bin/pacman', but you're back to a full scan for anything involving globbing or regex. It does, admittedly, also help with things like #27.
An embedded DB like SQLite might work here to provide a more flexible query language with indexing capabilities. That would be generally useful, but it would also mean a pretty substantial rewrite of the existing code. I guess that's sort of true of any storage format change...
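To make the SQLite idea a bit more concrete, here is a rough sketch using Python's standard `sqlite3` module. The schema (a `files` table with `repo`, `package`, `path`, and `basename` columns), the sample rows, and the helper names `find_exact` and `find_glob` are all made up for illustration; none of this reflects how pkgfile actually stores its data.

```python
import sqlite3

# Hypothetical schema: one row per packaged file. The basename is stored
# in its own column so that binary-name lookups can use an index.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (repo TEXT, package TEXT, path TEXT, basename TEXT)"
)
conn.execute("CREATE INDEX files_basename ON files (basename)")
conn.execute("CREATE INDEX files_path ON files (path)")

# Made-up sample data, only for the demonstration.
rows = [
    ("core", "pacman", "/usr/bin/pacman", "pacman"),
    ("core", "pacman", "/usr/bin/makepkg", "makepkg"),
    ("extra", "pkgfile", "/usr/bin/pkgfile", "pkgfile"),
]
conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", rows)


def find_exact(name):
    """Exact match on a full path or a basename; both columns are indexed."""
    column = "path" if name.startswith("/") else "basename"
    cur = conn.execute(
        f"SELECT repo, package, path FROM files WHERE {column} = ? ORDER BY path",
        (name,),
    )
    return cur.fetchall()


def find_glob(pattern):
    """Glob match on the full path; in general this may still scan the table."""
    cur = conn.execute(
        "SELECT repo, package, path FROM files WHERE path GLOB ? ORDER BY path",
        (pattern,),
    )
    return cur.fetchall()


print(find_exact("pacman"))
print(find_glob("/usr/bin/p*"))
```

Whether a glob query can make use of an index depends on the pattern and on SQLite's query planner, so this would need measuring before drawing any conclusions about speed.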
A single invocation of pkgfile -b takes around 230 ms on my laptop (Ryzen 5 4650U, 40GB DDR4, 512GB NVMe SSD). I tried searching for different binary names, including ones at the start and at the end of the data files, and the invocation time doesn't differ by much. Therefore I'm guessing that most of the time is spent on loading the cache files.
I would like to propose a faster storage format, maybe using a hash index, to bring down the search time.
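As a rough illustration of what I mean by a hash index, here is a small Python sketch. The entry layout (`repo/package`, path), the sample data, and the function names `build_index` and `lookup` are assumptions for the example, not pkgfile's real format or API. Exact names and full paths become a single dictionary lookup, while anything with glob characters still falls back to a full scan.

```python
import fnmatch
from collections import defaultdict

# Made-up sample entries: (repo/package, file path).
entries = [
    ("core/pacman", "/usr/bin/pacman"),
    ("core/pacman", "/usr/bin/makepkg"),
    ("extra/pkgfile", "/usr/bin/pkgfile"),
]


def build_index(entries):
    """Map both the full path and the basename of each file to its packages."""
    index = defaultdict(list)
    for package, path in entries:
        index[path].append(package)
        # Basenames never start with "/", so they cannot collide with paths.
        index[path.rsplit("/", 1)[-1]].append(package)
    return index


def lookup(index, entries, query):
    """Use the hash index for exact queries, and a full scan for globs."""
    if any(ch in query for ch in "*?["):
        # Glob query: no way around checking every path.
        return sorted(
            {pkg for pkg, path in entries if fnmatch.fnmatchcase(path, query)}
        )
    return sorted(set(index.get(query, [])))


index = build_index(entries)
print(lookup(index, entries, "pacman"))
print(lookup(index, entries, "/usr/bin/p*"))
```

In a real on-disk format the index would have to be loadable without parsing every cache file first, otherwise the load time would just move elsewhere.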
This is my installation:
These are the pkgfile cache files: