nix-community / nix-index

Quickly locate nix packages with specific files [maintainers=@bennofs @figsoda @raitobezarius]

Indexer performance improvements #152

enolan closed this 1 year ago

enolan commented 3 years ago

Headline: 198 seconds to 73 seconds with these changes. :tada:

I'll go through the changes in order.

Baseline:

________________________________________________________
Executed in  198.49 secs   fish           external 
   usr time  190.70 secs  208.00 micros  190.70 secs 
   sys time   11.35 secs   39.00 micros   11.35 secs 

The first change I made was to run the nix-env invocations in parallel. This saves about 16s:

________________________________________________________
Executed in  182.38 secs   fish           external 
   usr time  195.08 secs  219.00 micros  195.07 secs 
   sys time   11.55 secs   40.00 micros   11.55 secs 
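
For readers who want a picture of what "run the nix-env invocations in parallel" can mean, here is a minimal sketch using only the standard library. It is not the code from this PR: `run_parallel` and `query_attr_set` are hypothetical names, and the nix-env arguments are illustrative rather than the exact ones nix-index passes.

```rust
use std::io;
use std::process::{Command, Output};
use std::thread;

/// Runs every job on its own scoped thread and returns the results
/// in the same order as the input jobs.
fn run_parallel<T, F>(jobs: Vec<F>) -> Vec<T>
where
    F: FnOnce() -> T + Send,
    T: Send,
{
    thread::scope(|scope| {
        // Spawn everything first so the jobs overlap, then join in order.
        let handles: Vec<_> = jobs.into_iter().map(|job| scope.spawn(job)).collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}

/// Hypothetical helper: lists one attribute set with nix-env.
/// The argument list is an assumption made for illustration.
#[allow(dead_code)]
fn query_attr_set(attr: &str) -> io::Result<Output> {
    Command::new("nix-env")
        .args(["-f", "<nixpkgs>", "-qaP", "--out-path", "--xml", "-A", attr])
        .output()
}
```

A caller could build the jobs with something like `attrs.iter().map(|a| move || query_attr_set(a)).collect()` and pass the resulting vector to `run_parallel`. One thread per job is fine for a handful of subprocesses, since each thread mostly waits on its child process.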

Upgrading zstd gets us around 5s:

________________________________________________________
Executed in  177.98 secs   fish           external 
   usr time  190.41 secs    0.00 micros  190.41 secs 
   sys time   12.60 secs  777.00 micros   12.59 secs 

Turning on parallel compression doesn't do anything, though:

________________________________________________________
Executed in  177.71 secs   fish           external 
   usr time  186.58 secs    0.00 micros  186.58 secs 
   sys time   12.20 secs  800.00 micros   12.20 secs 

But if you watch it in htop you see that it doesn't actually use more than one core. AFAICT, the "ultra" compression levels are incompatible with multithreading. Switching to compression level 19 from 22 shaves more than 100s:

________________________________________________________
Executed in   73.91 secs   fish           external 
   usr time  185.89 secs   18.76 millis  185.87 secs 
   sys time   14.82 secs    2.05 millis   14.82 secs 

It increases the size of the database from 31 megs to 36, which I'd say is an acceptable tradeoff, and makes the indexing process no longer CPU bound on my machine. As a sanity check, we can try level 19 without multithreading:

________________________________________________________
Executed in  141.96 secs   fish           external 
   usr time  154.38 secs  660.00 micros  154.38 secs 
   sys time   12.00 secs  114.00 micros   12.00 secs 

Most of the speedup is from the parallelism.
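
The arithmetic behind that statement can be checked with a short sketch. The figures are copied from the `time` output above; the type and constant names are made up for illustration, and "average busy cores" is simply CPU time (usr + sys) divided by wall-clock time.

```rust
/// One `time` measurement from the thread, in seconds.
struct Run {
    wall: f64,
    usr: f64,
    sys: f64,
}

impl Run {
    /// Average number of busy cores: CPU time divided by wall-clock time.
    fn avg_cores(&self) -> f64 {
        (self.usr + self.sys) / self.wall
    }
}

// Level 22 after the zstd upgrade, before parallel compression was enabled.
const LEVEL_22_SINGLE: Run = Run { wall: 177.98, usr: 190.41, sys: 12.60 };
// Level 19 without multithreaded compression (the sanity check).
const LEVEL_19_SINGLE: Run = Run { wall: 141.96, usr: 154.38, sys: 12.00 };
// Level 19 with multithreaded compression (the headline result).
const LEVEL_19_MULTI: Run = Run { wall: 73.91, usr: 185.89, sys: 14.82 };

/// Returns (seconds saved by the lower level alone,
///          seconds saved by adding multithreading on top).
fn savings() -> (f64, f64) {
    (
        LEVEL_22_SINGLE.wall - LEVEL_19_SINGLE.wall,
        LEVEL_19_SINGLE.wall - LEVEL_19_MULTI.wall,
    )
}
```

With these numbers, lowering the level alone accounts for roughly 36 s and multithreading for roughly 68 s more. `LEVEL_19_MULTI.avg_cores()` comes out at about 2.7, against about 1.1 for the level 22 run, which matches the htop observation that the ultra level was not using more than one core.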

This was all done against v0.1.2, since I can't build master due to #151. Feedback very welcome, I'm still somewhat new to Rust :)

domenkozar commented 3 years ago

@bennofs

ncfavier commented 1 year ago

This PR does seem to improve the time nix-index takes:

# before
real    9m13.026s
user    7m36.327s
sys     0m28.164s

# after
real    7m53.152s
user    9m7.771s
sys     0m25.164s

@enolan how are you getting 73 seconds? o_O I'm on a reasonably recent CPU (AMD Ryzen 5 4650U).